
So, what's the best model in your experiment? #10

Open
pmixer opened this issue Mar 15, 2018 · 11 comments
pmixer commented Mar 15, 2018

Moreover, we observe that the tracking performance of the saved models varies considerably across epochs; therefore, you may want to evaluate a few more models instead of just picking the one from the last epoch.

I tested the code, trained and evaluated the tracker, and the performance on OTB-100 using only the last checkpoint reached 0.527 overlap AUC. Could you please tell me which checkpoint you used for the reported 0.58+ performance? Thanks in advance @bilylee

bilylee (Owner) commented Mar 16, 2018

0.527 is way below expectation; it should be 0.55~0.58 in my experience. I evaluated all epochs and reported the best result, which came from within the last 30 epochs.

How about the performance of the pretrained model? Is it the same as reported? Did you follow exactly the same steps in the training section?

pmixer (Author) commented Mar 18, 2018

@bilylee Thanks for the response. I did follow the given steps to train the model; the problem may be caused by the VID preprocessing procedure on our server. Could you please share the size of the training-set folder in your experiment, to help me figure this out? I only got one training-set folder of ~20 GB, but the Matlab version seems to produce a much larger training set after preprocessing (30+ GB). Thanks in advance, and sorry for the delayed reply.
(P.S., are you interning at MSRA these days?)

pmixer (Author) commented Mar 19, 2018

P.S., the 0.527 result is from the model generated by python experiments/SiamFC-3s-color-scratch.py

bilylee (Owner) commented Mar 20, 2018

Hi, you can check the size of the training data by the following script:

import os.path as osp
import pickle

# Open in binary mode ('rb'), which pickle requires under Python 3.
with open('data/train_imdb.pickle', 'rb') as f:
    imdb = pickle.load(f)

# Each entry in imdb['videos'] is a list of frame paths.
n_frames = 0
for v in imdb['videos']:
    n_frames += len(v)

# Sum the on-disk size of every frame in bytes, then convert to GB.
total_file_size = 0
for v in imdb['videos']:
    for p in v:
        total_file_size += osp.getsize(p)
total_file_size /= float(1024 * 1024 * 1024)

print('Num of videos: {}'.format(imdb['n_videos']))  # 8309
print('Num of frames: {}'.format(n_frames))  # 1877463
print('Total file size: {:.0f} GB'.format(total_file_size))  # 19 GB

It is normal for the curated dataset (~20 GB) to be smaller than the one obtained via the Matlab version (~53 GB). The reason is that I have removed all *.z.jpg files, since these exemplar images can be extracted on the fly from the corresponding *.x.jpg.

Did you check the tracking performance of the pretrained model? Did you accidentally evaluate the model on tb50?

I used to intern at MSRA, but I have already checked out : )
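To illustrate why the *.z.jpg files are redundant, here is a minimal sketch (the helper is hypothetical, not the repo's code) of where a centered 127x127 exemplar patch sits inside a 255x255 search image, assuming the standard SiamFC crop sizes:

```python
def exemplar_crop_box(search_size=255, exemplar_size=127):
    """Hypothetical helper: return the (top, left, bottom, right) box
    of a centered exemplar-size crop inside a search image. This is
    why precomputed *.z.jpg exemplars can be dropped: each one can be
    re-cut on the fly from the center of its *.x.jpg."""
    off = (search_size - exemplar_size) // 2
    return (off, off, off + exemplar_size, off + exemplar_size)

print(exemplar_crop_box())  # (64, 64, 191, 191)
```

Storing only the search images roughly halves the on-disk footprint at the cost of one cheap crop per training pair.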

pmixer (Author) commented Mar 26, 2018

Thanks! The data is OK. I am still checking whether the problem is in the from-scratch training process or an insufficient number of training epochs; I used the default parameters. P.S., just as I suspected (otherwise the SA-SiamFC paper could not have used this implementation before it was released :cry:)

Angel-Jia commented

@PeterHuang2015 May I ask how to use the trained model? I trained with

python experiments/SiamFC-3s-color-scratch.py

and training finished, but running

python scripts/run_tracking.py

raises an error saying it cannot find the file:

Logs/SiamFC/track_model_checkpoints/SiamFC-3s-color-pretrained

pmixer (Author) commented Apr 23, 2018

@Mabinogiysk ? If it says the file cannot be found, the path is wrong... using an absolute path is safer :red_car: Sorry for the late reply.

pmixer (Author) commented May 4, 2018

Thanks to @yangkang, I got 0.579 on OTB-100, which is close to the performance reported by @bilylee. I ran 65 epochs, which means 432,250 iterations, since there are 53,200 training pairs per epoch and each iteration processes a batch of 8 pairs.

For those following this issue: BTW, the performance was about 0.56 after 70,000 iterations but only 0.53 after 50 epochs... The devil may live in hyperparameters such as the window influence of 0.176 and the scale factors; I strongly advise focusing on those parts rather than simply introducing a new convnet.
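The epoch/iteration arithmetic above can be sketched as follows (the numbers come from this comment; the helper name is made up):

```python
def iterations_for(epochs, pairs_per_epoch=53200, batch_size=8):
    """Total SGD iterations: each iteration consumes one batch of
    training pairs, so iterations = epochs * pairs / batch_size.
    Defaults reflect the figures quoted in this thread."""
    return epochs * pairs_per_epoch // batch_size

print(iterations_for(65))  # 432250
```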

pmixer (Author) commented May 8, 2018

Hi @bilylee, do you know this paper: https://arxiv.org/abs/1802.08817? It claims that training with lr=0.1 for 25 epochs and then lr=0.01 for 5 epochs yields a 0.584 AUC score based on your code. I ran 25 epochs (with lr=0.1, lr_decay=1), then modified the experiment script to set the epoch number to 30 and lr=0.01, but did not get the reported result on OTB-100. Could you help check whether I am setting the learning rate correctly? (BTW, I tried modifying the network: removed the grouping operation and enlarged the conv5 layer to 512 channels, trained 30 epochs, and got about 0.55 AUC on OTB-100 :cry:. It is really difficult to decide which checkpoint to use; it feels like being in a casino.)
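A minimal sketch of the two-stage schedule as described in that paper, assuming lr=0.1 for the first 25 epochs and lr=0.01 afterwards (the function is illustrative, not the repo's actual config API):

```python
def paper_learning_rate(epoch):
    """Hypothetical two-stage schedule from arXiv:1802.08817 as
    summarized above: lr = 0.1 for epochs 1-25, then lr = 0.01
    for epochs 26-30, with no decay inside a stage."""
    return 0.1 if epoch <= 25 else 0.01

print(paper_learning_rate(25), paper_learning_rate(26))  # 0.1 0.01
```

If the repo instead applies a per-epoch multiplicative decay, setting lr_decay=1 and switching lr manually at epoch 26, as described in the comment, should be equivalent to this piecewise-constant schedule.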

bilylee (Owner) commented May 23, 2018

Hi,

This paper uses vanilla SGD optimizer without momentum.

I typically just evaluate all the epochs and choose the best one. Even though the performance varies across epochs, the best performance seems to be stable.

SiamFC is already overfitting during the second half of the training epochs. I guess it is normal to get worse performance with a larger neural network.
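The "evaluate every epoch, keep the best" strategy can be sketched as follows (the helper and the checkpoint names are illustrative, not the repo's code; the scores are made up):

```python
def best_checkpoint(auc_by_ckpt):
    """Illustrative helper: given a dict mapping checkpoint name to
    its OTB overlap AUC, return the name of the best checkpoint."""
    return max(auc_by_ckpt, key=auc_by_ckpt.get)

scores = {'model-48': 0.53, 'model-57': 0.58, 'model-65': 0.56}
print(best_checkpoint(scores))  # model-57
```

Since checkpoint-to-checkpoint variance is large but the maximum is stable, scanning all saved epochs this way is cheap insurance against the last-epoch lottery described earlier in the thread.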

pmixer (Author) commented May 25, 2018

@bilylee Yes, the paper reports using a vanilla SGD optimizer, but the interesting point is that when I requested the code, the author sent me the link to this project, which made me think you were involved in that work :cry:, and I kept asking you why the performance was not consistent with the reported results. Sorry for that :trollface:
