
Reproduce MSDNet paper's results #10

Open

stevelaskaridis opened this issue Feb 5, 2019 · 1 comment

Comments

@stevelaskaridis
Hi @gaohuang,

I have recently been trying to reproduce the results from the MSDNet paper, but I could not match the pretrained model's accuracy. I would very much appreciate it if you could check whether the following setup (for k=4) is correct:

th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -nEpochs 90

vs.

th main.lua -dataset imagenet -data <imagenet_dir> -gen gen -nGPU 4 -nBlocks 5 -base 7 -step 4 -batchSize 256 -retrain msdnet--step\=4--block\=5--growthRate\=16.t7 -testOnly true

The discrepancy in top-1 accuracy ranges from −1% (first classifier) to −6% (last classifier).

According to your paper:

On ImageNet, we use MSDNets with four scales, and the i-th classifier operates on the (k×i+3)-th layer (with i = 1, . . . , 5), where k = 4, 6 and 7. For simplicity, the losses of all the classifiers are weighted equally during training.

[...]

We apply the same optimization scheme to the ImageNet dataset, except that we increase the mini-batch size to 256, and all the models are trained for 90 epochs with learning rate drops after 30 and 60 epochs.
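
For reference, here is a quick sanity check of what that quoted setup implies. The classifier layer indices follow directly from the quoted formula; the base learning rate of 0.1 and the 10x decay factor are my assumptions (typical fb.resnet.torch-style ImageNet defaults), not stated in the excerpt:

-- Hedged sketch, not code from the MSDNet repo.
-- Classifier placement implied by the quoted formula: layer = k*i + 3.
local k = 4
for i = 1, 5 do
  print(string.format("classifier %d -> layer %d", i, k * i + 3))
end
-- prints layers 7, 11, 15, 19, 23 for k = 4

-- Stepwise schedule described in the quote (drops after epochs 30 and 60).
-- ASSUMPTION: base LR 0.1 and 10x decay per drop, as in fb.resnet.torch.
local function learningRate(epoch)
  local decay = (epoch > 60 and 2) or (epoch > 30 and 1) or 0
  return 0.1 * math.pow(0.1, decay)
end
print(learningRate(1), learningRate(31), learningRate(61))  -- 0.1  0.01  0.001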

Am I missing something in the training parameters for ImageNet?

@stevelaskaridis
Author

Hi @gaohuang,

I was wondering whether you have any ideas about what could have gone wrong in my reproduction of the paper's results.
Is there a way to double-check the hyper-parameters of the distributed pretrained model? Could this be a consequence of training on multiple GPUs?
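
In case it helps, this is how I have been trying to inspect the released checkpoint (a rough sketch; whether the .t7 file stores its training options alongside the weights is an assumption on my part):

require 'torch'
require 'nn'  -- needed to deserialize nn modules
-- require 'cunn'; require 'cudnn'  -- likely needed if the model was saved on GPU

-- Load the distributed checkpoint and print whatever it contains.
local obj = torch.load('msdnet--step=4--block=5--growthRate=16.t7')
print(torch.type(obj))
print(obj)  -- for an nn module this prints the architecture; the training
            -- hyper-parameters are only recoverable if they were saved in the file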

I would greatly appreciate your reply.
