Output format #3

Open
bkj opened this issue Aug 29, 2017 · 1 comment

bkj commented Aug 29, 2017

I ran the code, and just want to be clear that I'm understanding the output format.

$ python train.py --which-dataset C10
$ python evaluate.py --SMASH=SMASH_D12_K4_N8_Nmax64_maxbneck2_SMASH_C10_seed0_100epochs --which-dataset C10
$ python train.py --SMASH SMASH_D12_K4_N8_Nmax64_maxbneck2_SMASH_C10_seed0_100epochs --which-dataset C10
$ tail -n 4 logs/SMASH_Main_SMASH_D12_K4_N8_Nmax64_maxbneck2_SMASH_C10_seed0_100epochs_Rank0_C10_seed0_100epochs_log.jsonl
{"epoch": 98, "train_loss": 0.001096960324814265, "_stamp": 1504033169.525098, "train_err": 0.015555555555555555}
{"epoch": 98, "val_loss": 0.2705254354281351, "_stamp": 1504033174.813815, "val_err": 5.84}
{"epoch": 99, "train_loss": 0.0011473518449727433, "_stamp": 1504033324.084391, "train_err": 0.011111111111111112}
{"epoch": 99, "val_loss": 0.2725760878948495, "_stamp": 1504033329.318958, "val_err": 5.8}

I figure the 5.8 in the last line indicates that I've wound up w/ a trained model that gets 5.8% error on CIFAR-10 -- is that right? Which number does the 5.8 correspond to in Table 1 in the paper -- SmashV1=5.53 or SmashV2=4.03 or something else? I'm in the process of working through the code, but want to double check that I understand the inputs/outputs.

Thanks
Ben

ajbrock (Owner) commented Aug 29, 2017

The 5.8 there looks like error on the validation split (so training on 45,000 images, testing on 5,000 held out from the train set). If you want to use the CIFAR-10 test set, use the validate-test command line arg. You'll also see messages in stdout that indicate which split is being used, how many params the model has, and a couple of other debuggy details.
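
For anyone who wants to pull that number out of the log programmatically, here is a minimal sketch. It assumes only what the log lines quoted above show: one JSON object per line, with `val_err` present on validation records and absent on training records. The helper name `last_val_record` is hypothetical and not part of the repo; the sample lines are copied from the `tail` output above.

```python
import json


def last_val_record(lines):
    """Return the last record that carries a val_err field, or None.

    Hypothetical helper: assumes one JSON object per non-empty line.
    """
    last = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if "val_err" in record:
            last = record
    return last


# The four lines from the `tail -n 4` output quoted in the question.
sample_log = [
    '{"epoch": 98, "train_loss": 0.001096960324814265, "_stamp": 1504033169.525098, "train_err": 0.015555555555555555}',
    '{"epoch": 98, "val_loss": 0.2705254354281351, "_stamp": 1504033174.813815, "val_err": 5.84}',
    '{"epoch": 99, "train_loss": 0.0011473518449727433, "_stamp": 1504033324.084391, "train_err": 0.011111111111111112}',
    '{"epoch": 99, "val_loss": 0.2725760878948495, "_stamp": 1504033329.318958, "val_err": 5.8}',
]

final = last_val_record(sample_log)
print(f"epoch {final['epoch']}: val_err = {final['val_err']}")
```

To run it on a real log, pass an open file handle instead of `sample_log`, since iterating a file yields its lines. Which split the number refers to still depends on how the run was launched, as described above.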

All the nets in here are SMASHv2, and they have options for lots of variability in how the archs are defined (variable op structure, variable filter sizes, where BN goes, etc.) although at present the defaults do not correspond to the numbers in the paper (which a quick param count comparison should make clear). I'll be uploading the pre-trained models from the paper soon.

As an aside, I'm currently working on writing up the documentation (been traveling the last two weeks), which should hopefully help with making this whole shebang more grokkable. Feel free to ask more questions--this code is awfully complicated.
