I ran the code, and just want to be clear that I'm understanding the output format. I figure the 5.8 in the last line indicates that I've wound up w/ a trained model that gets 5.8% error on CIFAR-10 -- is that right? Which number does the 5.8 correspond to in Table 1 in the paper -- SmashV1=5.53 or SmashV2=4.03 or something else? I'm in the process of working through the code, but want to double check that I understand the inputs/outputs.
Thanks
Ben
The 5.8 there looks like error on the validation split (so training on 45,000 images and testing on 5,000 held out from the train set). If you want to use the CIFAR-10 test set, use the validate-test command line arg. You'll also see messages in stdout that indicate which split is being used, how many params the model has, and a couple of other debuggy details.
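To make the 45,000/5,000 split concrete, here is a minimal sketch of carving a validation set out of the 50,000 CIFAR-10 training images and turning a correct-prediction count into an error percentage. This is not the repo's actual code: `split_train_val`, `error_percent`, the seed, and the example count of 4,710 correct predictions are all hypothetical, chosen only to show how a figure like 5.8 can arise.

```python
import random


def split_train_val(num_train=50_000, num_val=5_000, seed=0):
    """Shuffle the training indices and hold out num_val of them.

    Hypothetical helper: the repo may choose its validation images differently.
    """
    indices = list(range(num_train))
    random.Random(seed).shuffle(indices)
    # First num_val shuffled indices become validation, the rest stay in train.
    return indices[num_val:], indices[:num_val]


def error_percent(num_correct, num_total):
    """Classification error as a percentage of num_total examples."""
    return 100.0 * (1.0 - num_correct / num_total)


train_idx, val_idx = split_train_val()
print(len(train_idx), len(val_idx))

# Illustrative count only: 4,710 correct out of 5,000 held-out images.
print(round(error_percent(4710, 5000), 2))
```

An error computed this way is measured on held-out training images, not on the 10,000-image CIFAR-10 test set.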
All the nets in here are SMASHv2, and they have options for lots of variability in how the archs are defined (variable op structure, variable filter sizes, where BN goes, etc.). At present, though, the defaults do not correspond to the numbers in the paper (which a quick param count comparison should make clear). I'll be uploading the pre-trained models from the paper soon.
As an aside, I'm currently working on writing up the documentation (I've been traveling the last two weeks), which should hopefully help make this whole shebang more grokkable. Feel free to ask more questions--this code is awfully complicated.