On normalization-free model's performance #11
ankur56 started this conversation in Show and tell · Replies: 1 comment
---
No problem, discussions are always welcome. It should work, since the only dependency is PyTorch and nothing else. Lightning integrates cleanly, so I don't see why it wouldn't work. Do let me know if you run into any usage issues.
---
I have my own custom 3D DenseNet model, which I use within the PyTorch Lightning framework. I was wondering if I can make a normalization-free version of my model using your code. As far as I can tell, I just need to change the `base.py` file to make it work for 3D convolutions.
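For what it's worth, here is a minimal sketch of how a 2D weight-standardized convolution might be extended to 3D. The class name `WSConv3d`, the `gain` parameter, and the exact scaling follow the Scaled Weight Standardization idea from the NF-Nets paper, but they are my assumptions and may not match the repository's `base.py` exactly:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WSConv3d(nn.Conv3d):
    """Hypothetical 3D variant of a weight-standardized conv layer.

    The weight tensor is standardized (zero mean, scaled variance) per
    output channel before every forward pass, in the spirit of Scaled
    Weight Standardization.
    """

    def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, **kwargs)
        # Learnable per-output-channel gain, as used in Scaled WS.
        self.gain = nn.Parameter(torch.ones(out_channels, 1, 1, 1, 1))

    def standardized_weight(self):
        # fan-in = in_channels * prod(kernel_size)
        fan_in = self.weight[0].numel()
        mean = self.weight.mean(dim=(1, 2, 3, 4), keepdim=True)
        var = self.weight.var(dim=(1, 2, 3, 4), keepdim=True)
        # Scale so activations keep roughly unit variance at init.
        weight = (self.weight - mean) * torch.rsqrt(var * fan_in + 1e-4)
        return weight * self.gain

    def forward(self, x):
        return F.conv3d(x, self.standardized_weight(), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)
```

A `WSConv3d(4, 8, 3, padding=1)` applied to a `(2, 4, 8, 8, 8)` input yields a `(2, 8, 8, 8, 8)` output, so it should drop in wherever `nn.Conv3d` is used.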
Edit: I converted my model to a normalization-free version and tested it. I made a couple of observations on the NF-model’s performance and behavior, and I would be grateful if anyone could shed light on them.
The NF-model takes almost twice as long per epoch as the regular model at the same batch size. The paper states that batch normalization is a computationally demanding operation; however, the mean and standard deviation of the weights are recomputed in the `WSConv2d` layer on every forward pass as well. Is this why a `WSConv2d` layer is more expensive than a regular `Conv2d` layer? I am not sure why the normalization-free model ends up slower than my regular model.
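One way to check where the extra cost comes from is to time the standardization step against the convolution itself. The sketch below uses a stand-alone `standardize` helper of my own (not the repository's code) and CPU timings; the standardization touches only the weight tensor, which is usually far smaller than the activations, so a 2x slowdown would be surprising from this step alone:

```python
import time
import torch
import torch.nn.functional as F

def standardize(w, eps=1e-4):
    # Zero-mean, unit-variance weights per output channel (illustrative only).
    mean = w.mean(dim=(1, 2, 3), keepdim=True)
    var = w.var(dim=(1, 2, 3), keepdim=True)
    return (w - mean) * torch.rsqrt(var + eps)

x = torch.randn(8, 64, 56, 56)   # activations
w = torch.randn(128, 64, 3, 3)   # conv weights (much smaller than x)

# Time the plain convolution vs. standardization + convolution.
t0 = time.perf_counter()
for _ in range(10):
    F.conv2d(x, w, padding=1)
t_conv = time.perf_counter() - t0

t0 = time.perf_counter()
for _ in range(10):
    F.conv2d(x, standardize(w), padding=1)
t_ws = time.perf_counter() - t0

print(f"plain conv: {t_conv:.4f}s, WS conv: {t_ws:.4f}s")
```

If the two timings are close on your hardware, the epoch-time gap more likely comes from elsewhere (e.g. the extra activation-scaling ops, or a different effective architecture) than from the weight statistics themselves.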
The convergence of the NF-model is excruciatingly slow and depends heavily on the choice of the clipping factor (lambda). In my case, a value of 0.01 made convergence extremely slow, so I raised it to 0.8 (or 1.0), which made convergence relatively faster but still significantly slower than the regular model. I also tried varying the learning rate and batch size but could not speed up convergence. I am using the SGD optimizer with Nesterov momentum. Is there any other way to make the convergence faster?