Could you please let me know if changing only the following line is enough to run the code on the GPU: Environment.getInstance().setExecutionMode(EXECUTION_MODE.SEQ); to Environment.getInstance().setExecutionMode(EXECUTION_MODE.GPU); #36

Open
sudiptap opened this issue Mar 18, 2015 · 4 comments

Comments

@sudiptap

I have the required environment set up. In fact, just changing that line does not cut down the running time of my application. Thank you.

@thesilencelies

You may not get any runtime improvement even if the change does move the computation to the GPU - it really depends on how big the networks you're working with are...
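The point about network size can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the class `OffloadModel`, the fixed per-call overhead, and every number in it are hypothetical and are not measurements of this library or of any particular hardware.

```java
// Toy model (all figures are made up for illustration): the GPU processes
// work faster but pays a fixed overhead on every call, so it only wins once
// the amount of work per call is large enough.
final class OffloadModel {
    /** Time on the CPU: proportional to the amount of work. */
    static double cpuSeconds(double work, double cpuRate) {
        return work / cpuRate;
    }

    /** Time on the GPU: fixed per-call overhead plus work at a higher rate. */
    static double gpuSeconds(double work, double gpuRate, double overheadSeconds) {
        return overheadSeconds + work / gpuRate;
    }

    /** Amount of work at which both take equally long (requires gpuRate > cpuRate). */
    static double breakEvenWork(double cpuRate, double gpuRate, double overheadSeconds) {
        return overheadSeconds * cpuRate * gpuRate / (gpuRate - cpuRate);
    }

    public static void main(String[] args) {
        double cpuRate = 1e9;    // hypothetical operations per second on the CPU
        double gpuRate = 1e10;   // hypothetical operations per second on the GPU
        double overhead = 1e-3;  // hypothetical per-call transfer/launch cost in seconds

        double w = breakEvenWork(cpuRate, gpuRate, overhead);
        System.out.printf("break-even work per call: %.3e operations%n", w);
        for (double work : new double[] {w / 10, w, w * 10}) {
            System.out.printf("work %.3e: cpu %.3e s, gpu %.3e s%n",
                    work, cpuSeconds(work, cpuRate), gpuSeconds(work, gpuRate, overhead));
        }
    }
}
```

Under this model, a small network stays below the break-even point and runs slower on the GPU, while a large one amortizes the overhead. Where the real break-even lies for a given network has to be measured.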

@joelself

I've tried GPU, CPU, and SEQ for MNIST testLeNetSmall, testSigmoidHiddenBP and XorTest testCNNMLPBP, and none of them gives a speed-up on the GPU. In fact, sometimes CPU is faster, but never GPU. Can anyone give an example of a network that would benefit from being run on the GPU?
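A comparison like this is easier to reproduce with a small timing harness. The sketch below is an assumption-laden illustration, not part of the library: `ModeTiming` and its `Mode` enum are hypothetical stand-ins for the SEQ, CPU and GPU modes named in this thread, and the workload is a placeholder where the real test would call `Environment.getInstance().setExecutionMode(...)` and train the network.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Consumer;

final class ModeTiming {
    /** Hypothetical stand-in for the execution modes discussed in this thread. */
    enum Mode { SEQ, CPU, GPU }

    /**
     * Runs the workload once per mode as a warm-up, then `runs` more times,
     * and returns the best wall-clock time in milliseconds for each mode.
     */
    static Map<Mode, Double> time(Consumer<Mode> workload, int runs) {
        Map<Mode, Double> best = new EnumMap<>(Mode.class);
        for (Mode mode : Mode.values()) {
            workload.accept(mode); // warm-up run, excluded from the timing
            double bestMs = Double.MAX_VALUE;
            for (int i = 0; i < runs; i++) {
                long start = System.nanoTime();
                workload.accept(mode);
                bestMs = Math.min(bestMs, (System.nanoTime() - start) / 1e6);
            }
            best.put(mode, bestMs);
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder workload: it ignores the mode. A real test would set the
        // execution mode here and then run the training code being measured.
        Consumer<Mode> workload = mode -> {
            double acc = 0;
            for (int i = 0; i < 1_000_000; i++) {
                acc += Math.sqrt(i);
            }
            if (acc < 0) {
                throw new IllegalStateException(); // keeps the loop result in use
            }
        };
        time(workload, 3).forEach((mode, ms) -> System.out.printf("%s: %.2f ms%n", mode, ms));
    }
}
```

Taking the best of several runs after a warm-up keeps one-time start-up costs out of the comparison; whether that matches how the times above were taken is not stated in the thread.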

@joelself

I got the GPU to run faster, but only by making huge networks that would take forever to complete. For example, I modified the testLeNetSmall function to have this network:

NeuralNetworkImpl nn = NNFactory.convNN(new int[][] { { 28, 28, 1 }, { 5, 5, 120, 1 }, { 2, 2 }, { 5, 5, 120, 1 }, { 2, 2 }, { 3, 3, 120, 1 }, { 2, 2 }, { 2048 }, { 2048 }, { 10 } }, true);

Basically I added a 3rd convolutional layer, bumped up the number of filters in all convolutional layers to 120 (from 20 and 50), quadrupled the neurons in the final hidden layer and added another hidden layer with 2048 neurons. The GPU-enabled version runs about 2.4 times faster, but it's still dog slow, taking something like 12 - 14 seconds per batch (the batch size is 1), so training on the entire dataset of 60000 images would take 8.3 to 9.7 days. So like 10 days per epoch on the GPU. Meanwhile I built a comparable network in Lasagne/Theano and it takes around 420 seconds per epoch on the CPU (in a VM at that), which is roughly 1700 to 2000 times faster.
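The arithmetic behind those estimates can be checked with a few lines of Java. This is just a back-of-the-envelope sketch using the figures quoted in the comment (12 to 14 seconds per batch, batch size 1, 60000 images, 420 seconds per epoch for Lasagne/Theano). The class name `EpochEstimate` and its methods are made up for this illustration.

```java
// Back-of-the-envelope check of the timings quoted in the comment above.
final class EpochEstimate {
    static final double SECONDS_PER_DAY = 86_400.0;

    /** Days needed for one pass over the dataset at a fixed time per batch. */
    static double daysPerEpoch(double secondsPerBatch, int batches) {
        return secondsPerBatch * batches / SECONDS_PER_DAY;
    }

    /** How many times longer the slow epoch takes than the fast one. */
    static double slowdown(double slowEpochSeconds, double fastEpochSeconds) {
        return slowEpochSeconds / fastEpochSeconds;
    }

    public static void main(String[] args) {
        int images = 60_000;         // batch size is 1, so one batch per image
        double theanoEpoch = 420.0;  // seconds per epoch reported for Lasagne/Theano
        for (double perBatch : new double[] {12.0, 14.0}) {
            double epochSeconds = perBatch * images;
            System.out.printf("%.0f s/batch: %.1f days per epoch, %.0f times slower than 420 s%n",
                    perBatch, daysPerEpoch(perBatch, images), slowdown(epochSeconds, theanoEpoch));
        }
    }
}
```

At 12 seconds per batch this works out to about 8.3 days per epoch and a ratio of roughly 1700. At 14 seconds it is about 9.7 days and a ratio of 2000.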
