Help! training not starting! #urgent #124
Comments
I got this error now:

InternalError Traceback (most recent call last)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)
InternalError: Blas GEMM launch failed : a.shape=(30, 64), b.shape=(64, 10), m=30, n=10, k=64

During handling of the above exception, another exception occurred:

InternalError Traceback (most recent call last)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\keras\engine\training.py in train_on_batch(self, x, y, sample_weight, class_weight)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\keras\backend\tensorflow_backend.py in call(self, inputs)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
C:\ProgramData\anaconda3\envs\airsim\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args)
InternalError: Blas GEMM launch failed : a.shape=(30, 64), b.shape=(64, 10), m=30, n=10, k=64

Caused by op 'dense2/MatMul', defined at:
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(30, 64), b.shape=(64, 10), m=30, n=10, k=64
@mitchellspryn, please help.
depencies.txt
I am not at MSFT currently, so I am not actively supporting this repo any more. That said, I took a look at your stack trace. It looks like CUDA isn't installed properly. Relevant portion:
I'd check whether you can run any Keras training operation at all, e.g. try training a linear model on some random data points and see if the forward pass and backpropagation work properly. My guess is no, and that will help you debug what the situation is with your CUDA install.
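The check suggested above could be sketched roughly as follows. This is only a minimal illustration, assuming the standalone `keras` package from the original post (Keras 2.1.2 on a TensorFlow backend) and NumPy are importable; the function names `make_linear_data` and `check_keras_training` are hypothetical helpers, not part of this repository.

```python
import numpy as np


def make_linear_data(n_samples=256, n_features=8, seed=0):
    """Build random inputs x and targets y = x @ w for a random weight vector w."""
    rng = np.random.RandomState(seed)
    x = rng.rand(n_samples, n_features).astype("float32")
    true_w = rng.rand(n_features, 1).astype("float32")
    y = x.dot(true_w)
    return x, y


def check_keras_training(epochs=5):
    """Fit a single Dense layer on random data and return the per-epoch loss."""
    # Imported here so that a broken Keras/TensorFlow install fails inside
    # this function rather than when the file is loaded.
    from keras.models import Sequential
    from keras.layers import Dense

    x, y = make_linear_data()
    # One Dense unit without activation is a plain linear model.
    model = Sequential([Dense(1, input_shape=(x.shape[1],))])
    model.compile(optimizer="sgd", loss="mse")
    # fit() runs the forward pass and backpropagation; on a GPU build the
    # Dense layer's MatMul is the same kind of op that failed in the trace.
    history = model.fit(x, y, epochs=epochs, batch_size=32, verbose=0)
    return history.history["loss"]
```

Calling `check_keras_training()` in the same conda environment should either return a short list of loss values or raise an error. If it raises the same `Blas GEMM launch failed` error, the problem is likely in the GPU setup rather than in the training notebook.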
Thank you for answering. I have tried reinstalling to check whether it has something to do with CUDA. I also tried installing cudatoolkit and cuDNN before installing TensorFlow by following these steps: Still I have the same problem. Do you have any idea how I can solve this? I have really tried to look it up, and it seems many people have had the same problem, but none of the solutions worked for me. As I am using this as part of my master's thesis, I have limited time as well.
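After a reinstall like this, one quick way to see whether TensorFlow detects the GPU at all is to list the devices it reports. The following is a sketch only, assuming a TensorFlow 1.x install where `tensorflow.python.client.device_lib` is importable; `gpu_device_names` and `list_tensorflow_gpus` are hypothetical helper names.

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is "GPU"."""
    return [d.name for d in devices if d.device_type == "GPU"]


def list_tensorflow_gpus():
    """Ask TensorFlow which local devices it sees and keep only the GPUs."""
    # Imported lazily so that an import failure shows up when this function
    # is called, not when the file is loaded.
    from tensorflow.python.client import device_lib

    return gpu_device_names(device_lib.list_local_devices())
```

An empty list from `list_tensorflow_gpus()` would suggest that TensorFlow is not picking up the GPU at all. A non-empty list combined with the GEMM failure would instead point toward a problem when launching kernels on the GPU, which this check alone cannot diagnose further.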
My training is not starting. I am using Python 3.6 with tensorflow-gpu 1.8.0 and Keras 2.1.2. I also have a GeForce GTX 3060 in my computer, so the hardware shouldn't be a problem. I also installed Norton antivirus on this new computer. On my older computer, which has a bad GPU, I had Panda Dome, and training did run there, but after more than an hour it had only reached 1%. That's why I bought a new computer with a good GPU and CPU. Some of this work is going to be presented in my master's thesis. I would appreciate any help soon.