apex not supporting CUDA 11.0? [Help me] #988
Comments
The latest PyTorch binaries can be installed with CUDA 11.0, as shown in the install instructions. Note that mixed-precision training is available in PyTorch directly. In case you have trouble building apex, you could use a PyTorch NGC container with CUDA 11.1, where PyTorch and apex are both installed. |
@ptrblck CUDA 11.0 supports MIG. Is this feature available in PyTorch, or do you have any tips? I ran into the following error:
|
MIG is not PyTorch-specific and can be enabled on your A100. The error shows that you are using a PyTorch build that doesn't support the compute capability your A100 needs. |
Thanks, I used pip3 to install. I will switch to another method. |
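A quick way to check whether an installed PyTorch build covers a given GPU is sketched below. This is only an illustration: the helper names `parse_sm_arch` and `build_supports_device` are hypothetical, and the check is deliberately simplified (it only looks at `sm_XY` binary entries and ignores PTX-only `compute_XY` entries). It assumes the build's architecture list comes from `torch.cuda.get_arch_list()` and the device capability from `torch.cuda.get_device_capability()`.

```python
import re


def parse_sm_arch(arch):
    """Parse 'sm_80' into (8, 0); return None for entries like 'compute_80'."""
    match = re.fullmatch(r"sm_(\d+)(\d)", arch)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def build_supports_device(arch_list, capability):
    """True if some sm_XY entry shares the device's major version
    and has a minor version no higher than the device's."""
    major, minor = capability
    for arch in arch_list:
        parsed = parse_sm_arch(arch)
        if parsed is None:
            continue  # PTX-only entries are ignored in this simplified check
        if parsed[0] == major and parsed[1] <= minor:
            return True
    return False


if __name__ == "__main__":
    # Only runs the live check when torch and a CUDA device are present.
    try:
        import torch
    except ImportError:
        torch = None
    if torch is not None and torch.cuda.is_available():
        archs = torch.cuda.get_arch_list()
        cap = torch.cuda.get_device_capability(0)
        print("build archs:", archs)
        print("device capability:", cap)
        print("supported:", build_supports_device(archs, cap))
```

An A100 reports compute capability (8, 0), so a build whose list stops at `sm_75` would fail this check.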
How can I tell
Also, would it be possible to provide apex builds on conda-forge for cuda11.0 and cuda11.1? Thank you! |
@stas00 you can try to use |
Awesome!
but no luck building it:
|
I don't see an error message beyond the fact that the build is failing, so I can't tell whether the right CUDA version was found this time. |
There is no option to do that, so I had to hack
So I successfully built apex against the system-wide cuda-11.1 while having pytorch w/ cuda-11.0 installed. Yay! And it works just fine! Thank you, @ptrblck! |
@ptrblck When I installed the apex toolkit, I ran into the problems below:
I searched for the problem on several search engines but found no answer.
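For readers in the same situation (system-wide CUDA toolkit at one version, PyTorch binaries built with another), the sketch below shows one way to compare the two. It is an illustration under stated assumptions: the function names are hypothetical, `SAMPLE_NVCC_OUTPUT` is an example line in the format `nvcc --version` prints rather than output from this thread, and in practice the PyTorch side would come from the `torch.version.cuda` string.

```python
import re

# Illustrative line only; real input would come from running `nvcc --version`.
SAMPLE_NVCC_OUTPUT = "Cuda compilation tools, release 11.1, V11.1.105"


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc's version text."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a release number in nvcc output")
    return int(match.group(1)), int(match.group(2))


def parse_cuda_version(version):
    """Turn a version string such as '11.0' into (11, 0)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def describe_mismatch(system_cuda, torch_cuda):
    """Classify how the system toolkit differs from the one PyTorch was built with."""
    if system_cuda == torch_cuda:
        return "match"
    if system_cuda[0] == torch_cuda[0]:
        return "minor mismatch"
    return "major mismatch"


if __name__ == "__main__":
    system_cuda = parse_nvcc_release(SAMPLE_NVCC_OUTPUT)
    torch_cuda = parse_cuda_version("11.0")  # stand-in for torch.version.cuda
    print(system_cuda, torch_cuda, describe_mismatch(system_cuda, torch_cuda))
```

The combination reported above (system cuda-11.1, pytorch w/ cuda-11.0) falls into the "minor mismatch" case of this sketch.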
Thank you! |
I'm pretty sure you need cuda-11.1 for that - I built Once you have cuda-11.1 installed, follow the notes in #988 (comment) |
Awesome! |
I added a proper solution here: #997 |
Hello, I ran into the same problem as you. Can you tell me how you solved it? Thanks a lot! |
@stas00 Hi, I have the same problem. |
|
@stas00 Successfully installed apex-0.1. Thank you! |
I used this trick and apex installed successfully, but then I ran into this error: |
Hi @stas00, I used your branch but still get the error "nvcc fatal: unsupported gpu architecture 'compute_86'" :( |
@empty-id, make sure you have cuda-11.1 or higher installed and configured correctly - please see: https://huggingface.co/transformers/master/main_classes/trainer.html#possible-problem-2 |
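The "unsupported gpu architecture" message means the installed nvcc predates the requested architecture. A minimal sketch of that relationship follows; the table and the helper `nvcc_supports` are hypothetical and cover only a few architectures, listing the first CUDA toolkit release that accepts each one.

```python
# First CUDA toolkit release that accepts each architecture (partial table).
MIN_CUDA_FOR_ARCH = {
    "compute_70": (9, 0),   # Volta
    "compute_75": (10, 0),  # Turing
    "compute_80": (11, 0),  # Ampere, e.g. A100
    "compute_86": (11, 1),  # Ampere, e.g. RTX 30 series
}


def nvcc_supports(arch, cuda_version):
    """True if a toolkit at cuda_version (major, minor) can target arch."""
    if arch not in MIN_CUDA_FOR_ARCH:
        raise KeyError(f"architecture {arch!r} is not in this partial table")
    return cuda_version >= MIN_CUDA_FOR_ARCH[arch]


if __name__ == "__main__":
    for version in [(11, 0), (11, 1)]:
        print(version, "compute_86 ok:", nvcc_supports("compute_86", version))
```

With this table, cuda-11.0 rejects `compute_86` while cuda-11.1 accepts it, which matches the advice above.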
I have now installed cuda-11.1, but I hit the following problem when running PyTorch code with apex... @stas00
|
Looks like the same error as reported here pytorch/pytorch#47669 (comment) which apparently has been fixed in pytorch many months back. Try pytorch-1.9.0 and if it doesn't work please file a new issue. In general use google to search for similar errors, this is how I got the above url. |
@stas00 Thank you for your reply! I finally got it working. It turns out your hack is not necessary: installing the latest NVIDIA/apex GitHub repo with torch-1.9.0-cuda11.1 works fine with a system-wide cuda11.1. |
Can you explain it in more detail? Create a diff file, copy the code above, and run it. |
My nvcc version is CUDA 11.0, but the latest PyTorch version I found on this website is for CUDA 10.2.
As a result I can't properly install apex.
ImportError: cannot import name 'amp'
Software Versions pre-installed:
I followed these commands:
Importing apex on its own works,
but in the main program it does not:
the apex module fails to import.
Please help me to solve this issue @definitelynotmcarilli @thorjohnsen @mcarilli @kexinyu @ptrblck :)
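When an import works in one place but fails in another, it can help to check which location Python actually resolves the name to. The thread does not confirm the cause here; one possibility (an assumption, not a diagnosis) is that a local directory named apex, such as the cloned repository, shadows the installed package when the main program runs. The helper `locate_module` below is hypothetical and uses only the standard library.

```python
import importlib.util


def locate_module(name):
    """Return the directory or file a top-level module name resolves to, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # nothing importable under this name
    if spec.submodule_search_locations:
        # Packages expose the directory they would be loaded from.
        return list(spec.submodule_search_locations)[0]
    return spec.origin  # plain modules expose their file path


if __name__ == "__main__":
    # Run this from the same working directory as the failing main program.
    print("apex resolves to:", locate_module("apex"))
```

If the printed path points into the working directory instead of site-packages, running the program from a different directory is a quick way to test the shadowing hypothesis.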