Cannot launch more than 65 environments #70
Comments
Hi @Holt59, one possibility is that the port for that environment is already taken. We start with port 5005 (worker_id=0) and increment from there, so I would suggest trying different worker ids. If that doesn't seem to be the issue, another thing to try is adding a wait time between environment launches. We've gotten reports that errors like this can occur when too many Unity processes are launched concurrently.
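A minimal sketch of the two suggestions above, assuming the port is simply 5005 plus worker_id as described. `launch_staggered` and the `make_env` callable are hypothetical names for illustration and not part of the library; `make_env` stands in for whatever constructs one environment for a given worker id.

```python
import time
from typing import Callable, Iterable, List, TypeVar

Env = TypeVar("Env")

BASE_PORT = 5005  # per the comment above: worker_id=0 uses port 5005


def port_for_worker(worker_id: int, base_port: int = BASE_PORT) -> int:
    """Port an environment would use, assuming port = base_port + worker_id."""
    return base_port + worker_id


def launch_staggered(
    make_env: Callable[[int], Env],   # hypothetical factory: worker_id -> environment
    worker_ids: Iterable[int],
    delay_s: float = 1.0,             # wait between launches; tune as needed
    sleep: Callable[[float], None] = time.sleep,
) -> List[Env]:
    """Create environments one at a time, pausing between consecutive launches."""
    envs: List[Env] = []
    for index, worker_id in enumerate(worker_ids):
        if index > 0:
            sleep(delay_s)  # no pause before the first launch
        envs.append(make_env(worker_id))
    return envs
```

Something like `launch_staggered(lambda wid: make_my_env(wid), range(100))` would then start the environments sequentially, where `make_my_env` is whatever call you already use to create a single environment.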
@Holt59 you may be running out of GPU memory. I've only been able to run 2x16 locally (16 per GPU; one is a 1080 with 8 GB, the other a 1060 with 6 GB). In the large-scale curiosity paper they stated they were only able to get 40 Unity environments running (I can't remember whether that was on 4 or 8 GPUs). Also, I use a one-second delay between launching each Unity instance.
@awjuliani I've already checked the port; I'll try adding a delay between launches. @Sohojoe I have a 12 GB K80 and I am only starting the environments, with no extra algorithms running. And as I said, the GPU memory consumption (
@Holt59 did you get around this? I found that some ports were in use on my PC, so I added a hardcoded hack to skip them.
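For anyone wanting to avoid hardcoding the skipped ports, here is a rough sketch of picking worker ids whose ports look free. It assumes the port is 5005 plus worker_id, as mentioned earlier in the thread; `port_is_free` and `free_worker_ids` are hypothetical helpers, not library functions. Note that a port can still be taken between the check and the actual launch, so this only reduces the chance of a collision.

```python
import socket
from typing import Callable, List


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # bind failed, so something else is using the port
        return True


def free_worker_ids(
    count: int,
    base_port: int = 5005,        # assumption: port = base_port + worker_id
    max_worker_id: int = 1000,    # arbitrary upper bound for the search
    is_free: Callable[[int], bool] = port_is_free,
) -> List[int]:
    """Collect up to `count` worker ids whose ports appear to be unused."""
    ids: List[int] = []
    for worker_id in range(max_worker_id):
        if len(ids) == count:
            break
        if is_free(base_port + worker_id):
            ids.append(worker_id)
    return ids
```

The returned ids can be passed as worker ids when creating the environments, in place of a plain `range(n)`.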
@Sohojoe I did not solve this, but I did not look into it much because I ran into other issues. I checked the ports on my computer and nothing was running on them, so I don't think that was the problem.
I tried to launch 100 environments and got a UnityTimeoutException when creating the 66th one. I checked multiple times, and the exception always occurs on the 66th instantiation. I am using gcloud with a K80 GPU, and the memory usage is less than the available memory.