VK_ERROR_OUT_OF_HOST_MEMORY on windows profiling vulkan #15

Open
farnoy opened this issue Jun 1, 2018 · 9 comments

Comments

@farnoy

farnoy commented Jun 1, 2018

Hi, I have a problem that's unique to Windows 10 x64: after enabling profiling in the Developer Panel, my app cannot start up and fails with VK_ERROR_OUT_OF_HOST_MEMORY. Without profiling, everything runs fine. The error occurs in vkCreateDevice. I think the problem might be related to the queues I request. If I request only one graphics queue, vkCreateDevice succeeds, but if I request 1 graphics, 3 compute, and 1 transfer queue, I get the error.

I am only enabling VK_KHR_swapchain, and the error occurs even if I don't select anything in VkPhysicalDeviceFeatures. I have tested with and without validation layers. My application requests Vulkan 1.1, although IIRC I also had this problem with 1.0. I have stopped overlays and tools like RTSS, Steam, and RenderDoc, but nothing helps.

On Linux, everything runs fine (I just need to point VK_ICD_FILENAMES at AMDVLK, because I use RADV by default), but I have not checked whether the driver advertises a different set of queue families and capacities there. My code is fairly flexible in this regard.
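
One way to rule out a queue-count mismatch between the two drivers is to check the request against what each driver advertises before calling vkCreateDevice. The sketch below models that check in plain Python rather than calling Vulkan: `QueueFamily` and `check_request` are hypothetical names, and the family table is made up for illustration, not read from an RX Vega 64. In a real app the data would come from vkGetPhysicalDeviceQueueFamilyProperties. It only encodes the rule that the queueCount requested for a family must be at least 1 and must not exceed the queueCount that family advertises.

```python
from dataclasses import dataclass

# Vulkan queue capability bits (values from the Vulkan headers).
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2
VK_QUEUE_TRANSFER_BIT = 0x4


@dataclass(frozen=True)
class QueueFamily:
    """Hypothetical stand-in for one VkQueueFamilyProperties entry."""
    flags: int
    queue_count: int


def check_request(families, request):
    """Return a list of problems with `request`, a dict mapping
    queue family index -> number of queues wanted from that family."""
    problems = []
    for index, wanted in sorted(request.items()):
        if index < 0 or index >= len(families):
            problems.append(f"family {index} is not advertised")
        elif wanted < 1:
            problems.append(f"family {index}: queueCount must be at least 1")
        elif wanted > families[index].queue_count:
            problems.append(
                f"family {index}: wanted {wanted}, "
                f"driver advertises {families[index].queue_count}"
            )
    return problems


# Made-up family table, for illustration only.
example_families = [
    QueueFamily(VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT, 1),
    QueueFamily(VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT, 2),
    QueueFamily(VK_QUEUE_TRANSFER_BIT, 1),
]

# 1 graphics, 3 compute, 1 transfer, as in the failing configuration.
print(check_request(example_families, {0: 1, 1: 3, 2: 1}))
```

An empty list means the request fits the advertised families. If the request passes this check and vkCreateDevice still fails only with profiling enabled, the request itself is a less likely culprit.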

Configuration:

  • Windows 10 Enterprise N x64 1803, OS build 17134.48
  • RX Vega 64
  • Threadripper 1950X
  • LunarG SDK 1.1.73
  • RGP 1.2.0.21 (on Windows, some other 1.2 release on Linux)
  • GPU Driver 18.5.2

RDP log:

[RDP] Received client connected from unknown client with id 6800.
[RDP] Received client halted from unknown client with id 6800.
[RDP] Processing halted client with id 6800: v4.exe:11144 - AMD Vulkan Driver
[RDP] Updated v4.exe ClientId to 6800
[RDP] Connected DriverControlClient to process 'v4.exe', ProcessId = 11144
[RDP] Filtered halted process with ProcessId = 11144
[RDP] Enabled profiling for target executable 'v4.exe', ProcessId = 11144.
[RDP] Set profiling flag for ProcessId = 11144 to true.
[RDP] Capture profile button is enabled because the target application is profilable and there is no profile in progress.
[RDP] Found 0 settings.
[RDP] Resumed execution of process 'v4.exe', ProcessId = 11144. Disconnect client.
[RDP] Wait for driver initialization in process 'v4.exe' failed.
[RDP] Attempted to disconnect from DriverControlClient that was already disconnected.
[RDP] Client with Id 6800 has disconnected.
[RDP] Capture profile button has been disabled because the application is not profilable.
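
The line that matters in this log is the failed wait for driver initialization. A minimal sketch for pulling such lines out of a longer RDP log follows. `find_failures` is a hypothetical helper, and matching on the word "failed" is an assumption based only on the lines shown here, not on any documented RDP log format.

```python
import re

# Matches the "ProcessId = 1234" fragments seen in the log above.
PID_RE = re.compile(r"ProcessId = (\d+)")


def find_failures(log_text):
    """Return (line_number, process_id_or_None, message) for each [RDP]
    line that contains 'failed' (case-insensitive)."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if not line.startswith("[RDP]"):
            continue
        if "failed" not in line.lower():
            continue
        match = PID_RE.search(line)
        pid = int(match.group(1)) if match else None
        hits.append((number, pid, line[len("[RDP]"):].strip()))
    return hits


sample = """\
[RDP] Enabled profiling for target executable 'v4.exe', ProcessId = 11144.
[RDP] Resumed execution of process 'v4.exe', ProcessId = 11144. Disconnect client.
[RDP] Wait for driver initialization in process 'v4.exe' failed.
"""

for hit in find_failures(sample):
    print(hit)
```
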
@ahosier
Contributor

ahosier commented Aug 31, 2018

Hi, Thanks for the feedback. Could you provide a very simple test app that duplicates the issue you're seeing?

Also, try updating your driver to the latest 18.8.2 and get the latest RGP 1.3. There have been a number of changes that may fix this issue.

Thanks,
Tony.

@farnoy
Author

farnoy commented Aug 31, 2018

Hey, I re-tested on 18.8.2 with RGP 1.3 and I get the same issue. As soon as I enable profiling in RDP, my app crashes on vkCreateDevice. When I disable profiling, it starts working correctly again.

Using the new DebugUtils extension, I get this in my logs: [ Loader Message ] ERROR & GENERAL => terminator_CreateDevice: Failed in ICD C:\WINDOWS\System32\DriverStore\FileRepository\c0332601.inf_amd64_5beeaaa0c940e99c\B332635\.\amdvlk64.dll vkCreateDevicecall

I think this may be related to a similar issue I had in baldurk/renderdoc#1078, where replaying a capture would also crash with the exact same error.

As for reproducing, maybe you could try opening the capture I uploaded in that issue? I just checked and can still reproduce that error. Interestingly, when I reboot my machine and replay that capture without doing anything else, it works on first try. When I close and re-open the same capture, it starts failing with that error.

As for reproducing this with my app, I could send you the source code, but it does not use a common Visual Studio build, so it might be a hassle for you. If you have a secure sandbox, I could send you the binary; maybe that would be easier?

I don't know if something is wrong with my Vulkan SDK or Radeon driver installation. I tried reinstalling both and starting RDP with an explicit VK_LAYER_PATH, but nothing seems to make a difference here (and it did for that RenderDoc issue).

@ahosier
Contributor

ahosier commented Sep 4, 2018

Hi Farnoy,

I've re-read your initial post and talked to others here. We're actively working on a fix for RGP capture with multiple compute queues, which will ship in a new driver release shortly. If you would still like to share your application, I've added a Dropbox share here: https://www.dropbox.com/home/RGP-farnoy. We can test with the internal driver we have and let you know if it fixes the issue you're seeing.

Thanks,
Tony.

@farnoy
Author

farnoy commented Sep 5, 2018

Hey @ahosier,

I started using compute queues recently, so this would fit what you're saying. I uploaded Linux and Windows x64 binaries and an asset that the app uses. The working app should render a static imgui window and a bunch of helmets; you can press G to disable camera movement.

My app is hardcoded to request all compute queues available, and always submits to 3 of them in the Windows version, 4 in the Linux one. This is how many I have on RX Vega 64 and I didn't bother making this dynamic yet. Let me know if that's a problem for you.
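
Making the count dynamic could be as small as clamping the desired number of compute queues to what the dedicated compute family advertises. The sketch below models this in plain Python rather than calling Vulkan. `pick_compute_queue_count` is a hypothetical helper, the `(flags, count)` tuples stand in for VkQueueFamilyProperties entries, and the two family tables are assumptions that only reuse the counts 3 and 4 quoted above.

```python
# Vulkan queue capability bits (values from the Vulkan headers).
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2


def pick_compute_queue_count(families, wanted):
    """families: list of (queue_flags, queue_count) tuples, one per family.

    Returns (family_index, count) for the first family that supports
    compute but not graphics, with count clamped to what that family
    advertises, or None if there is no such family."""
    for index, (flags, available) in enumerate(families):
        has_compute = bool(flags & VK_QUEUE_COMPUTE_BIT)
        has_graphics = bool(flags & VK_QUEUE_GRAPHICS_BIT)
        if has_compute and not has_graphics and available > 0:
            return index, min(wanted, available)
    return None


# Hypothetical family tables: 0x7 = graphics|compute|transfer, 0x6 = compute|transfer.
windows_like = [(0x7, 1), (0x6, 3)]
linux_like = [(0x7, 1), (0x6, 4)]

print(pick_compute_queue_count(windows_like, 4))
print(pick_compute_queue_count(linux_like, 4))
```

With this shape the same binary would submit to 3 queues on one table and 4 on the other, instead of carrying a per-platform constant.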

Please keep me updated when a fix arrives!

Thank you,
Jakub

@natevm

natevm commented Mar 13, 2019

I seem to have a similar error on my RX 560 on Windows:

validation layer: terminator_CreateDevice: Failed in ICD C:\windows\System32\DriverStore\FileRepository\u0339878.inf_amd64_c30429afa55bc85b\B339766.\amdvlk64.dll vkCreateDevicecall
validation layer: vkCreateDevice: Failed to create device chain.
RuntimeError: vk::PhysicalDevice::createDevice: ErrorOutOfHostMemory

This only occurs while I'm running the Radeon Developer Panel. If I close the Radeon Developer Panel, vkCreateDevice succeeds. Other tools like RenderDoc also crash when "settings > core > Enable Radeon GPU Profiler Integration" is enabled.

My application only uses one graphics/present queue, and no compute queues.

I seem to be able to start Sascha's examples with the profiler attached, but I'm unable to capture any profiles...

@gselley
Contributor

gselley commented Mar 13, 2019

Hi, please make sure you are using the latest AMD driver (19.3.1), and that you do NOT have another GPU in your system. If you are running Intel integrated graphics, please disable it in the Device Manager. Currently, RGP only reliably supports the presence of a single GPU (two AMD GPUs can cause issues too, BTW; it's not a vendor thing).

@natevm

natevm commented Mar 17, 2019

@gselley Disabling my integrated graphics seems to prevent GLFW from initializing a window for some unknown reason. Might be a limitation of my e-GPU/laptop development environment.

Why is RGP only reliable in the presence of a single GPU? That makes developing on a laptop very difficult. It seems like something other frame capture tools have been able to account for, although perhaps more detailed profiling makes this harder.

@jgavert

jgavert commented Jul 1, 2019

Hello, is there an update on this?

I'm also running into this problem on both Vulkan and D3D12 since I got an RTX 2070 in addition to my RX 480.
It would be nice if this was fixed, as it's really convenient to keep multiple cards in one computer instead of using multiple computers. It's also easy to choose which card to use through the interfaces the APIs provide.

@chesik-amd
Contributor

There are still Known Issues related to profiling applications on systems with more than one GPU.
