Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass the memory handler and filter config to exclude cuda transport #130

Merged
merged 1 commit into from
Nov 16, 2023

Conversation

tvegas1
Copy link
Collaborator

@tvegas1 tvegas1 commented Oct 27, 2023

Due to parallel cuda kernel running, UCX plugin must not try to use cuda primitives to manipulate passed cuda memory. To do so we:

  1. Disable cuda transport alias altogether
  2. Don't try to resolve the cuda memory chunk to the whole cuda memory area
  3. Force RNDV to avoid the need for explicit memory copy (eager, rx path)
  4. Pass memory handle to allow memory details to be looked-up instead

@tvegas1
Copy link
Collaborator Author

tvegas1 commented Oct 27, 2023

pls look @brminich, @yosefe

src/ucx_plugin.c Outdated Show resolved Hide resolved
Copy link
Collaborator

@brminich brminich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but can we make it configurable just in case? I.e. have a single NCCL var disabling or enabling cuda support in UCX plugin

@tvegas1
Copy link
Collaborator Author

tvegas1 commented Oct 27, 2023

lgtm, but can we make it configurable just in case? I.e. have a single NCCL var disabling or enabling cuda support in UCX plugin

as discussed, added complete bypass config possibility to prevent blocking any further testing down the line.

@tvegas1 tvegas1 force-pushed the dev/no_cuda branch 2 times, most recently from 977ec32 to f1c3976 Compare November 1, 2023 17:00
@tvegas1 tvegas1 force-pushed the dev/no_cuda branch 2 times, most recently from ee3f61f to bb79eda Compare November 14, 2023 15:48
src/ucx_plugin.c Show resolved Hide resolved
@bureddy bureddy merged commit a0432ad into Mellanox:master Nov 16, 2023
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants