Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPU implementation of hamming distance #541

Open
wants to merge 118 commits into
base: main
Choose a base branch
from

Conversation

felixpetschko
Copy link
Collaborator

Hamming distance implementation with numba.cuda for GPU support.
This is built on top of the changes in Hamming distance implementation with Numba #512

felixpetschko and others added 30 commits April 29, 2024 13:28
@grst
Copy link
Collaborator

grst commented Nov 4, 2024

Hi @felixpetschko, what's the status here? Do you need anything from myself or Severin?

I've seen you switched to Cupy, could you elaborate how that compares to the numba implementation?

@felixpetschko
Copy link
Collaborator Author

Hi @grst! I am mainly done with my implementation here. Currently the speedup on my laptop for 1 million cells for the ir_dist function with hamming is at around 10 (45 vs. 480 seconds) compared to the new fast numba CPU implementation (and probably >100 compared to the original CPU implemenation). I think this is also the maximum speedup I would aim at for now, because there are currently some sequential parts (1 cpu) in the ir_dist function besides the hamming GPU kernel and the upstream processing for reading and preparing the data takes already longer anyway. So further optimization of the hamming kernel wouldn't be very effective.

My plan would be to prepare a pull request that is ready for review over the next days.

The reasons for switching to CuPy were the following:
Numba cuda only provides limited cuda features and you never really know how the numba code is mapped to the cuda features internally. However CuPy allows you to write the cuda kernels directly in C++ and offers all cuda features (that i have seen so far). Also most online ressources about GPU programming are about cuda kernels written in C++ and numba cuda is very niche. Also other programmers that might look at the code in the future will probably only know C++ cuda kernels (if they already did GPU programming). When doing GPU programming I use profiling tools to find out what is actually going on at the hardware and what the compiler did, so having this additional abstraction level with numba can actually be a nightmare.

@grst
Copy link
Collaborator

grst commented Nov 5, 2024

Makes sense, thanks!
Still not sure if this should be merged in to scirpy or into rapids-singlecell then. @Intron7, I'll also bring that up at the next core dev meeting what's our long-term strategy here.

@Intron7
Copy link
Member

Intron7 commented Nov 8, 2024

Hey @felixpetschko can you send me a larger dataset to test this? I have some ideas and want to see if this works.

@grst
Copy link
Collaborator

grst commented Nov 16, 2024

Still not sure if this should be merged in to scirpy or into rapids-singlecell then. @Intron7, I'll also bring that up at the next core dev meeting what's our long-term strategy here.

@felixpetschko, the outcome of this discussion was that the function stays here, and we'll setup a GPU CI for scirpy. @ilan-gold or @flying-sheep can help with that once this PR is ready.

@grst grst changed the title Hamming distance implementation with numba.cuda (GPU) GPU implementation of hamming distance Nov 16, 2024
Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 53.71622% with 137 lines in your changes missing coverage. Please review.

Project coverage is 79.44%. Comparing base (08e0cc3) to head (8315e2f).
Report is 22 commits behind head on main.

Files with missing lines Patch % Lines
src/scirpy/ir_dist/metrics.py 4.44% 129 Missing ⚠️
src/scirpy/ir_dist/_clonotype_neighbors.py 95.52% 6 Missing ⚠️
src/scirpy/ir_dist/__init__.py 50.00% 1 Missing ⚠️
src/scirpy/ir_dist/_util.py 95.83% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #541      +/-   ##
==========================================
- Coverage   81.43%   79.44%   -2.00%     
==========================================
  Files          49       50       +1     
  Lines        4213     4525     +312     
==========================================
+ Hits         3431     3595     +164     
- Misses        782      930     +148     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Intron7 Intron7 added the run-gpu-ci runs GPU CI label Dec 5, 2024
@grst
Copy link
Collaborator

grst commented Dec 10, 2024

@Intron7, Appears phil created documentation for the gpu ci: https://github.com/scverse/governance/blob/main/developer/gpu_ci.md

Maybe this helps?

@Intron7
Copy link
Member

Intron7 commented Dec 10, 2024

@grst I talked to @Zethson and he adjusted the setting in cirun. The test is now running but failing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-gpu-ci runs GPU CI
Projects
Status: In progress
Development

Successfully merging this pull request may close these issues.

3 participants