Improve parallelism of ir_dist #473
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #473      +/-   ##
==========================================
- Coverage   80.52%   80.50%   -0.02%
==========================================
  Files          49       49
  Lines        3994     4012      +18
==========================================
+ Hits         3216     3230      +14
- Misses        778      782       +4
* move block_size param to `calc_dist_mat` function
* dynamically select block size based on problem size
* Use tqdm + joblib.Parallel + joblib.delayed
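For readers unfamiliar with the pattern named in the commits above, here is a minimal sketch of block-wise parallel distance computation with `joblib.Parallel`, `joblib.delayed` and `tqdm`. It is not the code from this PR: `choose_block_size`, its "about four blocks per worker" heuristic, the `hamming` helper, and this toy `calc_dist_mat` signature are all assumptions made for illustration.

```python
import math

from joblib import Parallel, delayed
from tqdm import tqdm


def choose_block_size(n_items, n_jobs, min_size=1, max_size=5000):
    """Hypothetical heuristic: aim for roughly four blocks per worker."""
    target = math.ceil(n_items / (4 * max(n_jobs, 1)))
    return max(min_size, min(max_size, target))


def hamming(a, b):
    """Hamming distance between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def _block(seqs, start, end):
    """Distances from rows [start, end) to every sequence."""
    return [[hamming(seqs[i], s) for s in seqs] for i in range(start, end)]


def calc_dist_mat(seqs, n_jobs=1, block_size=None):
    """Toy stand-in (assumed signature) that computes a dense distance matrix."""
    if block_size is None:
        block_size = choose_block_size(len(seqs), n_jobs)
    starts = range(0, len(seqs), block_size)
    # tqdm wraps the task generator, so the bar tracks task dispatch,
    # which only approximates completion.
    blocks = Parallel(n_jobs=n_jobs)(
        delayed(_block)(seqs, s, min(s + block_size, len(seqs)))
        for s in tqdm(starts, total=len(starts), desc="blocks")
    )
    # Parallel returns results in submission order, so rows stay aligned.
    return [row for block in blocks for row in block]
```

Calling `calc_dist_mat(["AAA", "AAT", "TTT"], n_jobs=2)` would split the rows into blocks and hand each block to a worker. Choosing the block size from the problem size keeps every worker busy on small inputs while limiting per-task overhead on large ones.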
In general this works well (also out-of-machine computing with dask), but:
The reason was that the worker threads all import scirpy from scratch. It turned out to be quite slow because of
@felixpetschko FYI: In this PR I change how the But I think your reimplementation of the Hamming distance calculator doesn't need this anyway.
Closes #468