Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement merge shuffle algorithm for Numba #991

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

zengraf
Copy link
Collaborator

@zengraf zengraf commented Jan 17, 2023

This is an implementation of a parallel shuffling algorithm called Merge Sort and is a follow-up to #964.

The algorithm

  1. It splits the input in halves until the slices have at most 0x100000 (1048576) elements
  2. For each slice it runs in parallel the original Fisher-Yates algorithm, which has been moved to a separate function
  3. It runs in parallel the merge operation from the paper

Please, note that merging step requires n * Math.ceil(log2(n / 1048576)) more random numbers

@slayoo
Copy link
Member

slayoo commented Jan 18, 2023

Thank you, @zengraf!
CC: @piotrbartman

I suggest to:

  • make it an option to use MargeShuffle (Collision dynamic __init__ method seems a good place to set it)
  • raise a NotImplementedError() in the GPU backend if this option is chosen
  • add a test which checks how it works with and without the new option set

@slayoo
Copy link
Member

slayoo commented Jan 25, 2023

While looking at the changes today with @abulenok (trying to understand the reasons for the breakup test failures), we came up with the idea to create a unit test aimed at checking if the generated permutations have indeed a uniform distribution - would be a great addition!

@abulenok
Copy link
Collaborator

Hey @piotrbartman,
Originally in shuffle_local, we selected a random element from the whole array in each iteration.
https://github.com/atmos-cloud-sim-uj/PySDM/blob/ffea070f312b5e5491e988c97d3ee7924c9f80ea/PySDM/backends/impl_numba/methods/index_methods.py#L29
Should we select a random element from the i + 1 - start subarray, similar to how it is done in shuffle_global ?https://github.com/atmos-cloud-sim-uj/PySDM/blob/ffea070f312b5e5491e988c97d3ee7924c9f80ea/PySDM/backends/impl_numba/methods/index_methods.py#L21

@piotrbartman
Copy link
Collaborator

Yes, definitely. The current implementation of shuffle_local seems to be buggy and we don't have a uniform distribution there.

@slayoo
Copy link
Member

slayoo commented Jan 29, 2023

@piotrbartman Thanks for following up!
@abulenok Thanks for spotting it and clarifying the issue here!
@zengraf Thanks for the new implementation - without it, the old bug would not be spotted!
I suggest to first add a new unit test that will fail for non-uniform permutation distributions.

@github-actions
Copy link

Stale pull request message

@abulenok abulenok force-pushed the feature/merge-shuffle-numba branch from bc0ba70 to 8e94031 Compare May 16, 2023 21:33
@github-actions
Copy link

Stale pull request message

@github-actions
Copy link

Stale pull request message

Copy link

github-actions bot commented Dec 5, 2023

Stale pull request message

Copy link

github-actions bot commented Feb 6, 2024

Stale pull request message

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants