
Add max_threads_per_process and mp_context to pca by channel computation and PCA metrics #3434

Merged: 6 commits, Sep 27, 2024

Conversation

@alejoe91 (Member) commented Sep 23, 2024

No description provided.

@alejoe91 alejoe91 added postprocessing Related to postprocessing module concurrency Related to parallel processing labels Sep 23, 2024
@zm711 (Collaborator) commented Sep 23, 2024

After the tests pass I can test this in the lab. PCA is so brutal on Windows! Hopefully setting this gets us a speed boost too (although it could just be a Windows problem).

@alejoe91 (Member, Author) commented

While I'm at it, I'll propagate the same trick to the PCA metrics.

For the PCA computation, I'm getting a 5-6x speed-up on Linux!

@alejoe91 alejoe91 changed the title Add max_threads_per_process to pca fit_by_channel Add max_threads_per_process and mp_context to pca by channel computation and PCA metrics Sep 23, 2024
@alejoe91 alejoe91 marked this pull request as ready for review September 24, 2024 08:20
@alejoe91 (Member, Author) commented

@zm711 let me know if this speeds things up on Windows! @jonahpearl can you also give it a try to see if it fixes the PCA hanging?

@zm711 (Collaborator) commented Sep 24, 2024

I'm in a seminar this morning; I'll test this on an old dataset this afternoon :)

@zm711 (Collaborator) commented Sep 24, 2024

Testing now: it seems like we don't get a speed-up at all on Windows. What version of sklearn do you have? No matter what I set for n_jobs, all processors still run at 100% activity. I also tried varying max_threads_per_process between 2 and 4 and saw no difference in calculation time. So I think this is either an OS issue (Windows is just slow for this) or maybe an environment issue.

ADDITION: since I never had the hanging issue, I don't know whether this fixes that!

@alejoe91 (Member, Author) commented

Mmm, interesting! Have you tried setting max_threads to 1?

@zm711 (Collaborator) commented Sep 24, 2024

> Mmm, interesting! Have you tried setting max_threads to 1?

Nope, but I can try first thing tomorrow morning when I'm back at the Windows station.

@samuelgarcia (Member) commented

OK for me.

@zm711 (Collaborator) commented Sep 25, 2024

Same speed with 1 thread, and it still uses all cores to the max. I think this must be a Windows scheduler thing: it doesn't make sense to bounce processes around between different processors, but that must be what's happening.
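One quick way to check whether the thread limit is actually reaching the native BLAS/OpenMP pools (as opposed to the OS scheduler ignoring it) is `threadpoolctl`'s `threadpool_info()`. A diagnostic sketch, assuming `threadpoolctl` is installed (it ships as a dependency of scikit-learn):

```python
import numpy as np
from threadpoolctl import threadpool_info, threadpool_limits

# Force a BLAS call so the native thread pools are actually loaded.
np.dot(np.ones((64, 64)), np.ones((64, 64)))

# Report the thread count of each loaded pool (e.g. OpenBLAS, MKL, OpenMP),
# first unrestricted, then under a limit of 1 thread.
print([(p["user_api"], p["num_threads"]) for p in threadpool_info()])
with threadpool_limits(limits=1):
    print([(p["user_api"], p["num_threads"]) for p in threadpool_info()])
```

If the second line still reports more than one thread per pool, the limit is not taking effect; if it reports 1 and all cores are still pegged, the load is coming from somewhere other than the BLAS pools.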

@alejoe91 alejoe91 added this to the 0.101.2 milestone Sep 27, 2024
@alejoe91 alejoe91 merged commit 3df1083 into SpikeInterface:main Sep 27, 2024
15 checks passed
@jonahpearl (Contributor) commented

No luck for me, but I do have an update, which I'll put in the issue.
