Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new imglib2 parallelization approach #83

Open
wants to merge 21 commits into
base: master
Choose a base branch
from
Open

Use new imglib2 parallelization approach #83

wants to merge 21 commits into from

Conversation

maarzt
Copy link
Contributor

@maarzt maarzt commented Sep 19, 2019

In this PR I change imglib2-algorithm to use the parallelization framework introduced in imglib/imglib2#269! I want to find out it the parallelization framework is useful. And that's the result:

Yes! 🎉 All multi-threading in imglib2-algorithm can be replaced by the parallelization framework.

Most multi-threading code can be replaced with multi-threaded LoopBuilder or IterableLoopBuilder. This also leads to extreme simplification of the code.
IterableLoopBuilder is part of this PR, it's like LoopBuilder but has an IterableInterval as first argument.

Some methods, that where previously single threaded, now run multi-threaded. It's still possible to execute them single threaded, use Parallelization.runSingleThreaded( () -> ... ). We might either just stick with this. Or give multi-threaded functions a different name. And provide a backward function.
These affected methods are:

ConnectedComponents.labelAllConnectedComponents( final RandomAccessible< T > input, final ImgLabeling< L, I > labeling, final Iterator< L > labelGenerator, final StructuringElement se )
ConnectedComponents.labelAllConnectedComponents(final RandomAccessible< T > input, final RandomAccessibleInterval< L > output, final StructuringElement se )
TensorEigenValues.calculateEigenValuesSymmetric(final RandomAccessibleInterval< T > tensor, final RandomAccessibleInterval< U > eigenvalues )
TensorEigenValues.calculateEigenValuesSquare(final RandomAccessibleInterval< T > tensor,	final RandomAccessibleInterval< U > eigenvalues )
TensorEigenValues.calculateEigenValues(final RandomAccessibleInterval< T > tensor, final RandomAccessibleInterval< U > eigenvalues, final EigenValues< T, U > ev )
LocalExtrema.findLocalExtrema(final RandomAccessibleInterval< T > source, final LocalNeighborhoodCheck< P, T > localNeighborhoodCheck )
LocalExterma.findLocalExtrema(final RandomAccessibleInterval< T > source, final LocalNeighborhoodCheck< P, T > localNeighborhoodCheck, final Shape shape )
LocalExtrema.findLocalExtrema(final RandomAccessible< T > source,final Interval interval,final LocalNeighborhoodCheck< P, T > localNeighborhoodCheck )
LocalExtrema.findLocalExtrema(final RandomAccessible< T > source,final Interval interval,final LocalNeighborhoodCheck< P, T > localNeighborhoodCheck,final Shape shape )
SubpixelLocalization.refinePeaks(final List< P > peaks, final RandomAccessible< T > img, final Interval validInterval, final boolean returnInvalidPeaks,final int maxNumMoves, final boolean allowMaximaTolerance, final float maximaTolerance, final boolean[] allowedToMoveInDim )
DistanceTransform.transform(...) (6 diffenrent versions)
DistanceTransform.binaryTransform(...) (5 different versions)

The work isn't finished yet: classes Erosion, Dilation, Opening, Closing, TopHat, BlackTopHat could have methods without the numThread parameter.

Overall this PR shows that imglib2-algorithm can be greatly simplified using the new parallelization framework. It reduces code duplication, because for many algorithms there is a single-threaded and multi-threaded version, which is no longer needed.

@@ -34,6 +34,7 @@
/**
* Set the {@link ExecutorService} to be used for convolution.
*/
@Deprecated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without having studied the code too closely: does this work? I.e., if some existing downstream code is using setExecutor, won't that executor now be ignored, instead of used when executing? Or did I miss where this is handled in some descendant class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method now works again.

@tpietzsch
Copy link
Member

I didn't look at it in detail yet, but I think the IterableLoopBuilder belongs in core and should be a separate PR

@imagejan
Copy link
Member

imagejan commented Nov 4, 2020

What's the state of this PR? Can we have IterableLoopBuilder in imglib2 core?

@maarzt
Copy link
Contributor Author

maarzt commented Nov 10, 2021

This PR requires imglib/imglib2#310 (IterableLoopBuilder) to be merged an released first.

@maarzt
Copy link
Contributor Author

maarzt commented Nov 10, 2021

The imglib2 Parallelization context + multithreaded LoopBuilder allow to conveniently implement multi-threading into an imglib2 algorithms. (As shown in this PR) There's no need to have ExecutorService or numTasks parameter. The method signatures remain clean. But this raises an important question? How to deal with existing single threaded methods, should we make the multi-threaded?

Possible answers:

  1. Make these methods multi-threaded. They can still be executed single-threaded by using Parallelization.runSingleThreaded(...). Consequence: Exiting code will be multi-threaded, and by default run faster. But also some existing code might run slower because of the multi-threading overhead or if too much multi-threading happens in parallel. Additionally some code with thread-unsafe RandomAccesses might break.
  2. Keep the existing methods single-threaded. Add multi-threaded alternatives. Consequence: Nothing breaks. Performance stay constant. But also no speed up for existing code. People might not use the multi-threaded alternatives, because they are not aware of them.

@tpietzsch @axtimwalde @StephanPreibisch What are your thoughts?

@tpietzsch
Copy link
Member

Also discussed gitter: https://gitter.im/imglib/imglib2?at=618bd07238377967f4a42c73

copy&paste from there:

Curtis Rueden @ctrueden Nov 10 16:10
@maarzt Awesome, thanks for all your work on this!
For what to do with existing algorithms: I think it depends on whether multithreading them would change the result.
If you arrive at the same deterministic answer as before, then I vote to definitely switch them over to multithreading.
If the result changes, but is still deterministic, you could add new method sig(s) and deprecate the old one(s).
If the result of multithreading it becomes non-deterministic, that's a problem for Ops and probably others using the algorithm, so we should be careful there.

Jean-Yves Tinevez @tinevez Nov 10 16:51
Awesome !

Tobias Pietzsch @tpietzsch Nov 10 17:36
I would also change to multi-threading by default. (or rather using current Parallelization context by default). I would prehaps keep the single-threaded versions under an explicit name (e.g. PartialDerivative.gradientForwardDifferenceSingleThreaded) for things that are possibly run on tiny things where even querying the Parallelization ThreadLocal could be considered too much overhead. These "kernel methods" are perhaps anyway used as building blocks for the multi-threaded versions, and I don't see a big downside to exposing them.

Tobias Pietzsch @tpietzsch Nov 10 17:41
But multi-threaded by default is good
@axtimwalde @StephanPreibisch opinions?

@axtimwalde
Copy link
Member

Auto-multi-threading can be bad when parallelizing things with Spark. The Parallelization context will have unintended consequences when several tasks run in the same JVM (which is typical). Am I overthinking this?

@tpietzsch
Copy link
Member

Auto-multi-threading can be bad when parallelizing things with Spark. The Parallelization context will have unintended consequences when several tasks run in the same JVM (which is typical). Am I overthinking this?

Maybe I'm "underthinking" it, but wouldn't it be sufficient to then just use Parallelization.runSingleThreaded(...)around each task?

@maarzt
Copy link
Contributor Author

maarzt commented Nov 12, 2021

Hi @axtimwalde, I hope the parallelization context will be useful also for your code that runs on Spark. It is actually intended for those use cases. Where a part of the code should run single threaded. And no worries it's thread safe. It uses ThreadLocal to store one configuration per thread.

As Tobi suggested, all you need to do is to wrap the tasks that you execute one Spark into a call of Parallelization.runSingleThreaded(...). I hope it will for you. We otherwise need to fix the Parallelization context. I'm happy to look at any example where it breaks, or where you think it would break.

@axtimwalde
Copy link
Member

May be I am not understanding this well enough. However, Parallelization.runSingleThreaded(...) calls TaskExecutors.singleThreaded(...) which returns a singleton executor for the entire JVM. That would mean that all parallel Spark tasks running on the same JVM now run sequentially instead of in parallel. This would not be right. It looks like Spark's and Parallelization's task management are trying to do the same thing in incompatible ways. But again, may be I am misinterpreting something here, haven't spent enough time, sorry.

@maarzt
Copy link
Contributor Author

maarzt commented Nov 12, 2021

However, Parallelization.runSingleThreaded(...) calls TaskExecutors.singleThreaded(...) which returns a singleton executor for the entire JVM.

Yes that is correct. TaskExecutors.singleThreaded() returns a singleton executor. This executor is the only instance of the class SequentialTaskExecutor. But this won't cause any trouble. The class SequentialTaskExecutor has no fields and therefor no internal state or state changes. One could have multiple instances of SequentialTaskExecutor, it wouldn't make no difference. They would all behave exactly the same at any time... I wanted to make the overhead of calling TaskExecutors.singleThreaded() as small as possible. Returning an singleton is faster than creating new instances.

By the way, the same is true for TaskExecutors.multiThreaded() and the respective ForkJoinExecutorService. There is always only one instance, no fields, no internal state, no need to create multiple instances.

That would mean that all parallel Spark tasks running on the same JVM now run sequentially instead of in parallel.

No. The SequentialTaskExecutor isn't that powerful. Let us assume Spark is configured to start 8 threads in a JVM. Calling Parallelization.runSingleThreaded(...) won't change anything about this. All it does is preventing the imglib2 algorithms from spawning new threads. And I assume that's exactly what you need.

@ctrueden
Copy link
Member

@maarzt @tpietzsch @axtimwalde Is this PR still relevant? It would be great to get this library using the unified multithreading approach!

@axtimwalde
Copy link
Member

I think this is still relevant and looks good if running tasks SequentialTaskExecutor.getInstance() from multiple threads will not force-sequentialize them.

@maarzt
Copy link
Contributor Author

maarzt commented Nov 17, 2022

I will go over this PR, check everything, and prepare it to be merged.

maarzt added 11 commits December 6, 2022 15:30
This simplifies the existing implementation while staying backwards
compatible. The code adds a new method:

 Thresholder.threshold( source, threshold, above )

That is multi-threaded according to the Parallelization context.
…lib2 Parallelization context

Add new methods DifferenceOfGaussian.DoG that don't require an
ExecutorService but use the Parallelization context.

Replace the old implementation with multi-threaded LoopBuilder.
No new methods are added. Existing methods behave the same, but can be
configured using the Parallelization context.
@maarzt
Copy link
Contributor Author

maarzt commented Dec 7, 2022

Ok, I added tests and updated the javadoc. I would say, this PR is ready to be merged!

@maarzt maarzt requested a review from ctrueden December 7, 2022 13:42
@maarzt maarzt marked this pull request as ready for review March 10, 2023 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants