Use parallel processing #140

Closed
wants to merge 1 commit into from

Conversation

kjmeagher
Member

This changes the server to send a list of several p-frames to the clients. The clients process the frames in parallel using Python's multiprocessing module. The tables used by photospline are shared between the processes. The number of parallel trays can be controlled with the new option --parallel-trays.
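For illustration only, a minimal sketch of the kind of fan-out described above, using multiprocessing.Pool; the run_reco_tray helper and the file naming here are hypothetical stand-ins, not the PR's actual code:

```python
import pickle
from multiprocessing import Pool
from pathlib import Path


def run_reco_tray(pframe):
    """Hypothetical stand-in for the per-pixel icetray reconstruction."""
    return {"pframe": pframe}


def reco_one(args):
    """Reconstruct a single p-frame and pickle the result to its own file."""
    i, pframe = args
    outpath = Path(f"out{i:04}.pkl")
    with open(outpath, "wb") as f:
        pickle.dump(run_reco_tray(pframe), f)
    return outpath


def process_pframes(pframes, parallel_trays=4):
    """Run one worker process per tray over the batch of p-frames."""
    with Pool(processes=parallel_trays) as pool:
        return pool.map(reco_one, list(enumerate(pframes)))
```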

Member

@ric-evans ric-evans left a comment

What's the benefit of parallelizing within a client, in terms of empirical performance? Is the main lag in icetray I/O, or is it computational? I've assumed it's computational. With the ewms-pilot framework, if a single one of these "grouped" pixels causes an exception, or the pixel "group" lands on a bad worker node, then the whole group will need to be auto-re-enqueued instead of a single pixel.

@ric-evans ric-evans requested a review from dsschult March 1, 2023 22:46
@dsschult
Member

dsschult commented Mar 1, 2023

This should work. I'm just not terribly happy with doing multiple pixels per message, as it goes somewhat against the long-term vision for EWMS. I'd be happier with the ewms-pilot processing multiple messages in parallel (at least as long as the message processing time is >> 1 second, so it's actually compute-bound).

I assume it's using shared memory for the tables, and so has the potential problem of boost::interprocess screwing up the shared-memory lock? Is there any protection against that creating a black-hole node?

@ric-evans
Member

ric-evans commented Mar 1, 2023

I think parallelizing the client would be better accomplished by the ewms-pilot. Here, each client would create multiple processes, each with an icetray, for an arbitrary number of messages. This is currently in the planning stage.

@dsschult dsschult requested a review from briedel March 1, 2023 22:49
@dsschult
Member

dsschult commented Mar 1, 2023

Added @briedel as he asked for this.

@ric-evans
Member

ric-evans commented Mar 1, 2023

Worth pointing out: packing too much into a single message will eventually throttle the broker.

Similarly, this could slow down the overall performance, since we're waiting for all the pixels in a group to finish before sending the pixelrecos.

@briedel
Collaborator

briedel commented Mar 1, 2023

We are requesting 5 GB of RAM per pixel right now. This is really inefficient if we want to run this on the grid, and expensive if we run it in the cloud, and running photospline in shared-memory mode is really finicky on the client side. I'd rather have this inside icetray to keep the implementation details away from the pilot, beyond “batching” events together somehow.

@kjmeagher
Member Author

Based on the specs I got from @briedel, I identified three possible ways to accomplish parallel processing:

  1. Have skymap_scanner.client allocate the tables then call reco_icetray.reco_pixel() directly with subprocess
  2. Have skymap_scanner.client use IPC to pass multiple pframe packets to reco_icetray
  3. The solution in this PR

I went with 3 because it seemed like the easiest to implement. I don't know what ewms-pilot is, so I didn't consider it when choosing a solution. I would need more information to assess which one is best.

Also note that using subprocess for multiple trays is exactly the same as how triggered corsika works, and so far we haven't had any issues with shared-memory locks.

Also note that the current main branch calls recos.get_reco_interface_object(reco_algo) twice, which results in allocating the spline tables twice.
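One generic way to guarantee a single allocation, regardless of how many times the interface is requested, is to memoize the factory. This is only an illustration (the module path is assumed), and the thread below settles on a different fix inside the traysegment:

```python
import functools


@functools.lru_cache(maxsize=None)
def get_reco_interface_once(reco_algo: str):
    """Memoized wrapper so the spline tables behind a given reco algorithm
    are allocated at most once per process (illustrative only)."""
    from skymap_scanner import recos  # module path assumed from the discussion

    return recos.get_reco_interface_object(reco_algo)
```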

Member

@dsschult dsschult left a comment

Giving a proper review of the changes, I think the code generally looks fine. I'm not sure whether we want the default number of subprocesses to be 8 or 1, though.


```python
# loop over the pframes to run in parallel
for i, pframe in enumerate(pframes):
    outpath = Path(f"out{i:04}.pkl")
```
Member

Could this use a more unique path, either using tempfile or maybe with the parent process pid? I'm thinking of the possibility of running two instances side by side, which could happen in EWMS.
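For example, the output path could be made unique per client instance with tempfile plus the process pid; a sketch with a hypothetical prefix, not the PR's code:

```python
import os
import tempfile
from pathlib import Path

# One temporary directory per client process; the prefix embeds the pid, so two
# instances running side by side on the same node cannot collide.
OUT_DIR = Path(tempfile.mkdtemp(prefix=f"skyscan-client-{os.getpid()}-"))


def unique_outpath(i: int) -> Path:
    """Per-frame output file inside the process-unique temporary directory."""
    return OUT_DIR / f"out{i:04}.pkl"
```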

@ric-evans
Member

Also note that the current main branch calls recos.get_reco_interface_object(reco_algo) twice, which results in allocating the spline tables twice.

@kjmeagher thanks for finding this. This should be optimized to one allocation, definitely.

@ric-evans
Member

ric-evans commented Mar 2, 2023

@kjmeagher we talked this over and I think we've found a way to do this parallelization that's also friendly with message passing. If you wrap https://github.com/icecube/skymap_scanner/blob/main/skymap_scanner/client/client.py#L83-L96 with multiple async-loop calls, then the ewms-pilot will make a subprocess for each pilot. Each pilot processes a stream of messages (each with a single pixel & icetray). Note: you will need to give unique names to each pilot's in/out files, see https://github.com/icecube/skymap_scanner/blob/main/skymap_scanner/client/client.py#L73-L74.

The --parallel-trays CL arg would be given to the client instead of the server.

This solves the slow-pixel problem by removing the grouped-pixels constraint. It doesn't solve the bad-node problem or boost's potential shared-memory lock problem, but hopefully the tradeoff is worth it.

So, start_scan.py & reco_icetray.py can be unchanged from main. Though I do like your get_reco_interface_object-related optimizations.
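Roughly, the shape of that idea, with a hypothetical consume_and_reco() coroutine standing in for the message loop at client.py L83-96 (none of these names are from the actual code):

```python
import asyncio


async def consume_and_reco(tray_id: int) -> None:
    """Hypothetical stand-in for client.py's message loop: pull a pixel
    message, run the icetray reco, publish the pixelreco, repeat."""
    in_file = f"in-{tray_id}.pkl"    # unique per-tray in/out files,
    out_file = f"out-{tray_id}.pkl"  # cf. client.py L73-74
    ...  # message loop goes here


async def run_parallel_trays(n_trays: int) -> None:
    """Run n_trays message loops concurrently, one icetray each."""
    await asyncio.gather(*(consume_and_reco(i) for i in range(n_trays)))


# e.g. asyncio.run(run_parallel_trays(args.parallel_trays))
```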

@kjmeagher
Member Author

@ric-evans how does your plan propose to share memory? Will skymap_scanner.client allocate the memory? If so, wouldn't it be easier to call subprocess.Process to spawn the trays?

@ric-evans
Member

@ric-evans how does your plan propose to share memory? Will skymap_scanner.client allocate the memory? If so, wouldn't it be easier to call subprocess.Process to spawn the trays?

Right, this problem is specific to a particular reco, and we want to keep the scanner client generalized to support other styles of recos in the future. I think this is better suited to be solved when the tray is initiated, rather than in client.py. Are there shared memory tools within the photonics service we can use? Something that uses /dev/shm?
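As a generic illustration of what /dev/shm-backed sharing looks like from Python (this is the standard-library multiprocessing.shared_memory module, not photospline's own boost::interprocess mechanism):

```python
import numpy as np
from multiprocessing import shared_memory

# A table-like array placed in POSIX shared memory (backed by /dev/shm on
# Linux); another process in the same namespace can attach to it by name.
table = np.linspace(0.0, 1.0, 1000)
shm = shared_memory.SharedMemory(create=True, size=table.nbytes, name="demo_table")
shared = np.ndarray(table.shape, dtype=table.dtype, buffer=shm.buf)
shared[:] = table

# In another process:
#   shm = shared_memory.SharedMemory(name="demo_table")
#   view = np.ndarray((1000,), dtype=np.float64, buffer=shm.buf)

shm.close()
shm.unlink()  # remove the segment once every process is done with it
```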

@tianluyuan
Contributor

tianluyuan commented Mar 2, 2023

Also note that the current main branch calls recos.get_reco_interface_object(reco_algo) twice, which results in allocating the spline tables twice.

@kjmeagher thanks for finding this. This should be optimized to one allocation, definitely.

Yes, good catch. The cascade_service can be moved inside the traysegment, I think, and that should fix it.

@tianluyuan
Contributor

@ric-evans how does your plan propose to share memory? Will skymap_scanner.client allocate the memory? If so, wouldn't it be easier to call subprocess.Process to spawn the trays?

Right, this problem is specific to a particular reco, and we want to keep the scanner client generalized to support other styles of recos in the future. I think this is better suited to be solved when the tray is initiated, rather than in client.py. Are there shared memory tools within the photonics service we can use? Something that uses /dev/shm?

I think it already is using /dev/shm, and IIRC we were able to get that working with the containers. @briedel are jobs still hanging for you on the cloud?

@briedel
Collaborator

briedel commented Mar 2, 2023

They are not hanging anymore, but this is with special settings for /dev/shm so that things aren't shared between containers in a pod and the underlying node, i.e. we are turning the sharing off for all intents and purposes. I am not sure what will happen once we don't have that. As long as it is inside a container it should be fine, but if it is inside a pod (multiple containers) I am not sure.

@kjmeagher
Member Author

Right, this problem is specific to a particular reco, and we want to keep the scanner client generalized to support other styles of recos in the future. I think this is better suited to be solved when the tray is initiated, rather than in client.py. Are there shared memory tools within the photonics service we can use? Something that uses /dev/shm?

I don't know much about the internals of photospline, so maybe @cnweaver or @jvansanten could say if there is a way to access the same instance in photospline in multiple different processes.

@cnweaver

cnweaver commented Mar 2, 2023

if there is a way to access the same instance in photospline in multiple different processes

Yes, that was the entire point of the photospline v2 rewrite of the evaluation interface. I3PhotoSplineService should do this automatically, which is what was causing the hangs in containers, which are now fixed only by configuring the containers to have not-actually-shared shared memory. This is something that containers pretty much are designed to make difficult, so if people insist on using them, I don't think there's anything we can do/improve from the photospline/photonics side (in terms of enabling sharing; there are some race conditions which are exceedingly difficult to fix and can also cause hangs.)

@ric-evans
Member

if there is a way to access the same instance in photospline in multiple different processes

Yes, that was the entire point of the photospline v2 rewrite of the evaluation interface. I3PhotoSplineService should do this automatically, which is what was causing the hangs in containers, which are now fixed only by configuring the containers to have not-actually-shared shared memory. This is something that containers pretty much are designed to make difficult, so if people insist on using them, I don't think there's anything we can do/improve from the photospline/photonics side (in terms of enabling sharing; there are some race conditions which are exceedingly difficult to fix and can also cause hangs.)

Thanks, @cnweaver. So, as long as we're running our multiple processes within the same container, will photospline take care of shared memory between these processes?

@cnweaver

cnweaver commented Mar 2, 2023

I think that should work, provided the container-isolated shared memory is turned on. However, I don't think anyone has yet tried it.

@tianluyuan
Contributor

They are not hanging anymore, but this is with special settings for /dev/shm so that things aren't shared between containers in a pod and the underlying node, i.e. we are turning the sharing off for all intents and purposes. I am not sure what will happen once we don't have that. As long as it is inside a container it should be fine, but if it is inside a pod (multiple containers) I am not sure.

Does Python multiprocessing somehow get around this containerization? My impression was that it wouldn't, but I don't have a sense of how pods work on the cloud.

@ric-evans
Member

Using ewms-pilot's multitasking will enable parallelism while retaining the paradigm of loosely-coupled pixels; see #194.

@ric-evans ric-evans closed this May 11, 2023
@ric-evans ric-evans deleted the parallel_trays branch September 13, 2023 14:52