Is it possible to derive the list of currently running actors from the performance database? #273
Comments
You're not the first to ask 😄 The direct answer is no. The profiling subsystem measures things that libmuscle does, and computing things isn't part of that, so whether something is running needs to be inferred from the fact that it isn't doing anything else. Mostly, that would be waiting to receive a message. That's recorded as an event, but the record is only complete once we actually receive the message, so the manager doesn't know about this while it's going on. Also, records are saved up for a while and sent in batches in the background to reduce the performance impact, which delays things further.

I think what we want to have is some kind of monitoring system, but monitoring isn't the same as profiling. The latter collects exhaustive data for analysis after the fact, while the former aims to give the user a real-time look at what's going on. Since users aren't very fast compared to CPUs, monitoring can sample, and skip or summarise some data, in particular when things are going very quickly.

We do have a remote logging system, through which log messages are sent from the instances to the manager at least if Of course, this would still be limited to monitoring things that libmuscle does. It would probably be nice to have effectively a kind of That's all doable, but a fair bit of work. We should probably investigate whether there are existing performance monitoring tools that can do this, and whether we can just integrate better with them. I'm not aware of there being many open source options in this space though, and existing tools may not work so well with complex coupled simulations, so it may still be worth it.
Thanks for the explanation. I think a good starting point might be a CSV file that is appended to every <user_selected_time> seconds, with an entry for each actor containing a character that indicates whether it is waiting, sending, receiving, or running.
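A minimal sketch of what such a file writer could look like, in Python. Everything here is an assumption for illustration: `append_status_row`, the `STATUS_CODES` mapping, and the idea that some caller already knows each actor's state are hypothetical and not part of libmuscle or the manager.

```python
import csv
import time
from pathlib import Path

# Hypothetical one-character status codes; not defined by libmuscle.
STATUS_CODES = {
    "waiting": "W",
    "sending": "S",
    "receiving": "R",
    "running": "X",
}


def append_status_row(path, actor_states, timestamp=None):
    """Append one sample: a timestamp plus one status character per actor.

    actor_states maps actor name -> one of the keys of STATUS_CODES.
    The header row is written only when the file is new or empty.
    """
    path = Path(path)
    actors = sorted(actor_states)
    write_header = not path.exists() or path.stat().st_size == 0
    ts = time.time() if timestamp is None else timestamp

    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["time"] + actors)
        writer.writerow([ts] + [STATUS_CODES[actor_states[a]] for a in actors])
```

A caller would invoke this once per sampling interval, for example from a loop that sleeps for the user-selected number of seconds between calls. The sketch assumes the set of actors stays the same between samples, so that the columns line up with the header.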
Oh, that's a nice idea. Then you can
See also #171.
Is it possible to derive the list of currently running actors from the performance database?
This could be useful if there are longer running actors ...
If so, it would be good to also know when it started running this time (or the run time so far), and, perhaps, the average of previous run times.
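To make the requested numbers concrete, here is a small Python sketch of how the run time so far and the average of previous run times could be computed once start times and completed durations are available. The function `run_summary` and its inputs are hypothetical; as discussed in the comments, the performance database does not directly provide the start of a run that is still in progress.

```python
import time
from statistics import mean


def run_summary(current_start, previous_durations, now=None):
    """Return (run time so far, average of previous run times).

    current_start and now are timestamps on the same clock, in seconds.
    The average is None when there are no completed runs yet.
    """
    now = time.monotonic() if now is None else now
    elapsed = now - current_start
    average = mean(previous_durations) if previous_durations else None
    return elapsed, average
```

Comparing the two values would show whether the current run is taking unusually long relative to earlier ones.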