"KeyError: ..." when running "muscle3 profile --instances ..." (version 0.7.1) #274
Comments
It seems that there's no Maybe we could have the launcher record when processes are started and when they stop, cleanly or through a crash. Then unless the manager itself crashes (I'm sure you'll find a way 😃), the data should be there.
The run finished cleanly: Simulation completed successfully. These are the messages from init in the log file:
(muscle3_venv) <g2dpc@s53 ~/GIT/ets_paf>grep ^init run_ETS_MAIN_ENCAPSULATED_WITH_ETS_INIT_20231027_094921/muscle3_manager.log
I think I've reproduced this on another model I'm working on, that similarly has a component that initialises a state, dispatches it to the rest of the simulation and then quits. That should make it easier to figure out what's going on. This model is getting closer to being done, after which I should have a bit more time to work on MUSCLE3 again...
I've found the issue: the problem is that if you don't have any incoming ports, then you don't have to wait for any predecessors to tell you whether they will send another message or not. So a component like your

There's a symmetrical situation with components that have no outgoing ports, which don't need to do a

So now the question is what to do. Do we make these events optional and only record them if they really happened? Or do we always record them because we're still moving through this stage of the shutdown even if there is no work to do? At least it should be consistent, and for the former the profiler needs to be updated to not crash if the events are missing.

I think it's nicer if they're always there, that makes the database format less flexible and thus makes it easier to write analysis code. And the profiler should be able to deal with missing events anyway, because there are now old files that don't have them, and because an instance can crash and not generate this event.
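To illustrate what "deal with missing events" could look like on the profiler side, here is a minimal sketch. It assumes, going only by the traceback in the report, that `start_run` and `stop_run` are dicts mapping instance names to nanosecond timestamps. The helper name `total_run_times` is hypothetical and is not the actual libmuscle code or the fix that was released.

```python
from typing import Dict, Iterable, Optional

# The traceback multiplies timestamp differences by 1e-9,
# which suggests nanosecond timestamps converted to seconds.
NS_TO_S = 1e-9


def total_run_times(
        instances: Iterable[str],
        start_run: Dict[str, int],
        stop_run: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Return the run time in seconds per instance (hypothetical helper).

    An instance whose start or stop event is missing gets None
    instead of raising a KeyError.
    """
    totals: Dict[str, Optional[float]] = {}
    for name in instances:
        start = start_run.get(name)
        stop = stop_run.get(name)
        if start is None or stop is None:
            # Event was never recorded, e.g. an old database file
            # or an instance that crashed before emitting it.
            totals[name] = None
        else:
            totals[name] = (stop - start) * NS_TO_S
    return totals


# Example data (made up): 'init' has no entry in stop_run.
example = total_run_times(
    ['init', 'macro'],
    {'init': 1_000_000_000, 'macro': 2_000_000_000},
    {'macro': 5_000_000_000})
```

With this example data, `example['init']` is `None` and `example['macro']` is about 3 seconds. Plotting code would then have to decide whether to skip such instances or show them as incomplete.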
Released with 0.7.2. Please give it a try, and could you close this issue if it works for you?
When attempting to analyse a completed muscle3 run using
muscle3 profile --instances $RUN/performance.sqlite
I get
Traceback (most recent call last):
  File "/gss_efgw_work/scratch/g2dpc/GIT/ets_paf/UQ/muscle3_venv/bin/muscle3", line 8, in <module>
    sys.exit(muscle3())
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/muscle3/muscle3.py", line 70, in profile
    plot_instances(Path(performance_file))
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/muscle3/profiling.py", line 26, in plot_instances
    instances, compute, transfer, wait = db.instance_stats()
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/libmuscle/manager/profile_database.py", line 135, in instance_stats
    total_times = [(stop_run[i] - start_run[i]) * 1e-9 for i in instances]
  File "/gw/switm/muscle3/0.7.1/intel/2020/lib/python3.10/site-packages/libmuscle/manager/profile_database.py", line 135, in <listcomp>
    total_times = [(stop_run[i] - start_run[i]) * 1e-9 for i in instances]
KeyError: 'init'
where 'init' is the name of one of the actors (the first to be executed). I have hacked profile_database.py
(first and last lines are from the original code). This at least allows the code to run, and I get the message:
init not in stop_run
What I've done is a horrible hack, but I haven't been able to find why 'init' is not in stop_run ...
I'm not sure whether anybody else has experienced this problem.