Restartable models #111
base: main
Conversation
@afedynitch You said that you are looking into fixing cross-sections for the models. I added a new test.

From my side, this PR is ready for review; I need your help to fix the remaining things. Tests are failing largely because of the new tests for rng_state persistence that I added (that is unrelated to my changes).

At least one of the tests is stalling, both on my own computer and on CI. I haven't figured out which one yet. This means you have to use a keyboard interrupt to complete the tests at the moment. On Windows the tests abort with a memory error; these things could be related, although I did not notice a large increase in RAM on my computer. Parallel computation works differently on Windows and Unix, so it could well be that there is a mistake which Windows reveals more drastically. Edit: Stalling is now fixed, and the issue is well understood.

The test which runs forever is test_to_hepmc3.py; I am investigating.
return CrossSectionData(total=stot, elastic=sela, inelastic=stot - sela)

assert kin.p1.is_hadron
assert kin.p2.is_hadron
Allowed targets are _projectiles, but in reality it's only Nuclei(), proton, and neutron.
The asserts are just executable comments; I put them here to remind myself that we only deal with hadrons at this point in the function. The nuclear targets are handled by the part above.
src/impy/models/dpmjetIII.py (outdated)

# TODO set more cross-sections

def _set_kinematics(self, kin):
    # Save maximal mass that has been initialized
    # (DPMJET sometimes crashes if higher mass requested than initialized)

    # Relax momentum and energy conservation checks at very high energies
    if kin.ecm > 5e4:
Can't do that without setting it back. Like this, the output will depend on the sequence of requested energies, since energy-momentum conservation checks can reject events. Either this piece has to stay in init, or one needs to save the default values and reset them if the kinematics change.
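A minimal sketch of that save-and-restore pattern. The attribute names `_lib` and `tolerance` are placeholders, not the real DPMJET interface; `DummyLib` stands in for the wrapped Fortran module:

```python
class DummyLib:
    """Stand-in for the wrapped Fortran module (hypothetical)."""

    tolerance = 2.0  # pretend library default for the conservation checks


class Model:
    def __init__(self):
        self._lib = DummyLib()
        self._default_tolerance = None  # library default, cached on first use

    def _set_kinematics(self, ecm):
        # Cache the library default the first time we touch it.
        if self._default_tolerance is None:
            self._default_tolerance = self._lib.tolerance
        if ecm > 5e4:
            # Relax the checks at very high energies ...
            self._lib.tolerance = 100 * self._default_tolerance
        else:
            # ... and restore the default otherwise, so the output does not
            # depend on the sequence of requested energies.
            self._lib.tolerance = self._default_tolerance
```

With this, requesting a high energy and then a low one leaves the library back at its default, instead of silently keeping the relaxed setting.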
It cannot be in init.
Ah, I see what you mean, but it does not really matter, I think. We need to use MCRunRemote which always uses a fresh copy of dpmjet when _set_kinematics is called, so it is always called only once per dpmjet instance.
I tried to fix this a bit, but you are the DPMJet expert, so I am relying on your help.
@@ -160,9 +158,9 @@ def _set_kinematics(self, kin):
self._lib.dt_init(
    -1,
    max(kin.plab, 100.0),
Also, the maximal initialization energy cannot be exceeded, so it has to be saved as well.
dt_init can only be called once, right? At least that was my impression from observing the code. MCRunRemote makes it so that _set_kinematics is called exactly once per instance.
super().__init__(seed)

def _once(self, ecm_max=1000 * TeV):
    from impy import debug_level
Does this imply that the debug level cannot change at runtime? It is important if you search for something that goes wrong at event 12345 but don't want to see the output of the 12344 preceding events.
Another reason to keep it changeable at runtime is that different debug levels in the models can produce extremely different amounts of output.
I don't think that this functionality was there before, consistently. So it might be no problem when using this "remote run".
N-th edit: If I change the debug level via impy.debug_level in a notebook or Python script and do a second initialization, it won't be reflected here. Something like this would be great:
impy.debug_level = 0
m1 = Epos() # do something
impy.debug_level = 5
m2 = Phojet() # do something different
I agree that being able to change debug levels for a model instance is good, but the API we have so far (also before my changes) does not allow this.
If you pass the debug level to fortran in a function that is called only at init, then you cannot change the debug level after init. However, we should be able to access the variable that is used in the fortran code to enable debug output, and then you can change it at any time.
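For illustration, one way to expose that variable is a property that forwards to the wrapped module, so the level can be changed at any time. `DummyLib.debug` below is a made-up stand-in for whatever flag the Fortran code actually uses, not the real impy interface:

```python
class DummyLib:
    """Stand-in for an f2py-wrapped module with a debug flag (hypothetical)."""

    debug = 0


class Model:
    def __init__(self, debug_level=0):
        self._lib = DummyLib()
        self._lib.debug = debug_level  # initial value, set once at init

    @property
    def debug_level(self):
        # Read straight from the wrapped library ...
        return self._lib.debug

    @debug_level.setter
    def debug_level(self, level):
        # ... and write back at any time, not only at init.
        self._lib.debug = level
```

Assigning `model.debug_level = 5` between events would then take effect immediately, without re-initializing the model.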
Regarding this:
impy.debug_level = 0
m1 = Epos() # do something
impy.debug_level = 5
m2 = Phojet() # do something different
I thought this works; this is why debug_level is imported freshly from top-level impy (and to avoid a circular import issue). Did you test it and find that it does not work? It does not work if you call Epos again instead of Phojet, but it should work if it is two different models, both created for the first time.
if self._lib.pho_event(-1, *self._beams)[1]:
    raise RuntimeError(
        "initialization failed with the current event kinematics"
    )
Nope, this is wrong. pho_event(-1, ...) has to be called once, for the highest energy requested for a run. Different combinations of particles and energies are set with pho_setpar. Once event generation is triggered (pho_event(>0, ...)), PHOJET runs the remaining initializations. So pho_event(-1, ...) only once.
Can we call it once with a maximum energy that is never exceeded, like 1000 * TeV?
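The shape of that idea, sketched with placeholder names. In the real model the one-time step would be the pho_event(-1, ...) call and the per-event settings would go through pho_setpar; everything below is a simplified stand-in, and the guard against exceeding the initialization energy is an assumption:

```python
class Model:
    def __init__(self, ecm_max=1_000_000.0):  # e.g. 1000 * TeV, in GeV
        self._ecm_max = ecm_max
        self._init_count = 0

    def _once(self):
        # The expensive initialization (pho_event(-1, ...) style) would go
        # here; it must run exactly once, at the maximum energy.
        self._init_count += 1

    def set_kinematics(self, ecm):
        # Never allow an energy above what we initialized for.
        if ecm > self._ecm_max:
            raise ValueError("requested energy exceeds initialization energy")
        if self._init_count == 0:
            self._once()
        self._ecm = ecm  # per-event settings (pho_setpar style) go here
```

Lower energies can then be set freely per event, while the expensive initialization happens only on the first call.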
def _cross_section(self, kin=None):
def _cross_section(self, kin):
    if (kin.p2.A or 1) > 1:
It's really that bad at nuclei :)
I don't know, perhaps you have to query the cross-section from another variable. At some point I dereference a nullptr in the calls when nuclei are used. I didn't check what exactly happens.
def __init__(self, evt_kin, *, seed=None):
    super().__init__(seed)

def _once(self):
    from impy import debug_level
Same here: no change at runtime is an unnecessary design limitation.
I agree, but I did not design the library this way; it was already like that.
_version = "2.1"
_library_name = "_sib21"

class Sibyll23(SIBYLLRun):
    # For some reason, Sibyll23 requires MCRunRemote, but the others don't
Does this mean that only Sibyll23 crashes on multiple inits? In SIBYLL, the initialization should not be called multiple times, even if it works. If I remove the check in PHOJET, it will also work but will result in undefined or altered behavior.
Yes, you can try it out. Replace MCRunRemote with MCRun further below as base class for Sibyll23 and run the tests. I don't understand why, so it is good if someone else checks this.
An attempt to close #106
This branch is based on #112, which should be merged first.
This is using two approaches. For most models, it is sufficient to run some initialization code exactly once, which I ensure with the new _once private method. Concrete models should not implement __init__ anymore, only _once, which is called by MCRun.__init__ exactly once, after the underlying implementation has been initialized.

For Phojet and DPMJet, this does not work (at least I could not get it to work), so I followed the idea expressed in #106. I wrote a new base class, MCRunRemote, which is a brilliant and terrible piece of engineering. Behind the scenes, it runs every call to cross_section and __call__ through a remote process which is created and dies just for that call. The beauty of it: a model which requires this workaround just has to inherit from MCRunRemote instead of MCRun; no other change is required!

To maintain the illusion of having a single model, and not one that is restarted all the time, I rely strongly on the random-state serialization feature that @jncots added. I further improved this so that the PRNG state of the numpy generator is also saved. Since this is so important, I added more tests that check the RNG state serialization and found some issues. @jncots could you please look into this?
Other changes
- MCRun.get_stable returns the set of particles that were changed to stable.
- Kinematics are passed to MCRun.__call__ instead of MCRun.__init__; this makes more sense for models that are restartable. Some models need a maximum energy upfront, which can be passed via init.
- The MCRun.kinematics property was removed. The kinematics should not be part of the state of the generator; we pass them explicitly to the two functions which need them, MCRun.__call__ and MCRun.cross_section. Less class state is generally better: it simplifies the design and the reasoning about it. It does not matter that the underlying Fortran stores the kinematics, because we abstracted that away in our high-level API.
- MCRun.random_state now includes the state of the numpy PRNG.
- MCRun._composite_plan now generates heavy elements first, as a workaround for DPMJet.
- MCRun.set_unstable was renamed to MCRun.maydecay.
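Capturing and restoring the numpy generator state, as the random_state feature relies on, works along these lines with the public numpy API:

```python
import numpy as np

rng = np.random.default_rng(42)
state = rng.bit_generator.state   # plain-dict snapshot, easy to serialize
a = rng.random(3)                 # advance the generator
rng.bit_generator.state = state   # restore the snapshot
b = rng.random(3)                 # replays exactly the same draws
```

The snapshot is a plain dict, so it can be pickled alongside the model's Fortran state, which is what makes the restart-behind-the-scenes approach reproducible.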