
Take into account covariance matrix for daemonflux parameters #766

Merged
marialiubarska merged 9 commits into master from covar on Apr 23, 2024

Conversation

marialiubarska
Contributor

Adding support for penalty term calculations that take into account the daemonflux parameter covariance matrix. Thanks to the daemonflux developers, we are able to use the prior penalty calculation from the daemonflux package. This value is stored in a daemon_chi2 parameter, which is created and updated inside the daemonflux stage (no need to define anything in config files).

IMPORTANT: you must ensure that the daemonflux version is >= 0.8.0, otherwise the chi2 penalty calculation is not correct!
I have added a requirement to setup.py, but also had to add a check inside the stage itself in case a user does not reinstall their pisa after pulling the changes. This also adds a dependency on packaging, but I don't think it's a big issue.
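A minimal sketch of the kind of runtime guard described above (the exact message and its placement inside the stage are assumptions):

from importlib.metadata import version as pkg_version
from packaging.version import Version

# Fail early if the installed daemonflux predates the corrected chi2 penalty
if Version(pkg_version("daemonflux")) < Version("0.8.0"):
    raise ImportError(
        "daemonflux >= 0.8.0 is required for a correct chi2 penalty calculation"
    )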

If anyone has comments/suggestions let me know!

PLEASE DO NOT MERGE THIS PR BEFORE INITIALLY ASSIGNED REVIEWERS HAVE A CHANCE TO TAKE A LOOK.

Maria Liubarska and others added 5 commits April 11, 2024 13:49
…t covariance matrix (using chi2 penalty calculated by daemonflux package)
…pendencies (used to check version inside the stage)
…nstall their pisa. Had to add packaging dependency for this
@@ -862,6 +862,33 @@ def __init__(self, *args):
# I think because the changed units are cached in the object (Philipp)
self.normalize_values = True

# store list of daemonflux params names for prior penalty calculation
self._daemon_names = ['K_158G',


To ensure future compatibility, it would be good to read the daemonflux parameter names from Flux().params.known_parameters instead of a hard-coded list. If, for some reason, you want to keep a fixed parameter list, you may want to add a check that what is returned by Flux().params.known_parameters matches what you expect.
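A sketch of the suggested consistency check; expected_names stands in for the hard-coded list in the stage (only one entry is shown here):

from daemonflux import Flux

expected_names = ["K_158G"]  # ... plus the rest of the hard-coded list
actual = set(Flux().params.known_parameters)
missing = set(expected_names) - actual
if missing:
    raise ValueError(f"daemonflux no longer provides parameters: {missing}")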


To replace the names, use simple string ops:

self._daemon_names = [
    p.replace("pi+", "pi").replace("pi-", "antipi")
     .replace("K+", "K").replace("K-", "antiK")
    for p in Flux().params.known_parameters
]

marialiubarska (Contributor, Author)


The idea was to limit the daemonflux dependency to the corresponding stage, which is why I wanted to avoid calling it directly in the ParamSet class (and also why the chi2 value is stored in a parameter inside the stage). But I had to add daemonflux to the required packages anyway for automatic installation to work, and now to restrict the version, so maybe there is no harm in doing it this way.
I will look at this tomorrow.

I didn't think about renaming the parameters automatically, that's a good suggestion, thank you!

marialiubarska (Contributor, Author)


Maybe another solution, which avoids storing the parameter list altogether, is to just add a prefix to all daemonflux parameter names (since we have to rename them anyway to remove symbols); then I can just look for that prefix as a condition to skip these parameters... (see the sketch below)
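A hypothetical sketch of the prefix idea; the "daemon_" prefix and the renaming rules are illustrative, not the final implementation:

DAEMON_PREFIX = "daemon_"

def to_pisa_name(daemon_name):
    # Strip symbols that PISA cannot handle, then tag with the prefix
    cleaned = (daemon_name.replace("pi+", "pi").replace("pi-", "antipi")
                          .replace("K+", "K").replace("K-", "antiK"))
    return DAEMON_PREFIX + cleaned

# Code that must skip the correlated daemonflux params can then filter on
# the prefix instead of keeping a hard-coded list:
names = [to_pisa_name("pi+_158G"), "dom_eff"]
non_daemon = [n for n in names if not n.startswith(DAEMON_PREFIX)]
assert non_daemon == ["dom_eff"]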

marialiubarska (Contributor, Author)


Maybe I should just do that.

Contributor


Yeah, not having a hard-coded list in param.py would be better.

@afedynitch

I also have some general comments on the implementation. Since daemonflux already interpolates, there is no need to make flux maps every time the parameters change. You can call daemonflux directly (see below).

If for some reason it has to happen this way, there is at least no need to loop over angles, because the flux function supports array input for energy and zenith.

import numpy as np
from daemonflux import Flux

fl = Flux(location="IceCube")
cos_zen = np.arange(1, -1.01, -0.1)
egr = np.logspace(1, 5, 40)
print(egr.shape, cos_zen.shape, fl.flux(egr, np.degrees(np.arccos(cos_zen)), "numu").shape)
> (40,) (21,) (40, 21)

# I think you should call the flux function directly;
# there is no need to do the interpolation and then reread the interpolation.
# Daemonflux already interpolates. And it's "relatively" fast.
%timeit fl.flux([50.], 125., "numu")
> 353 µs ± 584 ns per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

@marialiubarska
Contributor Author


If I remember correctly, the reason it was done this way was that we only need to produce one flux value for each pair of corresponding E and coszen values, so after consulting with @jpyanez we decided to re-interpolate and then use the maps. I can look at this again; maybe there is a way to improve this.

@afedynitch

This is exactly what the call fl.flux([50.], 125., "numu") does: it calls daemonflux for one value pair. As I said, it already interpolates, so there is no need to interpolate again. (A sketch of applying this to matched energy/angle pairs follows.)
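A sketch of evaluating the flux for matched (E, cos zen) pairs directly, looping only over the unique angles; it assumes fl.flux accepts an energy array together with a scalar zenith angle, as in the examples above:

import numpy as np
from daemonflux import Flux

fl = Flux(location="IceCube")

e = np.array([10., 50., 200.])    # GeV, one per event
cz = np.array([0.3, 0.3, -0.5])   # cos(zenith), one per event

flux = np.empty_like(e)
for czi in np.unique(cz):
    mask = cz == czi
    zen_deg = np.degrees(np.arccos(czi))
    # ravel() tolerates either a (n,) or (n, 1) return shape
    flux[mask] = np.ravel(fl.flux(e[mask], zen_deg, "numu"))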

@anilak41 requested a review from JanWeldert April 12, 2024 13:19
@JanWeldert
Contributor

@anilak41 if these changes work with your setup, I think the PR would be ready to merge.

Contributor

Using this branch, I tried to calculate the pull penalty, and it seems to give a number as expected. However, I have the following observations/concerns:

  • There are two other functions, params.priors_chi2 and params.priors_llh. I believe their outputs are supposed to agree with params.priors_penalty(metric='mod_chi2'). However, those two still give the priors without incorporating the covariance matrix of daemonflux.

  • params.priors_penalty(metric='llh') gives the same output as params.priors_penalty(metric='mod_chi2') for modifications to daemonflux parameters. If I modify other parameters, let's say dom_eff, then params.priors_penalty(metric='llh') gives me a value which is half of params.priors_penalty(metric='mod_chi2'), as expected. I think this can cause an issue if someone performs minimization using llh as the metric. Therefore, we need to introduce a factor of 1/2 if the metric is llh.

  • If I modify the values of the daemonflux parameters in params, then params.priors_chi2 and params.priors_llh give me updated priors immediately. But params.priors_penalty(metric='llh') gives me an updated value only after I run template_maker.get_outputs(return_sum=True); otherwise the old value is given. If someone loads a fitted paramset and tries to calculate params.priors_penalty without generating a template via get_outputs, the contribution from the daemonflux covariance matrix will be missing while the other parameters still contribute to the prior penalty. This can cause confusion and go unnoticed, leading to a wrong prior penalty.

Contributor

@JanWeldert The calculation is OK when the metric is chi2, but the inconsistencies mentioned in the previous message can cause confusion and issues. More specifically, if someone tries to use LLH, they can get wrong numbers.

Contributor

Right, these are good points.

  • You could modify params.priors_chi2 and params.priors_llh in the same way as params.priors_penalty, but we could also think about removing those functions completely, since they just call the prior penalty function of the Param class anyway. params.priors_penalties will also ignore the correlation, so it would be good to exclude the daemon params there too.
  • Here you can simply check if metric in LLH_METRICS: and multiply the chi2 value by 0.5 if so. You can also make sure that metric in CHI2_METRICS otherwise (see the sketch after this list).
  • Not sure if I understand that point. If you want to update the daemonflux chi2, you have to run the distribution maker, since it is calculated in the flux stage. The individual contributions of the daemonflux parameters are ignored by params.priors_penalty (as they should be), but they are currently taken into account by params.priors_chi2 and params.priors_llh because those don't know about the correlation (your first point).
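A sketch of the metric handling described in the second point; LLH_METRICS and CHI2_METRICS are assumed to be importable from pisa.utils.stats, and the surrounding function is illustrative:

from pisa.utils.stats import LLH_METRICS, CHI2_METRICS

def daemon_penalty(daemon_chi2, metric):
    # llh-type metrics carry half the chi2 penalty (see discussion above)
    if metric in LLH_METRICS:
        return 0.5 * daemon_chi2
    if metric in CHI2_METRICS:
        return daemon_chi2
    raise ValueError(f"unhandled metric: {metric}")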

marialiubarska (Contributor, Author)

@anilak41 Thanks for pointing out the issue with the LLH conversion, I added a fix for it.

@anilak41 anilak41 self-requested a review April 18, 2024 14:17
@marialiubarska marialiubarska merged commit b76e82d into master Apr 23, 2024
2 checks passed
@marialiubarska marialiubarska deleted the covar branch April 23, 2024 22:07