Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/ehinkle geo resource charge coord fix #86

Merged
merged 20 commits into from
Dec 16, 2023

Conversation

edhinkle
Copy link
Member

@edhinkle edhinkle commented Dec 1, 2023

Updates to proto_nd_flow Geometry module in order to remove hard-coded dependencies on outdated geometry, change dataset and method names which reference outdated coordinate systems, and update/add information in the output geometry_info dataset for analyzers working with flow files.

Elise Dianne Hinkle and others added 14 commits November 7, 2023 07:22
…coordinates e.g. pixel_xy, anode_z, and get_z_coordinate.
…te_regions methods of proto_nd_flow/resources/geometry.py.
…o geometry_info flow dataset in proto_nd_flow.
…ounds, module_RO_bounds, max_drift_distance, and cathode_thickness attrs and remove drift_regions attr.
…ng Geometry module testing infrastructure.
@edhinkle
Copy link
Member Author

edhinkle commented Dec 1, 2023

Full description of changes made is here.

@edhinkle
Copy link
Member Author

edhinkle commented Dec 1, 2023

For validation:

  • Used my updated Geometry module to flow 5 MiniRun4 larndsim files
  • Printed out new attributes during flow to check values
  • Compared validation plots created from my flowed files to those from the corresponding MiniRun4 flow files (binning in validation script changed slightly to account for binning issue in y position histogram)
  • Used small flow module to test in_fid() method

Validation plots for my flowed files ("UPDATED" validations vs. "original" validations in file names) and the corresponding MiniRun4 flow files are attached. In order to run the in_fid() test flow module above, navigate to ndlar_flow/scripts/proto_nd_scripts/ within feature/ehinkle_geo_resource_charge_fix and run run_teo_test_fix.sh on a file flowed using the geometry updates. You can either flow the file yourself or use the files at /global/cfs/cdirs/dune/users/ehinkle/nd_prototypes_ana/flow_tests/MiniRun4_LARNDSIM_test_files/FLOW_OUTPUT_GEO_FIX on NERSC (assuming they're accessible to others).

Validation Plots:
MiniRun4_1E19_RHC.flow.00000.FLOW_original_validations.pdf
MiniRun4_1E19_RHC.flow.00000.FLOW_UPDATED_validations.pdf
MiniRun4_1E19_RHC.flow.00010.FLOW_original_validations.pdf
MiniRun4_1E19_RHC.flow.00010.FLOW_UPDATED_validations.pdf
MiniRun4_1E19_RHC.flow.00231.FLOW_original_validations.pdf
MiniRun4_1E19_RHC.flow.00231.FLOW_UPDATED_validations.pdf
MiniRun4_1E19_RHC.flow.00598.FLOW_original_validations.pdf
MiniRun4_1E19_RHC.flow.00598.FLOW_UPDATED_validations.pdf
MiniRun4_1E19_RHC.flow.00824.FLOW_original_validations.pdf
MiniRun4_1E19_RHC.flow.00824.FLOW_UPDATED_validations.pdf

@edhinkle edhinkle requested a review from YifanC December 1, 2023 17:57
@YifanC
Copy link
Collaborator

YifanC commented Dec 8, 2023

Hi Elise @edhinkle , Thank you for updating the geometry! In general they look good. I only have some minor comments. Feel free to disregard any of these.

I realised that you also changed some units within ndlar_flow [mm] -> [cm] for hits starting from the drift distance. In general it is good, but unit conversion is an additional step that needs to be carried out all the way. The inconsistency persists with the larpix geometry in [mm] and drift velocity modeling in [mm]. I wonder if it's easier to do the conversion only in the very end, so the unit conversion is only carried out once? (I actually don't know why hit and raw hit is calling the same thing twice... Need to look into it more carefully.) If we keep the [cm] as you proposed here, I wonder if we should change larpix geometry and drift velocity modeling in the next iteration. My concern is the future development will need to carry this unit conversion in the mid-way. Is it obvious for the future developer where to add it? However, the updates itself is consistent and I don't think it has any problems.

For this block, I wonder if it should go into a different function? Either way is fine. I think it's quite separate from the first half, and just uses the product of it. I also see that if you separate it, then you will need to load two functions, and both parts are indeed the charge properties.

Perhaps the 999999999 can be switched to something like np.finfo(self.pixel_coordinates_2D_dtype).max?

Another thing I didn't understand is this part. The use of anode_drift_coordinate.

Again, the modification seems consistent to me. Most of what I said is cosmetic than actually useful.

@edhinkle
Copy link
Member Author

edhinkle commented Dec 8, 2023

@YifanC -- thank you for reviewing the branch and for these comments! I will address them one by one:

1.) [units] -- I agree that the unit situation in this branch is not ideal. The reason that I decided to convert starting at drift_distance was to keep consistency between the units of Geometry datasets/attributes ([cm]) and the units of all inputs to Geometry class methods used elsewhere in flow (e.g. get_drift_coordinate(), which has drift_distance as an input parameter). I would advocate to keep the changes I've suggested and then change the LArPix geometry file and LArData Resource in future versions of proto_nd_flow to also use [cm]. I believe all conversions between [mm] and [cm] are now isolated to a.) inside the Geometry resource source code where the LArPix geometry is read out and b.) any time a method/attribute from the LArData resource is called in another flow module and outputs a quantity with distance units/ units which involve distance units (e.g. velocity, volume, etc.).

2.) [this block -> different function? ] -- This is a good point. The main reason I didn't separate this block into another method was that this block requires loading the geometry file to get mod_centers, and I didn't want to load that in two separate methods. I also wanted to take advantage of the fact that I was already looping through modules in _load_charge_geometry (most of the linked block is also within this loop over modules). I don't have a sense as to how much these choices actually help computationally. If they don't make much of a difference, I think making the block into a separate method makes sense. Do you have a sense as to how the improved readability of putting this block in a different method compares to any potential computational advantages to keeping everything in a single method?

3.) [999999999] -- I agree. I'll make the change you suggested.

4.) [anode_drift_coordinate in this part] -- Looking back, I agree that this section is unintuitive. I'll try to explain here. This block is in the middle of a loop through io_groups associated with a particular module (it's also inside a loop through modules). anode_drift_coordinate takes a tile_id and spits out the corresponding drift coordinate (x) of the associated anode, so lines 604-605 gets the drift (x) coordinate of the relevant anode for the io_group. The rest of the block is trying to deal with the fact that each module has two anodes/io_groups, but the order in which the io_groups are looped does not necessarily map to min and max drift/x coordinates. Therefore, for the first io_group in the loop, I assign both min_x and max_x to have the value of the anode_drift_coordinate, and for the second io_group in the loop, I check the new anode_drift_coordinate against the previous anode_drift_coordinate ( which is saved in both min_x and max_x at this point) to determine the true min_x and max_x for this module. If you have any suggestions on how to streamline and/or clarify this section, please let me know. Either way, I can additional commenting/explanations to this section.

Thanks again for your comments!

@YifanC
Copy link
Collaborator

YifanC commented Dec 15, 2023

Hi Elise, Thanks for the comments.

  1. I think I'm fine with the change, but hope it is documented (note or wiki) that the unit conversion happens at which stage (not a requirement).
  2. Another reason I made this comment: this blocks redefines io_group, io_channel, chip_id and channel_id which may be confusing.
  3. I didn't see the commit to address this. Maybe I missed it.
  4. You could use "drift direction" of the tile. If the drift of the tile is +1, then the position should be min_x; and if the drift of the tile is -1, then the position should be max_x. (Please double check). What you have works, so this is just a suggestion.
  5. An extra: Perhaps it's better to loop through module instead of io_group here. Because for nd module, there will be multiple io_groups per tpc, then this version wouldn't work (there's something else in this script is 2x2 specific and I intend to change... will get to it one day= =). You could loop through the modules (from "module_to_io_groups" in the detector yaml for example), and use one tile id per tpc from the "tile_map" for example.

Elise Dianne Hinkle added 2 commits December 15, 2023 11:11
…rge_geometry method in proto_nd_flow Geometry module.
@edhinkle
Copy link
Member Author

Hi Yifan! Thanks for the follow up. Here are some notes on changes I made in response to your comments:

  1. Sounds good. I'll look into making sure this is documented somewhere.
  2. Makes sense. I broke out the module_RO_bounds portion into a separate helper function, _get_module_RO_bounds() (see recent commit).
  3. See recent commit.
  4. I decided not to do this because it would add multiple additional lines of code (especially given the ND-LAr case of multiple io_groups per TPC) without additional functionality. I tried to put more comments around the block to increase clarity instead (see recent commit).
  5. The loop over io_group is already inside of a loop over modules using "module_to_io_group." This block is now inside the new helper method _get_module_RO_bounds() (see recent commit). I've re-checked the code, and I'm fairly certain that it already deals with the possibility of multiple io_groups being associated with one tpc. If there are specific lines which seem concerning, please let me know.

Copy link
Member

@krwood krwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good by eye. I think we should remove the older 2.3.16 layout from the data/proto_nd_flow to avoid confusion. I did not review your test modules very closely because they won't be used in the simulation & data processing routines during productions.

@edhinkle edhinkle merged commit ff584f4 into develop Dec 16, 2023
2 checks passed
@edhinkle edhinkle deleted the feature/ehinkle_geo_resource_charge_coord_fix branch October 2, 2024 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Inconsistencies in proto_nd_flow Geometry Resource Source Code Labeling
3 participants