Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading in a large detector from json is slow #592

Open
sjoerd-bouma opened this issue Nov 16, 2023 · 5 comments
Open

Reading in a large detector from json is slow #592

sjoerd-bouma opened this issue Nov 16, 2023 · 5 comments

Comments

@sjoerd-bouma
Copy link
Collaborator

Reading in a large detector description from a .json file is slower than expected; for the LOFAR detector description (O(1000) channels), reading in the json description using

det = Detector('LOFAR.json', antenna_by_depth=False)

takes O(1 minute). If instead we read in the json first, and then load the detector from the dictionary, this only takes O(1 s):

with open('LOFAR.json', 'r') as f:
    det_dict = json.load(f)
det = Detector(json_filename=None, source='dictionary', dictionary=det_dict, antenna_by_depth=False)

I'm not sure how the first case is currently implemented - most time is spent in tinydb_serialization._decode_deep and json.decoder.raw_decode. However, seeing as the second case is (?) equivalent to the first, we should probably change the implementation to just internally read in the json as a dictionary first, and load the detector from that.

@MijnheerD
Copy link
Collaborator

Something I do not entirely get though, is that on lines 95 and 96 of generic_detector.py it seems like the JSON file decoded and the Detector is constructed using the dictionary on line 97-99. So would the time be spent on the Query() calls then?

@anelles
Copy link
Collaborator

anelles commented Nov 21, 2023

If I am not fully mistaken @cg-laser was the one to introduce tiny_db in here. Maybe it really is too tiny for what we are doing here?
I would be open to changing it, if we can make it work seamlessly with what @jakobhenrichs and @fschlueter have been implementing for the DB-based detector.

@fschlueter
Copy link
Collaborator

@anelles, I do not think so. The RNO-G detector class uses a custom buffering "strategy", I do not think that this is transferable to this detector.

@cg-laser
Copy link
Collaborator

TinyDB is still used when giving a json object to the detector class. I second your suggestion Sjoerd. It seems to efficiently solve the problem.

@MijnheerD
Copy link
Collaborator

As a development on this issue, when loading a complete detector description the initialisation is much faster. So I think this is particular issue when trying to load a GenericDetector with lots of references, which is where I would guess it spends most of its time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants