Revived proposal for representing BlueskyRuns in Tiled #824

Open
danielballan opened this issue Dec 6, 2024 · 7 comments

@danielballan
Member

danielballan commented Dec 6, 2024

Brief summary of discussions with @padraic-shafer, @genematx, and @tacaswell, drawing on earlier discussions with @dylanmcreynolds and @whs92.


Goals:

  • Provide a stable URL to a given column of data, regardless of whether it is in-line in the Event documents or external.
  • Data in a given stream should be presented to the user in a flat namespace, not placing in the URL path the details about internal versus external storage.
  • Server should reveal to clients how the data is grouped into tables and arrays underneath, facilitating efficient chunked access (i.e. not separate requests for each column).
  • It should be possible to add "views" of the data in a BlueskyRun, such as simplified or flatter layouts. (NeXus does something similar.) A canonical example is reshaping a mapping scan (1D in the canonical Bluesky representation) into a more scientifically relevant N-dimensional view.
  • Lay the groundwork for eventually providing access to auxiliary information (e.g. proxied from Archiver Appliance).
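
To make the reshaping goal concrete, here is a minimal sketch, assuming a raster scan over a regular grid whose events were recorded row by row. The function name `reshape_to_grid` and the sample readings are hypothetical and are not part of Tiled or Bluesky; a real view would more likely operate on arrays than on nested lists.

```python
from math import prod


def reshape_to_grid(flat, shape):
    """Nest a flat, row-major sequence of readings into the given N-d shape."""
    if len(flat) != prod(shape):
        raise ValueError(f"cannot reshape {len(flat)} readings into {shape}")
    if len(shape) == 1:
        return list(flat)
    # Number of readings in each slice along the first axis.
    step = prod(shape[1:])
    return [
        reshape_to_grid(flat[i * step:(i + 1) * step], shape[1:])
        for i in range(shape[0])
    ]


# Six events from a hypothetical 2 x 3 mapping scan, stored as one 1D column.
detector = [10, 11, 12, 20, 21, 22]
grid = reshape_to_grid(detector, (2, 3))
```

The canonical 1D column stays untouched; the view is only a second way of indexing the same readings.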

Revive #668, which proposed two ideas:

  • A new structure family named "consolidated" (formerly "union"), which lets us represent a Bluesky stream backed by a combination of tables and arrays
  • A concept of "views", alternate (simplified...) layouts of the canonical data
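
As a rough illustration of the first idea, the toy class below puts several parts behind one flat namespace while still exposing which part holds each key. This is a sketch under stated assumptions, not Tiled's implementation: the class name `ConsolidatedStream`, the method `part_of`, and the part names `events` and `image` are all hypothetical.

```python
class ConsolidatedStream:
    """Toy model: several parts (tables or arrays) behind one flat namespace."""

    def __init__(self, parts):
        # parts maps a part name to a dict of {key: column values}.
        self.parts = parts
        self._index = {}
        for part_name, columns in parts.items():
            for key in columns:
                if key in self._index:
                    raise ValueError(f"key {key!r} appears in more than one part")
                self._index[key] = part_name

    def __getitem__(self, key):
        # Flat lookup: the caller never names the part.
        return self.parts[self._index[key]][key]

    def keys(self):
        return list(self._index)

    def part_of(self, key):
        # Reveal the grouping so a client can fetch a whole part at once.
        return self._index[key]


stream = ConsolidatedStream(
    {
        "events": {"time": [0.0, 1.0], "motor": [1.5, 2.5]},  # in-line table
        "image": {"image": [[0, 1], [2, 3]]},  # externally stored array
    }
)
```

Here `stream["motor"]` and `stream["image"]` are addressed the same way even though they live in different parts, and `stream.part_of("motor")` tells a client that `time` and `motor` can be fetched together.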

Both suggestions were developed and viewed positively at the time, but set aside back in March to make progress on other things. Now, we propose to implement both.

In URLs:

# Bluesky stuff...
.../<uuid>/streams/<stream name>  # "consolidated" structure family
.../<uuid>/streams/<stream name>?part=<part>  # table of columns stored together, or array
.../<uuid>/streams/<stream name>/<key>  # array
.../<uuid>/config/<stream name>/<obj name>  # table of configuration readings

# New concept: simplified/rearranged "views":
.../<uuid>/views/<view name>/...  # server-side registerable views

# New concept: auxiliary info (e.g. from EPICS Archiver Appliance)
.../<uuid>/auxiliary/<aux name>/...  # information from outside Tiled
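
For illustration only, the stream URLs above could be assembled by a small helper like the one below. The function name `stream_path` is hypothetical, and the leading part of the URL is left out because the proposal elides it; only the layout after the run's uuid comes from the scheme above.

```python
from urllib.parse import quote, urlencode


def stream_path(uuid, stream, key=None, part=None):
    """Build the path for a stream, one of its parts, or one of its keys."""
    if key is not None and part is not None:
        raise ValueError("pass either key or part, not both")
    path = f"/{quote(uuid, safe='')}/streams/{quote(stream, safe='')}"
    if key is not None:
        # A single column, addressed the same way wherever it is stored.
        return f"{path}/{quote(key, safe='')}"
    if part is not None:
        # A group of columns stored together, selected by query parameter.
        return f"{path}?{urlencode({'part': part})}"
    return path
```

Note that the key lives in the path while the part lives in the query string, so the path to a column does not change if the column moves between internal and external storage.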

In Python API:

run.streams["primary"].read()  # xarray.Dataset
run.streams["primary"]["image"]  # array
run.config["primary"]["fccd"].read()  # table of configuration readings

run.views["simple"].read()
# and equivalently...
run.read(view="simple")  # might even be the default for run.read(), which currently raises...
run.aux["archiver"]["PV..."].read()  # array
@dylanmcreynolds
Contributor

Nice summary. I haven't really thought about this auxiliary concept. It's fascinating to me but feels lower priority than the other endpoints?

@danielballan
Member Author

Yes, we fully agree. It is a new concept that came up this week, included only to buttress the idea that having a namespace in that position (streams, config, views, aux) opens up multiple potentially interesting paths.

@whs92
Member

whs92 commented Dec 10, 2024

I think the ability to also easily access archive data is amazing.

I think your proposed namespace makes sense.

@tacaswell
Contributor

We had the ability to enrich a bluesky run with additional (synthetic) streams from the archiver appliance in databroker v0:
https://github.com/bluesky/databroker/blob/main/databroker/eventsource/archiver.py .

@danielballan
Member Author

I like that this scopes the auxiliary data (e.g. archiver) inside the BlueskyRun (good!) but outside the streams. I think the distinction between "a part of the original document stream" and "stapled on later for convenience" is worth surfacing at this level.

@callumforrester

This looks interesting! Could the views concept be considered analogous to NeXus application definitions? If so, it opens the door to things we struggled to do in NeXus-land, such as a stricter schema.

@dylanmcreynolds
Contributor

@callumforrester that was definitely something we were thinking when we were talking about it a while back. I think of views as sort of a replacement for Databroker's projection/projector facility, where you could map to one or more ontologies. Views have the advantage of being a little deeper in the data model. Projection really just affected the resulting xarray that Databroker produced.

5 participants