@delarosatrevin mentioned that his reason for not using starfile was the inability to access the file in a paginated way, as is typical of web APIs.
This was an intentional design decision, but I'm wondering if we could expose this in a useful way from the `StarParser`, and what the API should look like?
@alisterburt, I haven't completely explored starfile, but since it is based on DataFrames I guessed it has some limitations that I have found to be common when handling/processing CryoEM star files (or metadata in general):
- Reading all data block names (if you are just opening a star file and want to know which blocks are available for further reading)
- Knowing how many rows are in each data block
- The pagination issue you mentioned: starting from some index and reading N rows
- Iterating over the rows without reading all of them into memory, which in some cases is useful for filtering and re-writing a subset of the star file
- Reading only the table definition without parsing the rows, which is useful for writing a similar table with different rows
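As a rough illustration of the operations listed above, here is a minimal pure-Python sketch that streams a star file line by line. The names `block_summary` and `iter_rows` are hypothetical and are not part of starfile or any other library. The sketch assumes loop-style blocks only and whitespace-separated values without quoted strings.

```python
import io
from itertools import islice

# Small in-memory example; any iterable of lines (e.g. an open file) works.
SAMPLE = """\
data_optics

loop_
_rlnOpticsGroup #1
_rlnVoltage #2
1 300.0

data_particles

loop_
_rlnCoordinateX #1
_rlnCoordinateY #2
_rlnOpticsGroup #3
10.0 20.0 1
11.0 21.0 1
12.0 22.0 1
13.0 23.0 1
"""


def block_summary(lines):
    """Return {block_name: row_count} in one pass, without storing any rows."""
    counts = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("data_"):
            current = line[len("data_"):]
            counts[current] = 0
        elif current is not None and line and not line.startswith(("_", "#", "loop_")):
            counts[current] += 1  # anything else inside a block is a data row
    return counts


def iter_rows(lines, block, offset=0, limit=None):
    """Lazily yield rows of `block` as dicts, skipping `offset` rows first."""
    def rows():
        in_block = False
        columns = []
        for raw in lines:
            line = raw.strip()
            if line.startswith("data_"):
                if in_block:
                    return  # the next block starts here, so stop reading
                in_block = line[len("data_"):] == block
            elif not in_block or not line or line.startswith(("#", "loop_")):
                continue
            elif line.startswith("_"):
                # Label line such as "_rlnVoltage #2": keep the name only.
                columns.append(line.split()[0].lstrip("_"))
            else:
                yield dict(zip(columns, line.split()))

    stop = None if limit is None else offset + limit
    return islice(rows(), offset, stop)


counts = block_summary(io.StringIO(SAMPLE))
page = list(iter_rows(io.StringIO(SAMPLE), "particles", offset=1, limit=2))
```

Because `iter_rows` returns an iterator, a caller can filter rows and write them out one at a time, and reading stops as soon as the requested page has been produced.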
There I separated a Table from the StarFile, which handles the parsing. Maybe you can reuse the parsing part to generate the DataFrame in the starfile library? It should be trivial to generate a DataFrame from a Table there, and it might be helpful for other types of metadata files.
I'm happy to hear any feedback or make any changes if you find inconsistencies in the API, since it is still at a very early alpha stage, although I have used it myself in emhub and some Scipion plugins.
I've seen a nice API for this in `SQLModel` that we could try to replicate here, though I'm open to other suggestions too. The idea would be a call that gets `nrows` rows after a certain `offset` from a data block called `'block'`.
Thoughts, @delarosatrevin?
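For reference, SQLModel builds paginated queries by chaining, as in `select(Hero).offset(3).limit(3)`. A comparable chainable query over a star file data block might look like the sketch below. This is only an assumption about what such an API could be: `select`, `BlockQuery`, and `run` are hypothetical names and are not part of starfile or `StarParser`. The sketch assumes loop-style blocks and whitespace-separated values.

```python
import io
from dataclasses import dataclass, replace
from itertools import islice
from typing import Optional


@dataclass(frozen=True)
class BlockQuery:
    """Hypothetical immutable query describing a page of one data block."""
    block: str
    start: int = 0
    count: Optional[int] = None

    def offset(self, n):
        # Return a new query; the original is left unchanged.
        return replace(self, start=n)

    def limit(self, n):
        return replace(self, count=n)

    def run(self, lines):
        """Stream `lines` and return the selected rows as lists of strings."""
        stop = None if self.count is None else self.start + self.count
        return list(islice(self._rows(lines), self.start, stop))

    def _rows(self, lines):
        in_block = False
        for raw in lines:
            line = raw.strip()
            if line.startswith("data_"):
                if in_block:
                    return  # reached the next block
                in_block = line[len("data_"):] == self.block
            elif in_block and line and not line.startswith(("_", "#", "loop_")):
                yield line.split()


def select(block):
    """Start a query on the data block named `block`."""
    return BlockQuery(block)


text = """\
data_block

loop_
_index #1
_label #2
1 a
2 b
3 c
4 d
5 e
"""

offset, nrows = 2, 2
rows = select("block").offset(offset).limit(nrows).run(io.StringIO(text))
```

Keeping the query object separate from the file means nothing is read until `run` is called, and reading stops once `nrows` rows have been collected.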