Storage Benchmarks #3
Comments
This repo has a lot of ideas to start from: https://github.com/rabernat/zarr_hdf_benchmarks. @andersy005 has been running it on Cheyenne.
I believe some of this has been addressed by #44.
There is already the https://github.com/pangeo-data/storage-benchmarks repository, which we can build on (possibly move into this repo). I think that these benchmarks should consider different formats:
I also think we need to compare these formats under their "idealized" use cases: independent I/O (each process reads from and writes to its own file) for zarr, and MPI-IO (all processes read from and write to the same file) for hdf5 and netcdf.
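To make the two access patterns concrete, here is a minimal sketch that times both on a local filesystem. It is only a stand-in: it uses threads and plain POSIX calls (`os.pwrite`, Unix only) rather than zarr, hdf5, netcdf, or real MPI-IO. The function names, worker count, and chunk size are made up for illustration and do not come from any of the repositories mentioned here.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

CHUNK_BYTES = 1 << 20  # 1 MiB per worker; arbitrary size chosen for illustration


def write_own_file(directory, rank, payload):
    """Independent I/O: each worker writes to its own file."""
    path = os.path.join(directory, f"chunk.{rank}")
    with open(path, "wb") as f:
        f.write(payload)
    return len(payload)


def write_shared_file(path, rank, payload):
    """Shared-file I/O: each worker writes its own byte range of one file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite takes an explicit offset, so workers never share a file position.
        return os.pwrite(fd, payload, rank * len(payload))
    finally:
        os.close(fd)


def run_benchmark(n_workers=4, chunk_bytes=CHUNK_BYTES):
    """Return {pattern_name: (elapsed_seconds, bytes_written)} for both patterns."""
    payloads = [bytes([rank % 256]) * chunk_bytes for rank in range(n_workers)]
    ranks = range(n_workers)
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Pattern 1: one file per worker (hypothetical stand-in for the zarr case).
        start = time.perf_counter()
        with ThreadPoolExecutor(n_workers) as pool:
            written = sum(pool.map(lambda r: write_own_file(tmp, r, payloads[r]), ranks))
        results["file_per_worker"] = (time.perf_counter() - start, written)

        # Pattern 2: one preallocated shared file (hypothetical stand-in for MPI-IO).
        shared = os.path.join(tmp, "shared.bin")
        with open(shared, "wb") as f:
            f.truncate(n_workers * chunk_bytes)
        start = time.perf_counter()
        with ThreadPoolExecutor(n_workers) as pool:
            written = sum(pool.map(lambda r: write_shared_file(shared, r, payloads[r]), ranks))
        results["shared_file"] = (time.perf_counter() - start, written)
    return results


if __name__ == "__main__":
    for name, (seconds, nbytes) in run_benchmark().items():
        print(f"{name}: {nbytes} bytes in {seconds:.4f} s")
```

Note that this never calls `fsync`, so on a local disk it mostly measures the page cache. A real benchmark would need to flush data, use separate processes or MPI ranks, and run against the actual storage system under test.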
These benchmarks should be run on different platforms and storage systems (HPC with GPFS or Lustre, AWS S3, GCS, etc.).
What all do we need for this?