By the way, my colleague (@halehawk) is working in a fork of this repository (https://github.com/NCAR/benchmarking) and plans to merge it at some point in the future. One of the things @halehawk is working on is a platform service to hold all benchmarking results and plots submitted by other people using the same benchmarking utility. She has also done some work to address other issues in this repo.
Anyway, I am hopeful that after the merge, we can collaborate on this and maybe get some benchmarking measurements with Dask+Infiniband!
@tinaok @kmpaul, it is a good idea. I am just wondering whether Dask works with an
RDMA-optimised communication library. If not, how much effort would be needed to
make that available?
On Tue, Jan 26, 2021 at 10:07 AM Kevin Paul wrote:
This is great, @tinaok! Thanks for the ping.
@halehawk: Yes. It sounds like (@tinaok, correct me if I'm wrong) the new Dask+Infiniband work will use RDMA optimization, which could be a huge benefit!
A basic installation of Pangeo on an Infiniband cluster uses TCP/IP communication, so it does not benefit from the interconnect's full speed and bandwidth. Using RDMA connections between Dask clients running on an Infiniband-based cluster should speed up their communication.
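To make the TCP versus RDMA distinction concrete, here is a minimal sketch of how the transport could be selected when starting a Dask cluster. The helper names `comm_options` and `start_client`, the mode names, and the interface name `ib0` are assumptions for illustration only. The `ucx` mode additionally assumes that a UCX-enabled Dask communication backend (for example UCX-Py) is installed, which may not be the case on a given system.

```python
# Sketch only: mode names, helper names and the "ib0" interface are
# hypothetical; adapt them to the actual cluster.

def comm_options(mode: str, ib_interface: str = "ib0") -> dict:
    """Return keyword arguments for dask.distributed.LocalCluster."""
    if mode == "tcp":
        # Plain TCP; Dask chooses the network interface itself.
        return {"protocol": "tcp"}
    if mode == "tcp-ipoib":
        # Still TCP, but bound to the IP-over-Infiniband interface.
        return {"protocol": "tcp", "interface": ib_interface}
    if mode == "ucx":
        # UCX transport, which can use RDMA on Infiniband hardware.
        # Assumes a UCX-enabled Dask backend is installed.
        return {"protocol": "ucx", "interface": ib_interface}
    raise ValueError(f"unknown mode: {mode!r}")


def start_client(mode: str, n_workers: int = 2):
    """Start a local cluster with the chosen transport and connect to it."""
    # Imported lazily so comm_options() works without Dask installed.
    from dask.distributed import Client, LocalCluster

    cluster = LocalCluster(n_workers=n_workers, **comm_options(mode))
    return Client(cluster)
```

A `LocalCluster` runs on a single node, so it only checks that the chosen protocol starts at all. On a multi-node cluster the same `protocol` and `interface` settings would have to be passed to whatever launches the scheduler and workers.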
There are benchmarks on Infiniband clusters with GPUs using UCX-Py or MPI4Dask (https://blog.dask.org/2019/06/09/ucx-dgx, https://www.hpcadvisorycouncil.com/events/2020/australia-conference/pdf/HighPerfDeepMachineLearnonHPCSyst_010920_DKPanda.pdf, slides 46-47, http://hibd.cse.ohio-state.edu/features/#mpi4dask).
Our Pangeo benchmark is CPU-based, and the results in our repo come from Infiniband-based HPC clusters. Communication-bound Pangeo benchmarks (such as rechunking) may see a speedup.
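As a rough illustration of what a communication-bound measurement could look like, the sketch below times a rechunk from column blocks to row blocks, which forces data to move between workers. This is not the benchmark in this repo. The function names, array size, and chunk size are hypothetical, and the workload assumes `dask` and `numpy` are installed.

```python
import time


def time_call(fn, repeats=3):
    """Run fn() several times and return the wall-clock time of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def rechunk_workload(n=4096, chunk=512):
    """Rechunk an n x n random array from column blocks to row blocks."""
    # Imported lazily so time_call() works without Dask installed.
    import dask.array as da

    x = da.random.random((n, n), chunks=(n, chunk))  # column blocks
    y = x.rechunk((chunk, n))  # row blocks: every output needs every input
    return y.sum().compute()
```

Running `time_call(rechunk_workload)` once with a client connected over TCP and once over an RDMA-capable transport would give a first comparison. The array has to be large enough that data transfer, rather than task scheduling overhead, dominates the run time.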