Lower all-to-all communication volume in fft transposes #110

JHopeCollins · 2023-03-28T16:03:37Z

Currently we transpose a complex array before and after the FFT/IFFT in the preconditioner.
The forward transpose before the FFT and the backward transpose after the IFFT could operate on real arrays instead, which would halve the communication volume of those two transposes. This would require creating a second pencil/transfer for the real transposes.
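
A rough illustration of the saving, using only NumPy and an arbitrary block shape chosen for this sketch (not taken from the preconditioner): a real array occupies half the bytes of a complex array of the same shape, so a transpose that moves real data has half as much to send.

```python
import numpy as np

# Hypothetical local block shape, chosen only for illustration.
shape = (8, 1024)

real_block = np.zeros(shape, dtype=np.float64)
complex_block = np.zeros(shape, dtype=np.complex128)


def transpose_bytes(block):
    """Upper bound on the bytes an all-to-all transpose moves for this block
    (assumes every entry leaves the rank)."""
    return block.nbytes


ratio = transpose_bytes(real_block) / transpose_bytes(complex_block)
print(f"real: {real_block.nbytes} B, complex: {complex_block.nbytes} B, ratio: {ratio}")
```

This only counts bytes; whether the wall-clock time also halves depends on how latency-bound the transposes are.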

JHopeCollins · 2023-07-24T09:44:51Z

The mpi4py-fft module uses Alltoallw for the transpose, which allows different lengths and datatypes for each rank. As far as I know this is rarely optimised by vendors because it's so general, so it is usually just implemented as one big round of isend/irecv.
If we can change this to Alltoallv then performance might improve, because Alltoallv is more likely to be optimised for proper n log n collective performance.
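
A minimal sketch of what switching would involve, under some assumptions: Alltoallv needs a count and a displacement per rank, and the data for each destination must be contiguous in the send buffer, so a pack step (not shown) is assumed here. The helper names `block_sizes` and `alltoallv_layout` are hypothetical, and the layout is a simple 2D slab transpose from a row-distributed to a column-distributed array, not the actual layout used in the preconditioner.

```python
import numpy as np


def block_sizes(n, nranks):
    """Split n items over nranks as evenly as possible (first ranks get the extras)."""
    base, extra = divmod(n, nranks)
    return [base + (1 if r < extra else 0) for r in range(nranks)]


def alltoallv_layout(nrows, ncols, nranks, rank):
    """Counts and displacements (in elements) for a slab transpose.

    Assumes the send buffer is already packed so that the block destined
    for each rank is contiguous.
    """
    rows = block_sizes(nrows, nranks)
    cols = block_sizes(ncols, nranks)
    # Before: this rank owns rows[rank] full rows; rank d needs cols[d] columns of them.
    sendcounts = np.array([rows[rank] * cols[d] for d in range(nranks)])
    sdispls = np.concatenate(([0], np.cumsum(sendcounts)[:-1]))
    # After: this rank owns cols[rank] full columns; rank s supplies rows[s] rows of them.
    recvcounts = np.array([rows[s] * cols[rank] for s in range(nranks)])
    rdispls = np.concatenate(([0], np.cumsum(recvcounts)[:-1]))
    return sendcounts, sdispls, recvcounts, rdispls


sendcounts, sdispls, recvcounts, rdispls = alltoallv_layout(10, 7, nranks=4, rank=0)
print(sendcounts, sdispls, recvcounts, rdispls)

# With mpi4py the call would then look something like:
#   comm.Alltoallv([sendbuf, (sendcounts, sdispls), MPI.DOUBLE],
#                  [recvbuf, (recvcounts, rdispls), MPI.DOUBLE])
```

The extra pack/unpack copies are the price of using Alltoallv, since Alltoallw can describe strided blocks directly with derived datatypes, so the benchmark would need to show that the faster collective outweighs them.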

JHopeCollins · 2023-07-24T09:46:24Z

- Benchmark current implementation (all complex transposes and Alltoallw)
- Benchmark reduced transpose volume (with Alltoallw)
  - real-complex transposes
  - reduced precision
- Benchmark Alltoallv (all complex transposes and Alltoallv)
- Benchmark both modifications (lower communication volume and Alltoallv)

JHopeCollins · 2024-02-08T10:39:54Z

We could also try communicating in lower precision. All of the computation is done with PETSc, so we are limited to double precision there (or whatever precision PETSc has been compiled with), but the transposes are just mpi4py calls on NumPy arrays. We could do these (and the FFT/IFFT) in lower precision to reduce the communication volume further.
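
A small sketch of the trade-off, assuming the data is cast down just before the transpose and cast back up afterwards (the array below is random stand-in data, not output from the preconditioner): the communication volume halves, at the cost of a relative error around single-precision roundoff.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data standing in for one rank's block of the complex array.
x = rng.standard_normal((8, 1024)) + 1j * rng.standard_normal((8, 1024))

x_low = x.astype(np.complex64)         # cast down before communicating
x_back = x_low.astype(np.complex128)   # cast back up for the PETSc side

volume_ratio = x_low.nbytes / x.nbytes
rel_error = np.linalg.norm(x_back - x) / np.linalg.norm(x)
print(f"volume ratio: {volume_ratio}, relative error: {rel_error:.2e}")
```

Whether an error of this size is acceptable would depend on how accurately the preconditioner needs to be applied, which the benchmarks would have to establish.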

JHopeCollins added the "Core functionality" label (Adding to the main paradiag functionality) Mar 28, 2023

JHopeCollins changed the title ~~Lower all-to-all communication volume~~ Lower all-to-all communication volume in fft transposes Mar 28, 2023

JHopeCollins added the "performance" label (Improving or fixing the computational performance) Jun 22, 2023

JHopeCollins self-assigned this Jun 26, 2023

JHopeCollins linked a pull request Jun 27, 2023 that will close this issue

WIP: Only transpose real-valued data where possible #123

Open
