Lower all-to-all communication volume in fft transposes #110
Labels
Core functionality
Adding to the main paradiag functionality
performance
Improving or fixing the computational performance
Currently we transpose a complex array before and after the FFT/IFFT in the preconditioner.
The forward transpose before the FFT and the backward transpose after the IFFT could act on real arrays instead, since the data is real at those two points. A real array is half the size of a complex array of the same shape and precision, so this would halve the communication volume of those transposes. It would need a second pencil/transfer to be created for the real transposes.
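A minimal serial sketch of the idea, using NumPy only. It is not the paradiag implementation: the array sizes, the `transpose_bytes` helper, and the use of `.T` as a stand-in for the distributed all-to-all transpose are all assumptions made for illustration. It checks that transposing the real data first and casting to complex afterwards gives the same FFT result, and that the transposed buffer is half the size.

```python
import numpy as np


def transpose_bytes(nt, nx, dtype):
    """Bytes in one (nt, nx) array of `dtype`; a rough proxy for all-to-all volume."""
    return nt * nx * np.dtype(dtype).itemsize


# Hypothetical sizes: nt timesteps, nx spatial degrees of freedom.
nt, nx = 8, 16
rng = np.random.default_rng(0)
u = rng.standard_normal((nt, nx))  # real data in the time-slab layout

# Current approach: cast to complex, transpose (.T stands in for the
# distributed transpose), then FFT along the time axis.
fwd_complex = np.fft.fft(u.astype(np.complex128).T, axis=1)

# Proposed approach: transpose the real array, then cast and FFT.
fwd_real = np.fft.fft(u.T.astype(np.complex128), axis=1)

# Backward direction: after the IFFT the result is real up to round-off,
# so the real part can be taken before transposing back.
back = np.fft.ifft(fwd_real, axis=1).real.T

same_forward = np.allclose(fwd_complex, fwd_real)
round_trip_ok = np.allclose(back, u)
volume_ratio = (transpose_bytes(nt, nx, np.float64)
                / transpose_bytes(nt, nx, np.complex128))
```

In a distributed setting the two `.T` calls on real data would correspond to the second, real-valued pencil/transfer mentioned above, while the complex transposes between the FFT and IFFT would stay as they are.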