Explain lower GPU‑aware MPI latency on Perlmutter versus Summit
Determine which factors account for Perlmutter’s substantially lower GPU‑aware MPI message latency relative to Summit, as observed in the PETScSF microbenchmarks. In particular, identify the relevant behaviors of IBM Spectrum MPI on Summit, and separate the software contributions to the observed gap from the hardware contributions.
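A possible first step is to fit the standard latency-bandwidth (alpha-beta) model, t(n) = alpha + beta * n, to ping-pong timings from each machine, so that the per-message startup cost alpha is isolated from the bandwidth term beta. The following is a minimal sketch under the assumption that this linear model is adequate. The function name `fit_alpha_beta` is hypothetical, and the timings are synthetic placeholders, not measurements from the paper. The fit does not by itself attribute alpha to software or hardware; it only shows whether the gap lies in startup cost or in bandwidth.

```python
def fit_alpha_beta(sizes, times):
    """Ordinary least-squares fit of t = alpha + beta * n.

    sizes: message sizes in bytes.
    times: one-way message times in seconds, one per size.
    Returns (alpha, beta): startup latency in seconds and
    inverse bandwidth in seconds per byte.
    """
    n = len(sizes)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (size, time) pairs of equal length")
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("message sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Synthetic timings (NOT data from the paper), generated from
# alpha = 5 microseconds and beta = 1e-10 s/byte, to sanity-check the fit.
sizes = [8, 1024, 65536, 1048576]
times = [5e-6 + 1e-10 * s for s in sizes]
alpha, beta = fit_alpha_beta(sizes, times)
print(f"alpha = {alpha * 1e6:.2f} us, bandwidth = {1 / beta / 1e9:.2f} GB/s")
```

Applying the same fit to measured ping-pong times from both machines would indicate whether the difference is concentrated in alpha. Attributing that difference to the MPI implementation rather than to the interconnect or GPU hardware would still require further experiments, such as varying one component at a time.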
References
From Figure \ref{fig:SF-Pingpong}, we can see that Perlmutter has much lower latency than Summit, presumably due to its newer hardware and software (see Table \ref{tab:machines}), though we do not know enough about the workings of IBM Spectrum MPI to say exactly why.
— PETSc/TAO Developments for GPU-Based Early Exascale Systems
(2406.08646 - Mills et al., 12 Jun 2024) in Section 4.4 (GPU-aware MPI message passing latency on (pre-)exascale machines), paragraph discussing Figure \ref{fig:SF-Pingpong}