Explain lower GPU-aware MPI latency on Perlmutter versus Summit

Determine the factors responsible for Perlmutter's substantially lower GPU-aware MPI message latency compared to Summit, as observed in the PetscSF microbenchmarks. Place particular emphasis on identifying the relevant behaviors of IBM Spectrum MPI on Summit and on distinguishing software from hardware contributions to the observed gap.

Background

The paper measures the one-way latency of GPU-aware MPI messages between the two closest GPUs on a node, using PETSc's SF-pingpong and SF-unpack microbenchmarks on Summit, Perlmutter, Frontier, and Aurora.
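
To make concrete what these microbenchmarks measure, below is a minimal sketch of a GPU-aware ping-pong latency test in C. It is not the paper's SF-pingpong code; it assumes a CUDA-aware MPI build (e.g., Spectrum MPI launched with jsrun --smpiargs="-gpu" on Summit, or Cray MPICH with MPICH_GPU_SUPPORT_ENABLED=1 on Perlmutter), and the warm-up and iteration counts are arbitrary illustrative choices.

    /* gpu_pingpong.c: sketch of a GPU-aware MPI ping-pong between two ranks.
       Send/receive buffers live in device memory, so the MPI library itself
       handles the GPU data path; this requires a GPU-aware MPI build. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size, ndev, peer;
        int nbytes = (argc > 1) ? atoi(argv[1]) : 8; /* message size in bytes */
        const int warmup = 10, iters = 1000;
        double t0 = 0.0, t1;
        char *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size != 2) {
            if (rank == 0) fprintf(stderr, "run with exactly 2 ranks\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        /* one GPU per rank; allocate the message buffer in device memory */
        cudaGetDeviceCount(&ndev);
        cudaSetDevice(rank % ndev);
        cudaMalloc((void **)&buf, (size_t)nbytes);

        peer = 1 - rank;
        for (int i = 0; i < warmup + iters; i++) {
            if (i == warmup) {              /* start the clock after warm-up */
                MPI_Barrier(MPI_COMM_WORLD);
                t0 = MPI_Wtime();
            }
            if (rank == 0) {
                MPI_Send(buf, nbytes, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, nbytes, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else {
                MPI_Recv(buf, nbytes, MPI_CHAR, peer, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, nbytes, MPI_CHAR, peer, 0, MPI_COMM_WORLD);
            }
        }
        t1 = MPI_Wtime();

        /* one-way latency is half the average round-trip time */
        if (rank == 0)
            printf("%d bytes: %.2f us one-way\n", nbytes,
                   1e6 * (t1 - t0) / (2.0 * iters));

        cudaFree(buf);
        MPI_Finalize();
        return 0;
    }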

The authors observe that Perlmutter exhibits much lower latency than Summit for small and medium messages. While they hypothesize that newer hardware and software may contribute, they explicitly state that they lack sufficient understanding of IBM Spectrum MPI’s behavior to pinpoint the cause, leaving the precise reason unresolved.
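
One standard way to frame the software-versus-hardware question (a first-order model commonly applied to such measurements, not an analysis taken from the paper) is the latency-bandwidth model

\[ t(m) \approx \alpha + m/B, \]

where $m$ is the message size, $\alpha$ the per-message latency, and $B$ the effective bandwidth. For small messages, $t(m) \approx \alpha$, which is dominated by software costs (MPI protocol selection, handling of device pointers, progress-engine overhead), while the $m/B$ term reflects interconnect hardware. Under this model, a gap concentrated at small message sizes would suggest software-stack differences (e.g., in Spectrum MPI's messaging layer on Summit), whereas a gap that grows with message size would implicate link bandwidth.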

References

From Figure \ref{fig:SF-Pingpong}, we can see that Perlmutter has much lower latency than Summit, presumably due to its newer hardware and software (see Table \ref{tab:machines}), though we do not know enough about the workings of IBM Spectrum MPI to say exactly why.

PETSc/TAO Developments for GPU-Based Early Exascale Systems (arXiv:2406.08646, Mills et al., 12 Jun 2024), Section 4.4 ("GPU-aware MPI message passing latency on (pre-)exascale machines"), paragraph discussing Figure \ref{fig:SF-Pingpong}.