Modeling Data Movement Performance on Heterogeneous Architectures (2010.10378v3)

Published 20 Oct 2020 in cs.DC

Abstract: The cost of data movement on parallel systems varies greatly with machine architecture, job partition, and nearby jobs. Performance models that accurately capture the cost of data movement provide a tool for analysis, allowing for communication bottlenecks to be pinpointed. Modern heterogeneous architectures yield increased variance in data movement as there are a number of viable paths for inter-GPU communication. In this paper, we present performance models for the various paths of inter-node communication on modern heterogeneous architectures, including the trade-off between GPUDirect communication and copying to CPUs. Furthermore, we present a novel optimization for inter-node communication based on these models, utilizing all available CPU cores per node. Finally, we show associated performance improvements for MPI collective operations.

Citations (9)

View on Semantic Scholar

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Modeling Data Movement Performance on Heterogeneous Architectures (2010.10378v3)

Summary

Related Papers