A More Scalable Sparse Dynamic Data Exchange (2308.13869v2)

Published 26 Aug 2023 in cs.DC

Abstract: Parallel architectures are continually increasing in performance and scale, while the underlying algorithmic infrastructure often fails to take full advantage of available compute power. Within the context of MPI, irregular communication patterns create bottlenecks in parallel applications. One common bottleneck is the sparse dynamic data exchange, often required when forming communication patterns within applications. There is a large variety of approaches for these dynamic exchanges, with optimizations typically implemented directly in parallel applications. This paper proposes a novel API within an MPI extension library, allowing applications to utilize the variety of provided optimizations for sparse dynamic data exchange methods. Further, the paper presents novel locality-aware sparse dynamic data exchange algorithms. Finally, performance results show significant speedups of up to 20x with the novel locality-aware algorithms.
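
For context on the problem the paper addresses: the classic scalable approach to the sparse dynamic data exchange is the non-blocking consensus (NBX) algorithm of Hoefler, Siebert, and Lumsdaine (2010), which locality-aware methods like those in this paper build upon. Below is a minimal C sketch of NBX under simplifying assumptions, not the paper's own implementation: each process sends one int of payload per destination, the `dest` and `dest_data` arrays are hypothetical inputs holding the local send list, and error handling is omitted.

```c
/* Minimal sketch of the non-blocking consensus (NBX) sparse dynamic
 * data exchange. Assumption: one int of payload per destination;
 * dest[] and dest_data[] are hypothetical inputs naming the local
 * send list. Error handling omitted for brevity. */
#include <mpi.h>
#include <stdlib.h>

void sdde_nbx(int n_sends, const int *dest, const int *dest_data,
              MPI_Comm comm)
{
    MPI_Request *send_reqs = malloc(n_sends * sizeof(MPI_Request));
    MPI_Request barrier_req;
    int barrier_started = 0, done = 0;

    /* Synchronous sends: completion implies the receiver has
     * matched the message, which is what makes consensus possible. */
    for (int i = 0; i < n_sends; i++)
        MPI_Issend(&dest_data[i], 1, MPI_INT, dest[i], 0, comm,
                   &send_reqs[i]);

    while (!done) {
        int flag;
        MPI_Status status;

        /* Dynamically discover incoming messages from any source. */
        MPI_Iprobe(MPI_ANY_SOURCE, 0, comm, &flag, &status);
        if (flag) {
            int recv_val;
            MPI_Recv(&recv_val, 1, MPI_INT, status.MPI_SOURCE, 0, comm,
                     MPI_STATUS_IGNORE);
            /* ... process recv_val from status.MPI_SOURCE ... */
        }

        if (!barrier_started) {
            int sent;
            MPI_Testall(n_sends, send_reqs, &sent, MPI_STATUSES_IGNORE);
            if (sent) {
                /* All local sends matched; join the consensus. */
                MPI_Ibarrier(comm, &barrier_req);
                barrier_started = 1;
            }
        } else {
            /* Barrier completion: every process's sends are matched,
             * so no more messages are in flight. */
            MPI_Test(&barrier_req, &done, MPI_STATUS_IGNORE);
        }
    }
    free(send_reqs);
}
```

The synchronous send is the key design choice: its completion guarantees the matching receive has started, so a process may safely join the non-blocking barrier once its sends complete, and completion of that barrier signals global agreement that all dynamically discovered messages have been delivered.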
