In-Situ Techniques on GPU-Accelerated Data-Intensive Applications (2407.20731v1)
Abstract: The computational power of High-Performance Computing (HPC) systems is constantly increasing; however, their input/output (IO) performance grows relatively slowly, and their storage capacity is also limited. This imbalance presents significant challenges for applications such as Molecular Dynamics (MD) and Computational Fluid Dynamics (CFD), which generate massive amounts of data for subsequent visualization or analysis. At the same time, checkpointing is crucial for long runs on HPC clusters, due to limited walltimes and/or failures of system components, and typically requires storing large amounts of data. Thus, restricted IO performance and storage capacity can become bottlenecks for full application workflows (as opposed to computational kernels without IO). In-situ techniques, where data is processed further while still in memory rather than written out through the IO subsystem, can help to tackle these problems. In contrast to traditional post-processing methods, in-situ techniques can reduce or avoid the need to write or read data via the IO subsystem. They offer a promising approach for applications aiming to leverage the full power of large-scale HPC systems. In-situ techniques can also be applied to hybrid computational nodes on HPC systems consisting of graphics processing units (GPUs) and central processing units (CPUs). Within a node, the GPUs typically have a significant performance advantage over the CPUs. Therefore, current approaches for GPU-accelerated applications often focus on maximizing GPU usage, leaving CPUs underutilized. In-situ tasks that use the CPUs to analyze or preprocess data concurrently with the running simulation offer a way to reduce this underutilization.
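As a rough illustration of the idea described in the abstract, the following minimal Python sketch hands each simulation snapshot to a CPU worker thread, which compresses it while it is still in memory, so the main loop can proceed to the next step. It is not the authors' implementation: `simulate_step`, `in_situ_worker`, and `run` are hypothetical names, the "simulation" is a stand-in that only fills a buffer, and zlib is used simply because it ships with the Python standard library.

```python
import array
import queue
import threading
import zlib


def simulate_step(step, n=10_000):
    """Hypothetical stand-in for a GPU kernel: returns one snapshot as raw bytes."""
    values = array.array("d", ((step + i) % 100 * 0.5 for i in range(n)))
    return values.tobytes()


def in_situ_worker(tasks, results):
    """Compress snapshots while they are still in memory, off the main loop."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: the simulation has finished
            break
        step, snapshot = item
        compressed = zlib.compress(snapshot, 6)
        results[step] = (len(snapshot), len(compressed), compressed)


def run(num_steps=5):
    """Run the toy simulation with one concurrent in-situ compression worker."""
    tasks = queue.Queue(maxsize=2)  # bounded, so snapshots cannot pile up in memory
    results = {}
    worker = threading.Thread(target=in_situ_worker, args=(tasks, results))
    worker.start()
    for step in range(num_steps):
        snapshot = simulate_step(step)
        tasks.put((step, snapshot))  # hand off; the next step can start right away
    tasks.put(None)  # tell the worker to stop
    worker.join()
    return results


if __name__ == "__main__":
    for step, (raw, packed, _) in sorted(run().items()):
        print(f"step {step}: {raw} bytes -> {packed} bytes")
```

The bounded queue is a deliberate choice in this sketch: if the worker falls behind, `put` blocks the simulation instead of letting uncompressed snapshots accumulate in memory. A real workflow would have to tune that trade-off between stalling the simulation and memory use.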