
Experience and Analysis of Scalable High-Fidelity Computational Fluid Dynamics on Modular Supercomputing Architectures (2405.05640v1)

Published 9 May 2024 in cs.DC, cs.MS, and physics.flu-dyn

Abstract: The never-ending computational demand from simulations of turbulence makes computational fluid dynamics (CFD) a prime application use case for current and future exascale systems. High-order finite element methods, such as the spectral element method, have been gaining traction as they offer high performance on both multicore CPUs and modern GPU-based accelerators. In this work, we assess how high-fidelity CFD using the spectral element method can exploit the modular supercomputing architecture (MSA) at scale through domain partitioning, where the computational domain is split between a Booster module powered by GPUs and a Cluster module with conventional CPU nodes. We investigate several different flow cases and MSA-based computer systems. We observe that for our simulations, the communication overhead and load-balancing issues incurred by combining different computing architectures seldom make the split worthwhile, especially when I/O is also considered; however, when the simulation at hand requires more than the combined global memory on the GPUs, utilizing additional CPUs to increase the available memory can be fruitful. We support our results with a simple performance model that assesses when running across modules might be beneficial. As MSA becomes more widespread and efforts to increase system utilization grow more important, our results give insight into when and how a monolithic application can spread across more than one module and obtain a faster time to solution.
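The abstract refers to a simple performance model for judging when cross-module execution pays off. The paper's actual model is not reproduced on this page; the sketch below is a hypothetical back-of-envelope version, assuming a fixed per-step coupling overhead (t_comm) for the inter-module halo exchange and per-module throughputs (rate_gpu, rate_cpu). All names and numbers are illustrative assumptions, not values from the paper.

```python
# Hypothetical model for modular (Booster + Cluster) execution of a
# domain-partitioned spectral element simulation. Illustrative only.

def step_time_single(n_elements: int, rate: float) -> float:
    """Time per timestep when all elements run on one module."""
    return n_elements / rate

def step_time_split(n_elements: int, frac_gpu: float,
                    rate_gpu: float, rate_cpu: float,
                    t_comm: float) -> float:
    """Time per timestep when the domain is split between modules.

    Both modules advance their partitions concurrently, so the step
    time is the slower of the two, plus a fixed coupling overhead
    for the inter-module exchange.
    """
    t_gpu = frac_gpu * n_elements / rate_gpu
    t_cpu = (1.0 - frac_gpu) * n_elements / rate_cpu
    return max(t_gpu, t_cpu) + t_comm

def best_gpu_fraction(rate_gpu: float, rate_cpu: float) -> float:
    """Load-balanced split: assign work proportionally to throughput."""
    return rate_gpu / (rate_gpu + rate_cpu)

if __name__ == "__main__":
    n = 1_000_000              # spectral elements (illustrative)
    r_gpu, r_cpu = 50e3, 5e3   # elements/s per module (illustrative)
    t_comm = 2.0               # coupling overhead per step, s (illustrative)

    f = best_gpu_fraction(r_gpu, r_cpu)
    t_gpu_only = step_time_single(n, r_gpu)
    t_split = step_time_split(n, f, r_gpu, r_cpu, t_comm)
    print(f"GPU-only: {t_gpu_only:.1f} s/step, split: {t_split:.1f} s/step")
```

With these illustrative numbers the split step comes out slower than the GPU-only step, since the modest CPU contribution does not outweigh the coupling overhead. That mirrors the paper's qualitative finding that cross-module runs mainly pay off when the problem no longer fits in the GPUs' combined memory.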

