Massive parallelization and performance enhancement of an immersed boundary method based unsteady flow solver (2402.17337v1)
Abstract: Advances in high-performance computing hardware and software frameworks have made high-fidelity simulations of unsteady fluid flow feasible. Because computational fluid dynamics (CFD) computations are dominated by linear algebra routines, they can be significantly accelerated through massive parallelization on graphics processing units (GPUs). GPU implementations of high-fidelity CFD solvers are therefore essential for reducing turnaround time and enabling quicker design space exploration. In the present work, an in-house flow solver based on the immersed boundary method (IBM) has been ported to the GPU using OpenACC, a compiler directive-based heterogeneous parallel programming framework. Of the GPU porting pathways available, OpenACC was chosen for its minimal code intrusion, low development time, and close similarity to OpenMP, a directive-based shared-memory programming framework. A detailed validation study and a performance analysis of the parallel solver implementations on the CPU and GPU are presented. The GPU implementation achieves a speedup of up to $O(10)$ over the parallel CPU version and up to $O(10^2)$ over the serial code. It also scales well with increasing mesh size owing to efficient utilization of the GPU processor cores.
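The similarity between OpenMP and OpenACC noted in the abstract can be illustrated with a minimal sketch. It is not taken from the solver: the routine names `axpy_openmp` and `axpy_openacc`, the choice of C, and the vector update $y \leftarrow a x + y$ are assumptions made purely for illustration. In both versions the loop body is unchanged and only the directive differs, which is what "minimal code intrusion" refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example routine (not from the solver): y <- a*x + y.
 * OpenMP version: iterations are shared among CPU threads. */
void axpy_openmp(int n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* OpenACC version of the same loop: the directive requests offload to an
 * accelerator; the data clauses copy x to the device and y both ways. */
void axpy_openacc(int n, double a, const double *x, double *y)
{
    assert(x != NULL && y != NULL);
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

A compiler without OpenMP or OpenACC support enabled ignores the pragmas and builds both routines as ordinary serial loops, so the same source serves as the serial reference. The directives are enabled with, for example, `-fopenmp` or `-fopenacc` in GCC, or `-acc` in the NVIDIA HPC compilers.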