On the accuracy and performance of the lattice Boltzmann method with 64-bit, 32-bit and novel 16-bit number formats (2112.08926v2)

Published 16 Dec 2021 in physics.comp-ph, cond-mat.stat-mech, cs.DC, physics.bio-ph, and physics.flu-dyn

Abstract: Fluid dynamics simulations with the lattice Boltzmann method (LBM) are very memory-intensive. Alongside reduction in memory footprint, significant performance benefits can be achieved by using FP32 (single) precision compared to FP64 (double) precision, especially on GPUs. Here, we evaluate the possibility to use even FP16 and Posit16 (half) precision for storing fluid populations, while still carrying arithmetic operations in FP32. For this, we first show that the commonly occurring number range in the LBM is a lot smaller than the FP16 number range. Based on this observation, we develop novel 16-bit formats - based on a modified IEEE-754 and on a modified Posit standard - that are specifically tailored to the needs of the LBM. We then carry out an in-depth characterization of LBM accuracy for six different test systems with increasing complexity: Poiseuille flow, Taylor-Green vortices, Karman vortex streets, lid-driven cavity, a microcapsule in shear flow (utilizing the immersed-boundary method) and finally the impact of a raindrop (based on a Volume-of-Fluid approach). We find that the difference in accuracy between FP64 and FP32 is negligible in almost all cases, and that for a large number of cases even 16-bit is sufficient. Finally, we provide a detailed performance analysis of all precision levels on a large number of hardware microarchitectures and show that significant speedup is achieved with mixed FP32/16-bit.

Citations (13)

Summary

  • The paper investigates how 64-bit, 32-bit, and novel 16-bit number formats affect LBM accuracy and performance, finding that reduced precision suffices for many fluid dynamics simulations.
  • Storing populations in custom 16-bit formats while computing in FP32 yields significant speedups and roughly 45% lower memory use compared to FP32/32-bit, enabling more complex simulations.
  • Reduced and mixed precision techniques improve efficiency and memory use and better exploit GPU capabilities, enabling larger LBM simulations.

Overview of Lattice Boltzmann Method Precision Analysis

This paper presents a comprehensive analysis of how floating-point precision affects the accuracy and performance of the lattice Boltzmann method (LBM), focusing on 64-bit, 32-bit, and novel 16-bit formats. The objective is to mitigate the memory-intensive nature of LBM fluid dynamics simulations through precision reduction, particularly on GPUs, where lower precision yields large gains in computational efficiency.

Key Results and Contributions

The authors first establish that the range of numbers occurring in LBM simulations is much smaller than the range covered by standard FP16 (half precision). Based on this insight, they develop novel 16-bit number formats, one based on a modified IEEE-754 layout and one on a modified Posit standard, tailored specifically to the LBM. These formats reduce memory usage while maintaining accurate simulation results.
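To make the idea concrete, here is a minimal sketch of such a tailored 16-bit float: bits saved on an unneeded exponent range are reinvested in the mantissa. The specific layout (1 sign bit, 4 exponent bits, 11 mantissa bits), bias, and rounding behavior are illustrative assumptions for this sketch, not the paper's exact specification.

```cpp
// Illustrative custom 16-bit float for a narrow number range (assumed layout:
// 1 sign bit, 4 exponent bits, 11 mantissa bits) -- not the paper's exact format.
#include <cstdint>
#include <cstdio>
#include <cstring>

static uint16_t float_to_custom16(float x) {
    uint32_t b; std::memcpy(&b, &x, 4);                 // raw FP32 bits
    const uint32_t sign = (b >> 16) & 0x8000u;          // FP32 sign -> bit 15
    int32_t e = int32_t((b >> 23) & 0xFFu) - 127 + 15;  // rebias: assumed bias 15
    uint32_t m = ((b & 0x007FFFFFu) >> 12)              // keep top 11 mantissa bits
               + ((b >> 11) & 1u);                      // round to nearest
    if (m >> 11) { m = 0; ++e; }                        // rounding carried into exponent
    if (e >= 15) return uint16_t(sign | (14u << 11) | 0x7FFu); // overflow: clamp to max
    if (e <= 0)  return uint16_t(sign);                 // underflow: flush to signed zero
    return uint16_t(sign | (uint32_t(e) << 11) | m);
}

static float custom16_to_float(uint16_t h) {
    const uint32_t e = (h >> 11) & 0xFu;                // 4-bit exponent
    uint32_t b = (uint32_t(h) & 0x8000u) << 16;         // sign (signed zero if e == 0)
    if (e != 0) b |= (((e - 15u + 127u) << 23)          // rebias back to FP32
                   |  ((uint32_t(h) & 0x7FFu) << 12));  // widen 11-bit mantissa
    float out; std::memcpy(&out, &b, 4);
    return out;
}

int main() {
    const float f = 0.0123456f;                         // typical small LBM magnitude
    const uint16_t h = float_to_custom16(f);
    std::printf("%.7f -> 0x%04x -> %.7f\n", f, unsigned(h), custom16_to_float(h));
}
```

Compared with IEEE FP16 (5 exponent bits, 10 mantissa bits), this layout gains one mantissa bit of accuracy at the cost of dynamic range, precisely the trade that is safe when the stored values are known to occupy a narrow range.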

The paper evaluates these precision formats across six test systems of increasing complexity: Poiseuille flow, Taylor-Green vortices, Kármán vortex streets, lid-driven cavity flow, a microcapsule in shear flow, and the impact of a raindrop. The analyses reveal that the difference in accuracy between FP64 and FP32 is negligible in most cases, and that the customized 16-bit formats suffice in many scenarios. This implies that mixed precision approaches, i.e., FP32 arithmetic with 16-bit data storage, can deliver significantly higher computational speed without substantial loss of accuracy.
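The sketch below illustrates this storage-versus-arithmetic split for a single D2Q9 lattice node with BGK collision. Plain IEEE FP16 (via the `_Float16` type available as an extension in recent GCC and Clang) stands in for the paper's storage formats; swapping in a custom format would only change the two conversion points.

```cpp
// Mixed precision pattern: populations *stored* in 16 bits, all *arithmetic* in FP32.
// Single-node D2Q9 BGK collision, purely illustrative.
#include <array>
#include <cstdio>

using half = _Float16;                       // 16-bit storage type (compiler extension)

constexpr int   Q     = 9;
constexpr float w[Q]  = {4.f/9, 1.f/9, 1.f/9, 1.f/9, 1.f/9,
                         1.f/36, 1.f/36, 1.f/36, 1.f/36};    // lattice weights
constexpr int   cx[Q] = {0, 1, 0, -1, 0, 1, -1, -1, 1};      // lattice velocities
constexpr int   cy[Q] = {0, 0, 1, 0, -1, 1, 1, -1, -1};

// One BGK collision step: 16-bit in, FP32 math, 16-bit out.
void collide(std::array<half, Q>& f_stored, float omega) {
    float f[Q], rho = 0.f, ux = 0.f, uy = 0.f;
    for (int i = 0; i < Q; ++i) {
        f[i] = float(f_stored[i]);           // widen to FP32 once per population
        rho += f[i];
        ux  += f[i] * cx[i];
        uy  += f[i] * cy[i];
    }
    ux /= rho; uy /= rho;
    const float usq = ux*ux + uy*uy;
    for (int i = 0; i < Q; ++i) {
        const float cu  = cx[i]*ux + cy[i]*uy;
        const float feq = w[i]*rho*(1.f + 3.f*cu + 4.5f*cu*cu - 1.5f*usq); // equilibrium
        f[i] += omega * (feq - f[i]);        // BGK relaxation, all in FP32
        f_stored[i] = half(f[i]);            // narrow back to 16 bits for storage
    }
}

int main() {
    std::array<half, Q> f;
    for (int i = 0; i < Q; ++i) f[i] = half(w[i]);  // rest state, rho = 1, u = 0
    collide(f, 1.0f);
    std::printf("f0 after collision: %f\n", double(float(f[0])));
}
```

This pattern keeps all arithmetic in fast, universally supported FP32 while halving the memory traffic for the populations, which is where the bandwidth-bound LBM spends most of its time.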

Numerical Results

The paper shows that mixed FP32/16-bit precision yields significant speedups across a wide range of hardware architectures, accompanied by a marked reduction in memory footprint of approximately 45% when moving from FP32/32-bit to FP32/16-bit. This matters especially on GPUs: the LBM is bound by memory bandwidth rather than arithmetic throughput, so storing populations in 16 bits while computing in FP32 translates the smaller footprint almost directly into higher performance.
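As a rough plausibility check for that figure, consider an illustrative per-node memory budget (the field layout here is an assumption for this sketch, not the paper's exact accounting): a D3Q19 lattice stores 19 populations per node once, plus full-width auxiliary fields such as density (4 B), velocity (12 B), and a flag byte.

```latex
% Illustrative D3Q19 per-node memory arithmetic (assumed layout)
\underbrace{19 \times 4\,\mathrm{B}}_{\text{FP32 populations}} + 17\,\mathrm{B}
  = 93\,\mathrm{B/node}, \qquad
\underbrace{19 \times 2\,\mathrm{B}}_{\text{16-bit populations}} + 17\,\mathrm{B}
  = 55\,\mathrm{B/node}, \qquad
1 - \tfrac{55}{93} \approx 41\%.
```

The reduction lands near, rather than exactly at, 50% because the auxiliary fields stay at full width; the precise figure depends on which fields are kept resident.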

Furthermore, 16-bit storage proved adequate in many of the tested setups, reinforcing the potential for performance gains without sacrificing accuracy. The authors provide evidence that the custom 16-bit Posit-based format performs especially well, at times rivaling FP32 accuracy, when arithmetic is carried out in FP32 and values are stored in Posit16.
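For context, the sketch below decodes a 16-bit posit assuming one exponent bit (es = 1), the configuration proposed in the original posit paper for 16-bit values; the 2022 posit standard instead fixes es = 2, and the paper's modified posit differs in its own details. The variable-length regime field is what gives posits their tapered precision, concentrating mantissa bits on magnitudes near 1, a good match for LBM populations.

```cpp
// Minimal posit16 (es = 1) decoder; illustrative, not the paper's modified format.
#include <cmath>
#include <cstdint>
#include <cstdio>

float posit16_to_float(uint16_t p) {
    if (p == 0x0000u) return 0.0f;
    if (p == 0x8000u) return NAN;                  // NaR ("not a real")
    const bool neg = p & 0x8000u;
    const uint16_t v = neg ? uint16_t(0u - p) : p; // negatives: two's complement first
    // Regime: run of identical bits starting at bit 14.
    const int regime_bit = (v >> 14) & 1;
    int i = 14, run = 0;
    while (i >= 0 && ((v >> i) & 1) == regime_bit) { ++run; --i; }
    const int r = regime_bit ? run - 1 : -run;     // regime value
    --i;                                           // skip the regime terminator bit
    // One exponent bit (es = 1), if any bits remain.
    int e = 0;
    if (i >= 0) { e = (v >> i) & 1; --i; }
    // Remaining bits form the fraction.
    float frac = 0.0f;
    if (i >= 0) {
        const uint32_t fbits = v & ((1u << (i + 1)) - 1u);
        frac = float(fbits) / float(1u << (i + 1));
    }
    // value = (1 + frac) * 2^(2r + e), since useed = 2^(2^es) = 4 for es = 1.
    const float val = std::ldexp(1.0f + frac, 2 * r + e);
    return neg ? -val : val;
}

int main() {
    std::printf("0x4000 -> %g\n", posit16_to_float(0x4000)); // +1
    std::printf("0x3000 -> %g\n", posit16_to_float(0x3000)); // +0.5
}
```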

Practical and Theoretical Implications

From a practical standpoint, the reduced memory usage and improved computational speed enable more complex simulations, or larger domains, on existing hardware within the same computational budget. The paper underlines the importance of optimizing numerical precision in fluid dynamics simulations to exploit the full potential of modern computing architectures, especially GPUs, which reach higher peak throughput at lower precision.

Theoretically, the paper suggests that further customization of number formats, possibly adapted to other computational frameworks or domain-specific applications, could provide a generalized strategy for managing precision versus performance trade-offs.

Future Directions

The research opens several avenues for future exploration. The development of hardware supporting custom floating-point operations, such as those seen in the Posit family, could greatly enhance the feasibility of mixed precision strategies for complex computational tasks. Moreover, extending these analyses to other simulation methods and domains within computational physics could verify the broader applicability of these precision reduction strategies.

In summary, the paper provides a detailed and methodical examination of LBM simulations under varying precision constraints, delivering insights into the trade-offs between precision, accuracy, and computational efficiency. By demonstrating that reduced precision formats can be effective, it paves the way for more resource-efficient simulations, thereby broadening the practical scope and applications of LBM in scientific and engineering research.
