PhotoNs-GPU: A GPU accelerated cosmological simulation code (2107.14008v1)
Abstract: We present a GPU-accelerated cosmological simulation code, PhotoNs-GPU, based on the Particle Mesh Fast Multipole Method (PM-FMM) algorithm, and focus on GPU utilization and optimization. A suitable interpolation method for the truncated gravity is introduced to speed up the evaluation of special functions in the kernels. We verify the GPU code in mixed precision and with different levels of the interpolation method. For current practical cosmological simulations, a run in single precision is roughly two times faster than one in double precision, but it could induce a small, unbiased noise in the power spectrum. Compared with the CPU version of PhotoNs and with Gadget-2, the efficiency of the new code is significantly improved. With all the optimizations of memory access, kernel functions, and concurrency management activated, the peak performance of our test runs reaches 48% of the theoretical speed, and the average performance approaches 35% on the GPU.
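
As an illustration of the idea of replacing special functions by interpolation, the following minimal sketch tabulates a short-range force truncation factor once and then evaluates it by linear interpolation. It assumes the Gaussian force split commonly used in TreePM codes such as Gadget-2, where the factor is erfc(u) + (2u/sqrt(pi)) exp(-u^2) with u = r/(2 r_s); the abstract does not state the exact truncation function or interpolation order used in PhotoNs-GPU. The function names, the cutoff `u_max`, and the table size `n` are hypothetical choices made for this example, and the code is plain Python rather than a GPU kernel.

```python
import math


def truncation_factor(u):
    """Exact short-range factor for an assumed Gaussian split, u = r / (2 r_s)."""
    return math.erfc(u) + (2.0 * u / math.sqrt(math.pi)) * math.exp(-u * u)


def build_table(u_max=3.0, n=256):
    """Tabulate the factor on n + 1 equally spaced points in [0, u_max]."""
    step = u_max / n
    values = [truncation_factor(i * step) for i in range(n + 1)]
    return step, values


def interpolated_factor(u, step, values):
    """Linear interpolation in the table; the force is truncated past the cutoff."""
    if u >= step * (len(values) - 1):
        return 0.0
    x = u / step
    i = int(x)      # index of the table node at or below u
    t = x - i       # fractional position between node i and node i + 1
    return (1.0 - t) * values[i] + t * values[i + 1]


if __name__ == "__main__":
    step, values = build_table()
    for u in (0.1, 0.5, 1.0, 2.0):
        exact = truncation_factor(u)
        approx = interpolated_factor(u, step, values)
        print(f"u={u:.1f}  exact={exact:.6e}  table={approx:.6e}")
```

The table trades a small, controllable interpolation error for the cost of evaluating `erfc` and `exp` in every pairwise interaction; a finer table or a higher-order scheme reduces the error at the price of more memory traffic.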