PittPack: An Open-Source Poisson's Equation Solver for Extreme-Scale Computing with Accelerators (1909.05423v1)

Published 12 Sep 2019 in physics.comp-ph and cs.MS

Abstract: We present a parallel implementation of a direct solver for Poisson's equation on extreme-scale supercomputers with accelerators. We introduce a chunked-pencil decomposition as the domain-decomposition strategy to distribute work among processing elements and achieve superior scalability on large numbers of accelerators. First, chunked-pencil decomposition enables overlapping nodal communication and data transfer between the central processing units (CPUs) and the graphics processing units (GPUs). Second, it improves data locality by keeping neighboring elements in adjacent memory locations. Third, it allows the use of shared memory for certain segments of the algorithm when possible. Last but not least, it enables contiguous message transfer among the nodes. Two different communication patterns are designed. The first pattern aims to fully overlap communication with data transfer and is designed to speed up overall turnaround time, whereas the second concentrates on low memory usage and is more network friendly for computations at extreme scale. To ensure software portability, we interleave OpenACC with MPI in the software. The numerical solution and its formal second order of accuracy are verified using the method of manufactured solutions for various combinations of boundary conditions. Weak scaling analysis is performed with up to 1.1 trillion Cartesian mesh points on 16384 GPUs of a petascale leadership-class supercomputer.
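The verification step mentioned in the abstract (method of manufactured solutions, formal second order of accuracy) can be illustrated with a minimal sketch. This is not PittPack's code or its solver: it assumes a 1D problem u'' = f on [0, 1] with homogeneous Dirichlet boundaries, a central-difference discretization, and a serial tridiagonal (Thomas) solve in place of the paper's parallel direct solver. The function names and the choice u(x) = sin(pi x) are hypothetical, picked only for illustration.

```python
import math

def solve_poisson_1d(n, f, ua=0.0, ub=0.0):
    """Solve u'' = f on [0, 1] with Dirichlet values ua, ub at n interior points."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Central difference: u[i-1] - 2*u[i] + u[i+1] = h^2 * f(x[i])
    rhs = [h * h * f(xi) for xi in x]
    rhs[0] -= ua
    rhs[-1] -= ub
    # Thomas algorithm for the tridiagonal system (sub = 1, diag = -2, super = 1)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = 1.0 / -2.0
    d[0] = rhs[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return x, u

def max_error(n):
    """Max-norm error against the manufactured solution u(x) = sin(pi x)."""
    exact = lambda x: math.sin(math.pi * x)
    # Forcing term obtained by differentiating the manufactured solution twice
    forcing = lambda x: -math.pi ** 2 * math.sin(math.pi * x)
    x, u = solve_poisson_1d(n, forcing)
    return max(abs(ui - exact(xi)) for xi, ui in zip(x, u))

def observed_order(n_coarse, n_fine):
    """Estimate the convergence order from errors on two grids."""
    h_coarse = 1.0 / (n_coarse + 1)
    h_fine = 1.0 / (n_fine + 1)
    return math.log(max_error(n_coarse) / max_error(n_fine)) / math.log(h_coarse / h_fine)

if __name__ == "__main__":
    # 31 and 63 interior points give h = 1/32 and h = 1/64
    print("observed order:", observed_order(31, 63))
```

Halving the mesh spacing should reduce the error by roughly a factor of four, so the printed order should sit close to 2. The same check extends to other boundary-condition combinations by choosing a manufactured solution that satisfies them.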

