Parallel tree algorithms for AMR and non-standard data access (1803.08432v3)
Abstract: We introduce several parallel algorithms operating on a distributed forest of adaptive quadtrees/octrees. They are targeted at large-scale applications relying on data layouts that are more complex than required for standard finite elements, such as hp-adaptive Galerkin methods, particle tracking and semi-Lagrangian schemes, and in-situ post-processing and visualization. Specifically, we design algorithms to derive an adapted worker forest based on sparse data, to identify owner processes in a top-down search of remote objects, and to allow for variable process counts and per-element data sizes in partitioning and parallel file I/O. We demonstrate the algorithms' usability and performance in the context of a particle tracking example that we scale to 21e9 particles and 64Ki MPI processes on the Juqueen supercomputer, and we describe the construction of a parallel assembly of variably sized spheres in space creating up to 768e9 elements on the Juwels supercomputer.
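To make the idea of identifying owner processes concrete, here is a minimal sketch. It is not the paper's top-down search. It assumes, as an illustration only, that elements are ordered along a Morton (Z-order) curve and that each process owns one contiguous range of that curve, which the abstract does not state. The names `morton2d`, `find_owner`, and `first_index` are hypothetical.

```python
from bisect import bisect_right


def morton2d(x, y, bits=16):
    """Interleave the bits of x and y (x in the even positions) into a Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def find_owner(first_index, code):
    """Return the rank p with first_index[p] <= code < first_index[p + 1].

    first_index is a hypothetical, non-decreasing list holding the first curve
    index owned by each rank, with first_index[0] == 0. An empty rank repeats
    the marker of its successor, so bisect_right skips over it.
    """
    return bisect_right(first_index, code) - 1


# A 4 x 4 grid of cells split evenly among four ranks.
markers = [0, 4, 8, 12]
owner_of_corner = find_owner(markers, morton2d(3, 3))

# Same grid, but rank 1 owns nothing and rank 2 owns indices 4..11.
markers_with_empty_rank = [0, 4, 4, 12]
owner_skipping_empty = find_owner(markers_with_empty_rank, morton2d(2, 0))
```

Under this assumption the lookup costs one binary search over a list of per-process markers and needs no communication. Searching for many remote objects at once, as the abstract describes, would replace the per-point lookup with a recursive descent through the tree.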