Approximation of boundary element matrices using GPGPUs and nested cross approximation

Published 25 Oct 2015 in cs.MS | (1510.07244v2)

Abstract: The efficiency of boundary element methods depends crucially on the time required for setting up the stiffness matrix. The far-field part of the matrix can be approximated by compression schemes like the fast multipole method or $\mathcal{H}$-matrix techniques. The near-field part is typically approximated by special quadrature rules like the Sauter-Schwab technique that can handle the singular integrals appearing in the diagonal and near-diagonal matrix elements. Since computing one element of the matrix requires only a small amount of data but a fairly large number of operations, we propose to use general-purpose graphics processing units (GPGPUs) to handle vectorizable portions of the computation: near-field computations are ideally suited for vectorization and can therefore be handled very well by GPGPUs. Modern far-field compression schemes can be split into a small adaptive portion that exhibits divergent control flow and should therefore be handled by the CPU, and a vectorizable portion that can again be sent to GPGPUs. We propose a hybrid algorithm that splits the computation into tasks for CPUs and GPGPUs. The method presented in this article reduces the setup time of boundary integral operators by a factor of 19 to 30 for both the Laplace and the Helmholtz equation in 3D when using two consumer GPGPUs instead of a quad-core CPU.
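
The split between an adaptive portion and a vectorizable portion can be illustrated with a small sketch. The code below is not the nested cross approximation of the article: it is plain adaptive cross approximation with partial pivoting, applied to the 3D Laplace kernel $1/(4\pi\|x-y\|)$ evaluated at point pairs instead of Galerkin integrals. All function names (`eval_batch`, `cross_approximation`, `two_clusters`, `max_relative_error`) are hypothetical and chosen for illustration. The pivot search is the branching part that would stay on the CPU, while `eval_batch` marks the uniform batch of kernel evaluations that a hybrid code could hand to a GPGPU; here it simply runs on the CPU.

```python
import math
import random


def laplace_kernel(x, y):
    """3D Laplace kernel 1 / (4 pi |x - y|) for two distinct points."""
    return 1.0 / (4.0 * math.pi * math.dist(x, y))


def eval_batch(pairs):
    """Uniform batch: every entry runs the same instructions.
    Assumption: in a hybrid code this is the part sent to a GPGPU."""
    return [laplace_kernel(x, y) for x, y in pairs]


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def cross_approximation(rows, cols, tol=1e-6, max_rank=20):
    """Adaptive cross approximation with partial pivoting.
    Returns lists us, vs such that A is approximated by sum_k us[k] vs[k]^T."""
    us, vs = [], []
    used_rows = set()
    i, first = 0, None
    for _ in range(max_rank):
        used_rows.add(i)
        # Residual row i: one batched kernel call, then subtract known terms.
        row = eval_batch([(rows[i], y) for y in cols])
        for u, v in zip(us, vs):
            row = [r - u[i] * vj for r, vj in zip(row, v)]
        # Adaptive step (CPU side): choose the column pivot.
        j = max(range(len(cols)), key=lambda k: abs(row[k]))
        pivot = row[j]
        if abs(pivot) < 1e-14:
            break
        # Residual column j, again via one batched kernel call.
        col = eval_batch([(x, cols[j]) for x in rows])
        for u, v in zip(us, vs):
            col = [c - uk * v[j] for c, uk in zip(col, u)]
        u_new = [c / pivot for c in col]
        us.append(u_new)
        vs.append(row)
        # Stop once the new rank-one term is small relative to the first one.
        size = norm(u_new) * norm(row)
        if first is None:
            first = size
        elif size <= tol * first:
            break
        # Adaptive step (CPU side): choose the next row pivot.
        candidates = [k for k in range(len(rows)) if k not in used_rows]
        if not candidates:
            break
        i = max(candidates, key=lambda k: abs(u_new[k]))
    return us, vs


def two_clusters(n, shift, seed=0):
    """Two point clouds in unit cubes, separated by `shift` along the x-axis."""
    rng = random.Random(seed)
    rows = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    cols = [(shift + rng.random(), rng.random(), rng.random()) for _ in range(n)]
    return rows, cols


def max_relative_error(rows, cols, us, vs):
    """Largest entrywise error of the low-rank factors, relative to max |A_ij|."""
    worst, scale = 0.0, 0.0
    for i, x in enumerate(rows):
        exact_row = eval_batch([(x, y) for y in cols])
        for j, exact in enumerate(exact_row):
            approx = sum(u[i] * v[j] for u, v in zip(us, vs))
            worst = max(worst, abs(exact - approx))
            scale = max(scale, abs(exact))
    return worst / scale
```

Calling `two_clusters(40, 10.0)` followed by `cross_approximation(rows, cols)` yields a factorization whose rank is well below the block size for these well-separated clusters, and `max_relative_error` compares it against the full block. Each iteration needs only one row and one column of the block, which is what makes the bulk of the work a good candidate for batched evaluation.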