Skew-Symmetric Matrix Decompositions on Shared-Memory Architectures (2411.09859v2)
Abstract: The factorization of skew-symmetric matrices is a critically understudied area of dense linear algebra, particularly in comparison to that of general and symmetric matrices. While some algorithms can be adapted from the symmetric case, the cost of algorithms can be reduced by exploiting skew-symmetry. This work examines the factorization of a skew-symmetric matrix $X$ into its $LTL\mathrm{T}$ decomposition, where $L$ is unit lower triangular and $T$ is tridiagonal. This is also known as a triangular tridiagonalization. This operation is a means for computing the determinant of $X$ as the square of the (cheaply-computed) Pfaffian of the skew-symmetric tridiagonal matrix $T$ as well as for solving systems of equations, across fields such as quantum electronic structure and machine learning. Its application also often requires pivoting in order to improve numerical stability. We compare and contrast previously-published algorithms with those systematically derived using the FLAME methodology. Performant parallel CPU implementations are achieved by fusing operations at multiple levels in order to reduce memory traffic overhead. A key factor is the employment of new capabilities of the BLAS-like Library Instantiation Software (BLIS) framework, which now supports casting level-2 and level-3 BLAS-like operations by leveraging its gemm and other kernels, hierarchical parallelism, and cache blocking. A prototype, concise C++ API facilitates the translation of correct-by-construction algorithms into correct code. Experiments verify that the resulting implementations greatly exceed the performance of previous work.
Paper Prompts
Sign up for free to create and run prompts on this paper using GPT-5.
Top Community Prompts
Collections
Sign up for free to add this paper to one or more collections.