Using Mixed Precision in Low-Synchronization Reorthogonalized Block Classical Gram-Schmidt (2210.08839v1)
Abstract: Using lower precision in algorithms can reduce both computation and communication costs. Motivated by this, we aim to advance the state of the art in developing and analyzing mixed precision variants of iterative methods. In this work, we focus on the block variant of low-synchronization classical Gram-Schmidt with reorthogonalization, which we call BCGSI+LS. We demonstrate that the loss of orthogonality produced by this orthogonalization scheme can exceed $O(u)\kappa(\mathcal{X})$, where $u$ is the unit roundoff and $\kappa(\mathcal{X})$ is the condition number of the matrix to be orthogonalized, and thus we cannot in general expect the scheme to yield a backward stable block GMRES implementation. We then develop a mixed precision variant of this algorithm, called BCGSI+LS-MP, which uses higher precision in certain parts of the computation. We demonstrate experimentally that for a number of challenging test problems, our mixed precision variant successfully maintains a loss of orthogonality below $O(u)\kappa(\mathcal{X})$. This indicates that we can achieve a backward stable block GMRES algorithm that requires only one synchronization per iteration.
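To make the quantities in the abstract concrete, the following is a minimal NumPy sketch of how loss of orthogonality can be measured against the threshold $u\,\kappa(\mathcal{X})$. It is an illustration under stated assumptions, not the paper's method: it uses the standard reorthogonalized block classical Gram-Schmidt (two projection passes per block, with Householder QR from `np.linalg.qr` inside each block), not the low-synchronization BCGSI+LS or the mixed precision BCGSI+LS-MP. The function names, the test matrix construction, and the sizes are all hypothetical choices made for this example.

```python
import numpy as np

def make_test_matrix(m, n, cond, seed=0):
    """Random m-by-n matrix whose 2-norm condition number is about `cond`."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))  # orthonormal columns
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))  # orthogonal matrix
    sigma = np.logspace(0, -np.log10(cond), n)        # singular values 1 .. 1/cond
    return (U * sigma) @ V.T

def bcgs_reorth(X, s):
    """Block classical Gram-Schmidt with one reorthogonalization pass.

    Illustrative stand-in only: this is NOT the low-synchronization variant.
    Processes X in blocks of s columns and returns Q, R with X ~= Q @ R.
    """
    m, n = X.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(0, n, s):
        cols = slice(k, min(k + s, n))
        W = X[:, cols].copy()
        # First pass: project out previous blocks, then intrablock QR.
        S1 = Q[:, :k].T @ W
        W, R1 = np.linalg.qr(W - Q[:, :k] @ S1)
        # Second pass (reorthogonalization): repeat on the result.
        S2 = Q[:, :k].T @ W
        W, R2 = np.linalg.qr(W - Q[:, :k] @ S2)
        Q[:, cols] = W
        # Combine both passes: X_k = Q_prev (S1 + S2 R1) + W (R2 R1).
        R[:k, cols] = S1 + S2 @ R1
        R[cols, cols] = R2 @ R1
    return Q, R

def loss_of_orthogonality(Q):
    """2-norm of I - Q^T Q."""
    return np.linalg.norm(np.eye(Q.shape[1]) - Q.T @ Q, 2)

u = np.finfo(np.float64).eps / 2          # unit roundoff for double precision
X = make_test_matrix(200, 40, cond=1e8)
Q, R = bcgs_reorth(X, s=4)
print("loss of orthogonality:", loss_of_orthogonality(Q))
print("u * kappa(X):         ", u * np.linalg.cond(X))
```

Comparing the two printed values over a range of `cond` settings is one way to see whether a given scheme stays below the $O(u)\kappa(\mathcal{X})$ level; swapping in a different orthogonalization routine for `bcgs_reorth` would let the same harness be reused.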