The Adaptive $s$-step Conjugate Gradient Method (1701.03989v1)
Abstract: On modern large-scale parallel computers, the performance of Krylov subspace iterative methods is limited by global synchronization. This has inspired the development of $s$-step Krylov subspace method variants, in which iterations are computed in blocks of $s$, reducing the number of global synchronizations per iteration by a factor of $O(s)$. Although the $s$-step variants are mathematically equivalent to their classical counterparts, they can behave quite differently in finite precision depending on the parameter $s$. If $s$ is chosen too large, the $s$-step method can suffer a convergence delay and a decrease in attainable accuracy relative to the classical method. This makes such methods difficult for a potential user: the $s$ value that minimizes the time per iteration may not minimize the overall time to solution, and may even cause an unacceptable decrease in accuracy. Towards improving the reliability and usability of $s$-step Krylov subspace methods, in this work we derive the \emph{adaptive $s$-step CG method}, a variable $s$-step CG method in which, in block $k$, the parameter $s_k$ is determined automatically such that a user-specified accuracy is attainable. The method for determining $s_k$ is based on a bound on the growth of the residual gap within block $k$, from which we derive a constraint on the condition numbers of the computed $O(s_k)$-dimensional Krylov subspace bases. The computations required for determining the block size $s_k$ can be performed without increasing the number of global synchronizations per block. Our numerical experiments demonstrate that the adaptive $s$-step CG method can attain accuracy comparable to that of classical CG while still significantly reducing the total number of global synchronizations.
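The core mechanism the abstract describes, choosing $s_k$ so that the computed Krylov basis satisfies a condition-number constraint, can be illustrated with a small sketch. This is not the paper's algorithm: the paper derives the admissible condition number from a bound on the residual-gap growth, whereas here `cond_limit` is simply a user-supplied threshold (an assumption), and a plain monomial basis $[v, Av, A^2v, \dots]$ is used for simplicity.

```python
import numpy as np

def adaptive_basis_size(A, v, s_max, cond_limit):
    """Grow a (normalized) monomial Krylov basis [v, Av, ..., A^s v] and
    return the largest s <= s_max for which the basis condition number
    stays below cond_limit, mimicking how an adaptive s-step method
    would cap the block size. Illustrative sketch only."""
    V = v[:, None] / np.linalg.norm(v)  # start with the normalized seed vector
    s = 0
    while s < s_max:
        w = A @ V[:, -1]                              # next Krylov direction
        cand = np.hstack([V, w[:, None] / np.linalg.norm(w)])
        if np.linalg.cond(cand) > cond_limit:         # constraint violated:
            break                                     # stop growing the block
        V = cand
        s += 1
    return s, V

# Example: a well-conditioned SPD matrix admits the full block size s_max.
A = np.diag(np.arange(1.0, 11.0))
v = np.ones(10)
s, V = adaptive_basis_size(A, v, s_max=3, cond_limit=1e8)
```

In an actual $s$-step CG implementation, the basis would be built with a better-conditioned polynomial (e.g. Newton or Chebyshev), and the condition-number estimate would be obtained from the Gram matrix already computed for the block, so that no extra global synchronization is needed, as the abstract notes.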