
On the backward stability of s-step GMRES (2409.03079v3)

Published 4 Sep 2024 in math.NA and cs.NA

Abstract: Communication, i.e., data movement, is a critical bottleneck for the performance of classical Krylov subspace method solvers on modern computer architectures. Variants of these methods that avoid communication have been introduced; while equivalent to the classical methods in exact arithmetic, they can be unstable in finite precision. In this work, we address the backward stability of $s$-step GMRES, also known as communication-avoiding GMRES. Compared to the ``modular framework'' proposed in [A.~Buttari, N.~J.~Higham, T.~Mary, & B.~Vieubl\'e. Preprint in 2024.], we present an improved framework for simplifying the analysis of $s$-step GMRES, which includes standard GMRES ($s=1$) as a special case, by isolating the effects of rounding errors in the QR factorization and the solution of the least squares problem. The key advantage of this new framework is that it makes evident how the orthogonalization method affects the backward error, and that only the orthogonalization itself needs to be re-evaluated when the orthogonalization used in GMRES is changed. Using this framework, we analyze $s$-step GMRES with popular block orthogonalization methods: block modified Gram--Schmidt and reorthogonalized block classical Gram--Schmidt algorithms. An example illustrates the resulting instability of $s$-step GMRES when paired with the classical $s$-step Arnoldi process and shows the limitations of popular strategies for resolving this instability. To address this issue, we propose a modified $s$-step Arnoldi process that allows for much larger block size $s$ while maintaining satisfactory accuracy, as confirmed by our numerical experiments.
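
The sensitivity to the block size $s$ mentioned in the abstract can be illustrated with a small sketch. This is not the paper's example or its modified Arnoldi process; it only shows one commonly cited factor, namely that a monomial basis $v, Av, \dots, A^s v$ loses linear independence as $s$ grows. The diagonal test matrix, the starting vector, and the function names `monomial_basis` and `cosine` are assumptions made for illustration.

```python
import math

def monomial_basis(diag, v, s):
    """Return the normalized vectors v, Av, ..., A^s v for a diagonal matrix A.

    `diag` holds the diagonal of A (a hypothetical test matrix), so the
    matrix-vector product is an elementwise multiplication.
    """
    cols = []
    w = list(v)
    for _ in range(s + 1):
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]               # normalize the current vector
        cols.append(w)
        w = [d * x for d, x in zip(diag, w)]    # multiply by A for the next one
    return cols

def cosine(u, w):
    """Cosine of the angle between two unit vectors."""
    return sum(a * b for a, b in zip(u, w))

# Assumed test problem: eigenvalues 1, 2, ..., 20 and a vector of ones.
diag = [1.0 + i for i in range(20)]
v = [1.0] * 20
s = 12

basis = monomial_basis(diag, v, s)
# Cosines between consecutive basis vectors; values near 1 mean the
# vectors are nearly parallel, i.e. the basis is close to rank deficient.
cosines = [abs(cosine(basis[k], basis[k + 1])) for k in range(s)]

for k, c in enumerate(cosines):
    print(f"k = {k:2d}  |cos(angle)| = {c:.6f}")
```

For this test problem the consecutive vectors become nearly parallel as the power increases, which is why the block orthogonalization has less and less independent information to work with for large $s$.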

References (23)
  1. A Newton basis GMRES implementation. IMA J. Numer. Anal., 14(4):563–581, 1994. doi:10.1093/imanum/14.4.563.
  2. J. L. Barlow. Block modified Gram–Schmidt algorithms and their analysis. SIAM J. Matrix Anal. Appl., 40(4):1257–1290, 2019. doi:10.1137/18M1197400.
  3. B. Beckermann. On the numerical condition of polynomial bases: estimates for the condition number of Vandermonde, Krylov and Hankel matrices. PhD thesis, publisher not identified, 1997. URL: https://math.univ-lille1.fr/~bbecker/abstract/Habilitationsschrift_Beckermann.pdf.
  4. Å. Björck and C. C. Paige. Loss and recapture of orthogonality in the modified Gram–Schmidt algorithm. SIAM J. Matrix Anal. Appl., 13(1):176–190, 1992. doi:10.1137/0613015.
  5. A modular framework for the backward error analysis of GMRES. 2024. URL: https://hal.science/hal-04525918/file/preprint.pdf.
  6. Towards understanding CG and GMRES through examples. Linear Algebra Appl., 692:241–291, 2024. doi:10.1016/j.laa.2024.04.003.
  7. On the loss of orthogonality of low-synchronization variants of reorthogonalized block classical Gram–Schmidt, 2024. arXiv:2408.10109.
  8. Reorthogonalized Pythagorean variants of block classical Gram–Schmidt, 2024. arXiv:2405.01298.
  9. E. C. Carson. Communication-Avoiding Krylov Subspace Methods in Theory and Practice. PhD thesis, Department of Computer Science, University of California, Berkeley, 2015. URL: http://escholarship.org/uc/item/6r91c407.
  10. Block Gram-Schmidt algorithms and their stability properties. Linear Algebra Appl., 638(20):150–195, 2022. doi:10.1016/j.laa.2021.12.017.
  11. Reducing the effect of global communication in GMRES(m) and CG on parallel distributed memory computers. Appl. Numer. Math., 18(4):441–459, 1995. doi:10.1016/0168-9274(95)00079-A.
  12. Numerical stability of GMRES. BIT, 35(3):309–330, 1995. doi:10.1007/BF01732607.
  13. N. J. Higham. Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA, USA, 2nd edition, 2002.
  14. M. Hoemmen. Communication-Avoiding Krylov Subspace Methods. PhD thesis, Department of Computer Science, University of California at Berkeley, 2010. URL: http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-37.pdf.
  15. D. Imberti and J. Erhel. Varying the s in your s-step GMRES. Electron. T. Numer. Ana., 47:206–230, 2017. doi:10.1553/etna_vol47s206.
  16. W. Jalby and B. Philippe. Stability analysis and improvement of the block Gram–Schmidt algorithm. SIAM J. Sci. Stat. Comput., 12(5):1058–1073, 1991. doi:10.1137/0912056.
  17. Parallelizable restarted iterative methods for nonsymmetric linear systems. II: parallel implementation. Int. J. Comput. Math., 44(1-4):269–290, 1992. doi:10.1080/00207169208804108.
  18. Parallelizable restarted iterative methods for nonsymmetric linear systems. Part I: Theory. Int. J. Comput. Math., 44(1-4):243–267, 1992. doi:10.1080/00207169208804107.
  19. Modified Gram–Schmidt (MGS), least squares, and backward stability of MGS-GMRES. SIAM J. Matrix Anal. Appl., 28(1):264–284, 2006. doi:10.1137/050630416.
  20. Y. Saad and M. H. Schultz. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 7(3):856–869, 1986. doi:10.1137/0907058.
  21. H. F. Walker. Implementation of the GMRES method using Householder transformations. SIAM J. Sci. Stat. Comput., 9(1):152–163, 1988. doi:10.1137/0909010.
  22. Improving the performance of CA-GMRES on multicores with multiple GPUs. In 28th International Parallel and Distributed Processing Symposium, pages 382–391. IEEE, 2014. doi:10.1109/IPDPS.2014.48.
  23. Rounding error analysis of mixed precision block Householder QR algorithms. SIAM J. Sci. Comput., 43(3):A1723–A1753, 2021. doi:10.1137/19M1296367.
