Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
124 tokens/sec
GPT-4o
8 tokens/sec
Gemini 2.5 Pro Pro
47 tokens/sec
o3 Pro
5 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Analysis of Randomized Householder-Cholesky QR Factorization with Multisketching (2309.05868v3)

Published 11 Sep 2023 in math.NA and cs.NA

Abstract: CholeskyQR2 and shifted CholeskyQR3 are two state-of-the-art algorithms for computing tall-and-skinny QR factorizations since they attain high performance on current computer architectures. However, to guarantee stability, for some applications, CholeskyQR2 faces a prohibitive restriction on the condition number of the underlying matrix to factorize. Shifted CholeskyQR3 is stable but has $50\%$ more computational and communication costs than CholeskyQR2. In this paper, a randomized QR algorithm called Randomized Householder-Cholesky (\texttt{rand_cholQR}) is proposed and analyzed. Using one or two random sketch matrices, it is proved that with high probability, its orthogonality error is bounded by a constant of the order of unit roundoff for any numerically full-rank matrix, and hence it is as stable as shifted CholeskyQR3. An evaluation of the performance of \texttt{rand_cholQR} on a NVIDIA A100 GPU demonstrates that for tall-and-skinny matrices, \texttt{rand_cholQR} with multiple sketch matrices is nearly as fast as, or in some cases faster than, CholeskyQR2. Hence, compared to CholeskyQR2, \texttt{rand_cholQR} is more stable with almost no extra computational or memory cost, and therefore a superior algorithm both in theory and practice.

Citations (6)

Summary

We haven't generated a summary for this paper yet.