Fast randomized numerical rank estimation for numerically low-rank matrices (2105.07388v2)

Published 16 May 2021 in math.NA and cs.NA

Abstract: Matrices with low-rank structure are ubiquitous in scientific computing. Choosing an appropriate rank is a key step in many computational algorithms that exploit low-rank structure. However, estimating the rank has been done largely in an ad hoc fashion in large-scale settings. In this work we develop a randomized algorithm for estimating the numerical rank of a (numerically low-rank) matrix. The algorithm is based on sketching the matrix with random matrices from both left and right; the key fact is that with high probability, the sketches preserve the orders of magnitude of the leading singular values. We prove a result on the accuracy of the sketched singular values and show that gaps in the spectrum are detected. For an $m\times n$ $(m\geq n)$ matrix of numerical rank $r$, the algorithm runs with complexity $O(mn\log n + r^3)$, or less for structured matrices. The steps in the algorithm are required as a part of many low-rank algorithms, so the additional work required to estimate the rank can be even smaller in practice. Numerical experiments illustrate the speed and robustness of our rank estimator.
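The core idea described in the abstract, sketching the matrix from both left and right and reading the numerical rank off the sketched singular values, can be illustrated with a minimal NumPy sketch. This is a simplified illustration only, not the paper's algorithm: it uses plain Gaussian sketch matrices (the paper employs faster structured sketches to reach $O(mn\log n + r^3)$ cost), and the sketch size `k` and tolerance `tol` are assumptions chosen for the example.

```python
import numpy as np

def estimate_rank(A, k=50, tol=1e-10, seed=None):
    """Estimate the numerical rank of A via a two-sided random sketch.

    Simplified illustration: sketch A from both sides with Gaussian
    matrices, then count sketched singular values that exceed
    tol * (largest sketched singular value). The sketch size k must
    exceed the numerical rank for the estimate to be meaningful.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k)) / np.sqrt(k)  # right sketch
    Psi = rng.standard_normal((k, m)) / np.sqrt(k)    # left sketch
    Y = Psi @ (A @ Omega)                             # k-by-k two-sided sketch
    s = np.linalg.svd(Y, compute_uv=False)            # sketched singular values
    return int(np.count_nonzero(s > tol * s[0]))

# Example: a 200 x 150 matrix of exact rank 7 is detected correctly.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 7)) @ rng.standard_normal((7, 150))
print(estimate_rank(A, k=50, seed=1))
```

With high probability a Gaussian sketch of size `k > r` preserves the rank of a rank-`r` matrix, so the trailing sketched singular values collapse to roundoff level and the threshold count recovers `r`; the paper's contribution is proving that the leading sketched singular values also retain their orders of magnitude, so spectral gaps remain detectable.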

