Algorithm xxx: Faster Randomized SVD with Dynamic Shifts (2404.09276v1)
Abstract: To provide a faster and more convenient truncated SVD algorithm for large sparse matrices from real applications (i.e., for computing a few of the largest singular values and the corresponding singular vectors), a dynamically shifted power iteration technique is applied to improve the accuracy of the randomized SVD method. The result is a dynamic-shifts-based randomized SVD (dashSVD) algorithm, which also incorporates techniques for handling sparse matrices efficiently. An accuracy-control mechanism is included in dashSVD to approximately monitor the per-vector error bound of the computed singular vectors with negligible overhead. Experiments on real-world data show that dashSVD substantially improves the accuracy of the randomized SVD algorithm, or attains the same accuracy with fewer passes over the matrix, and provides an efficient accuracy-control mechanism for randomized SVD computation, while also demonstrating advantages in runtime and parallel efficiency. A bound on the approximation error of the randomized SVD with shifted power iteration is also proved.
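The idea of a randomized SVD with a shifted power iteration can be illustrated with a minimal NumPy sketch. This is not the dashSVD implementation: the function name `shifted_randomized_svd`, its parameters, and the shift-update rule (move the shift halfway toward the smallest singular value of the shifted iterate whenever that value exceeds the current shift) are assumptions made for illustration, and the accuracy-control mechanism and sparse-matrix optimizations mentioned in the abstract are omitted.

```python
import numpy as np


def shifted_randomized_svd(A, k, oversample=10, n_power=4, seed=0):
    """Illustrative rank-k randomized SVD with a shifted power iteration.

    Hypothetical helper for exposition only; the shift-update rule below
    is an assumed, simplified choice.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, n)  # sketch width (rank plus oversampling)

    # Initial orthonormal basis for an approximate row space of A.
    Q, _ = np.linalg.qr(A.T @ rng.standard_normal((m, l)))

    alpha = 0.0  # current shift, starts at zero (plain power iteration)
    for _ in range(n_power):
        # One power step on the shifted Gram matrix: (A^T A - alpha I) Q.
        Y = A.T @ (A @ Q) - alpha * Q
        # Orthonormalize via SVD; s holds the singular values of Y.
        Q, s, _ = np.linalg.svd(Y, full_matrices=False)
        # Assumed update: raise the shift only when the smallest singular
        # value of the shifted iterate exceeds the current shift.
        if s[-1] > alpha:
            alpha = 0.5 * (alpha + s[-1])

    # Project A onto the basis and take a small dense SVD.
    B = A @ Q  # shape (m, l)
    U, sigma, Vt = np.linalg.svd(B, full_matrices=False)
    V = Q @ Vt.T
    return U[:, :k], sigma[:k], V[:, :k]
```

For a dense test matrix with a decaying spectrum, calling `shifted_randomized_svd(A, k=5)` returns factors `U`, `sigma`, `V` with `A ≈ U @ np.diag(sigma) @ V.T` in the rank-5 sense. The shift starts at zero so the first step is an ordinary power iteration, and it only ever increases by averaging with an observed singular value, which keeps it conservative.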