Guaranteed Nonconvex Factorization Approach for Tensor Train Recovery (2401.02592v2)
Abstract: In this paper, we provide the first convergence guarantee for the factorization approach to tensor train (TT) recovery. Specifically, to avoid scaling ambiguity and to facilitate theoretical analysis, we optimize over the so-called left-orthogonal TT format, which enforces orthonormality among most of the factors. To maintain this orthonormal structure, we use Riemannian gradient descent (RGD) to optimize those factors over the Stiefel manifold. We first study the TT factorization problem and establish the local linear convergence of RGD; notably, the convergence rate degrades only linearly as the tensor order increases. We then study the sensing problem, which aims to recover a TT-format tensor from linear measurements. Assuming the sensing operator satisfies the restricted isometry property (RIP), we show that with a proper initialization, which can be obtained through spectral initialization, RGD converges to the ground-truth tensor at a linear rate. Furthermore, we extend our analysis to measurements corrupted by Gaussian noise, proving that RGD still reliably recovers the ground truth at a linear rate, with the recovery error growing only polynomially in the tensor order. We conduct various experiments to validate our theoretical findings.
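To make the core optimization primitive concrete, the following is a minimal sketch of Riemannian gradient descent over the Stiefel manifold — project the Euclidean gradient onto the tangent space at the current iterate, take a step, and retract back onto the manifold via thin QR. This is not the paper's TT objective; as an assumed toy objective it maximizes trace(X^T A X) (a dominant invariant subspace problem), and the function names `stiefel_project`, `qr_retract`, and `rgd_stiefel` are hypothetical, chosen for illustration.

```python
import numpy as np

def stiefel_project(X, G):
    # Project the Euclidean gradient G onto the tangent space of the
    # Stiefel manifold at X:  P_X(G) = G - X * sym(X^T G).
    XtG = X.T @ G
    return G - X @ ((XtG + XtG.T) / 2)

def qr_retract(Y):
    # Retract back onto the Stiefel manifold via thin QR, fixing the
    # column signs (diag(R) > 0) so the retraction is well defined.
    Q, R = np.linalg.qr(Y)
    return Q * np.sign(np.sign(np.diag(R)) + 0.5)

def rgd_stiefel(A, p, step=0.05, iters=500, seed=0):
    # Toy objective (not the paper's TT loss): maximize trace(X^T A X)
    # over n x p matrices with orthonormal columns, for symmetric A.
    rng = np.random.default_rng(seed)
    X = qr_retract(rng.standard_normal((A.shape[0], p)))
    for _ in range(iters):
        G = -2 * A @ X  # Euclidean gradient of f(X) = -trace(X^T A X)
        X = qr_retract(X - step * stiefel_project(X, G))
    return X
```

The same project-then-retract loop is what keeps the left-orthogonal TT factors orthonormal throughout the iterations; only the objective and the gradient computation change.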