A multiscale cavity method for sublinear-rank symmetric matrix factorization (2403.07189v1)
Abstract: We consider a statistical model for symmetric matrix factorization with additive Gaussian noise in the high-dimensional regime where the rank $M$ of the signal matrix to infer scales with its size $N$ as $M = o(N^{1/10})$. Allowing for an $N$-dependent rank poses new challenges and requires new methods. Working in the Bayesian-optimal setting, we show that whenever the signal has i.i.d. entries, the limiting mutual information between signal and data is given by a variational formula involving a rank-one replica symmetric potential. In other words, from the information-theoretic perspective, the case of a (slowly) growing rank is the same as when $M = 1$ (namely, the standard spiked Wigner model). The proof is primarily based on a novel multiscale cavity method allowing for growing rank, along with some information-theoretic identities on worst noise for the Gaussian vector channel. We believe that the cavity method developed here will play a role in the analysis of a broader class of inference and spin models where the degrees of freedom are large arrays instead of vectors.
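To make the observation model concrete, here is a minimal sketch of how data from a symmetric matrix factorization problem with additive Gaussian noise might be simulated. The abstract does not state the normalization, so everything specific here is an assumption: the scaling $Y = \sqrt{\lambda/N}\, X X^\top + Z$, the standard Gaussian prior on the entries of $X$, the GOE-type noise variances, and the hypothetical function name `sample_spiked_matrix` with its parameters `n`, `m`, `snr`, and `seed`.

```python
import math
import random


def sample_spiked_matrix(n, m, snr, seed=0):
    """Draw (X, Y) with Y = sqrt(snr / n) * X X^T + Z (assumed scaling).

    X is n-by-m with i.i.d. standard Gaussian entries (illustrative prior).
    Z is a symmetric Gaussian matrix with unit-variance off-diagonal entries
    and variance-2 diagonal entries (an assumed GOE-type normalization).
    """
    rng = random.Random(seed)
    # Signal factor: n rows, m columns, i.i.d. entries.
    x = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
    scale = math.sqrt(snr / n)
    y = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Entry (i, j) of the rank-m signal X X^T.
            signal = sum(x[i][k] * x[j][k] for k in range(m))
            noise_std = math.sqrt(2.0) if i == j else 1.0
            value = scale * signal + rng.gauss(0.0, noise_std)
            # Fill both triangles so that Y is exactly symmetric.
            y[i][j] = value
            y[j][i] = value
    return x, y
```

Setting `m = 1` gives a sample from a spiked Wigner model under the same assumed normalization, which is the case the abstract identifies as information-theoretically equivalent to a slowly growing rank.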