Compressive Recovery of Sparse Precision Matrices (2311.04673v3)
Abstract: We consider the problem of learning a graph modeling the statistical relations of the $d$ variables from a dataset with $n$ samples $X \in \mathbb{R}^{n \times d}$. Standard approaches amount to searching for a precision matrix $\Theta$ representative of a Gaussian graphical model that adequately explains the data. However, most maximum-likelihood-based estimators require storing the $d^{2}$ entries of the empirical covariance matrix, which can become prohibitive in a high-dimensional setting. In this work, we adopt a compressive viewpoint and aim to estimate a sparse $\Theta$ from a \emph{sketch} of the data, i.e., a low-dimensional vector of size $m \ll d^{2}$ carefully designed from $X$ using non-linear random features. Under certain assumptions on the spectrum of $\Theta$ (or on its condition number), we show that it can be estimated from a sketch of size $m = \Omega\left((d+2k)\log(d)\right)$, where $k$ is the maximal number of edges of the underlying graph. These information-theoretic guarantees are inspired by compressed sensing theory and involve restricted isometry properties and instance-optimal decoders. We investigate the possibility of achieving practical recovery with an iterative algorithm based on the graphical lasso, viewed as a specific denoiser. We compare our approach with the graphical lasso on synthetic datasets, demonstrating its favorable performance even when the dataset is compressed.
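To make the sketching idea concrete, below is a minimal Python sketch of one plausible instantiation, not the paper's exact construction: the non-linear random features are taken to be quadratic rank-one measurements $z_j = \frac{1}{n}\sum_i (a_j^\top x_i)^2 = a_j^\top \hat{\Sigma} a_j$ of the empirical covariance $\hat{\Sigma}$, so the dataset is summarized by $m \ll d^{2}$ scalars, and recovery alternates a gradient step on the sketch-matching loss with a graphical-lasso step playing the role of the denoiser. The sketching operator, the hyperparameters, and the function names (`sketch_covariance`, `recover_precision`) are illustrative assumptions.

```python
import numpy as np
from sklearn.covariance import graphical_lasso

def sketch_covariance(X, m, rng):
    """Quadratic rank-one sketch: z_j = a_j^T Sigma_hat a_j, j = 1..m.

    Only the m sketch entries (plus the m x d matrix A) are kept;
    the d x d empirical covariance is never formed explicitly.
    """
    n, d = X.shape
    A = rng.standard_normal((m, d)) / np.sqrt(d)   # random rank-one directions
    return ((X @ A.T) ** 2).mean(axis=0), A        # (1/n) sum_i (a_j^T x_i)^2

def recover_precision(z, A, alpha=0.05, step=0.5, n_iter=50):
    """Plug-and-play-style decoder (illustrative): gradient step on the
    sketch-matching loss 1/(2m) ||A(Sigma) - z||^2, followed by a
    graphical-lasso 'denoising' step that enforces a sparse inverse."""
    m, d = A.shape
    Sigma, Theta = np.eye(d), np.eye(d)
    for _ in range(n_iter):
        r = np.sum((A @ Sigma) * A, axis=1) - z        # residuals a_j^T Sigma a_j - z_j
        Sigma = Sigma - step * (A.T * r) @ A / m       # grad = (1/m) sum_j r_j a_j a_j^T
        w, V = np.linalg.eigh((Sigma + Sigma.T) / 2)   # project back onto PSD matrices
        Sigma = (V * np.clip(w, 1e-6, None)) @ V.T
        Sigma, Theta = graphical_lasso(Sigma, alpha=alpha)  # sparse-precision denoiser
    return Theta

# Toy usage on data drawn from a banded (hence sparse) precision matrix.
rng = np.random.default_rng(0)
d, n, m = 50, 5000, 500                                # sketch size m << d^2 = 2500
Theta_true = (np.eye(d) + np.diag(0.3 * np.ones(d - 1), 1)
                        + np.diag(0.3 * np.ones(d - 1), -1))
X = rng.multivariate_normal(np.zeros(d), np.linalg.inv(Theta_true), size=n)
z, A = sketch_covariance(X, m, rng)
Theta_hat = recover_precision(z, A)
```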