Do algorithms and barriers for sparse principal component analysis extend to other structured settings? (2307.13535v2)
Abstract: We study a principal component analysis problem under the spiked Wishart model in which the structure in the signal is captured by a class of union-of-subspace models. This general class includes vanilla sparse PCA as well as its variants with graph sparsity. With the goal of studying these problems under a unified statistical and computational lens, we establish fundamental limits that depend on the geometry of the problem instance, and show that a natural projected power method exhibits local convergence to the statistically near-optimal neighborhood of the solution. We complement these results with end-to-end analyses of two important special cases given by path and tree sparsity in a general basis, showing initialization methods and matching evidence of computational hardness. Overall, our results indicate that several of the phenomena observed for vanilla sparse PCA extend in a natural fashion to its structured counterparts.
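To make the projected power method mentioned in the abstract concrete, here is a minimal sketch for the simplest special case only: vanilla k-sparsity in the standard basis, where projecting onto the union of subspaces reduces to keeping the k largest-magnitude coordinates and renormalizing. The function names, the synthetic spiked model parameters, and the perturbed-truth initialization are all illustrative assumptions, not the paper's algorithm or its initialization procedure. For path or tree sparsity, the projection step would be replaced by one suited to that structure.

```python
import numpy as np


def project_k_sparse(v, k):
    """Keep the k largest-magnitude entries of v and rescale to unit norm.

    Assumption: plain k-sparsity in the standard basis, used here as the
    simplest instance of a union-of-subspaces projection.
    """
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]  # indices of the k largest |v_j|
    out[idx] = v[idx]
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out


def projected_power_method(sigma_hat, v0, k, n_iters=50):
    """Alternate a power step (multiply by sigma_hat) with the projection."""
    v = project_k_sparse(v0, k)
    for _ in range(n_iters):
        v = project_k_sparse(sigma_hat @ v, k)
    return v


def sample_spiked_data(n, d, k, snr, rng):
    """Draw x_i = sqrt(snr) * g_i * v_star + z_i with a k-sparse unit spike.

    The resulting population covariance is I + snr * v_star v_star^T.
    All parameter choices here are hypothetical, for illustration only.
    """
    v_star = np.zeros(d)
    v_star[:k] = 1.0 / np.sqrt(k)
    g = rng.standard_normal(n)
    z = rng.standard_normal((n, d))
    x = np.sqrt(snr) * g[:, None] * v_star[None, :] + z
    return x, v_star


rng = np.random.default_rng(0)
n, d, k = 2000, 200, 10
x, v_star = sample_spiked_data(n=n, d=d, k=k, snr=4.0, rng=rng)
sigma_hat = x.T @ x / n  # sample covariance (data are mean zero by design)

# Local convergence needs a starting point near the truth. A perturbed copy
# of v_star stands in for a real initialization method (an assumption).
v0 = v_star + 0.5 * rng.standard_normal(d) / np.sqrt(d)

v_hat = projected_power_method(sigma_hat, v0, k=k)
alignment = abs(v_hat @ v_star)  # |<v_hat, v_star>|, equal to 1 for a perfect estimate
print(f"alignment with the planted spike: {alignment:.3f}")
```

The iteration only ever multiplies by the sample covariance and projects, so each step costs one matrix-vector product plus the cost of the projection; the structure of the signal enters solely through `project_k_sparse`.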