Low coordinate degree algorithms I: Universality of computational thresholds for hypothesis testing (2403.07862v1)
Abstract: We study when low coordinate degree functions (LCDF) -- linear combinations of functions depending on small subsets of entries of a vector -- can hypothesis test between high-dimensional probability measures. These functions are a generalization, proposed in Hopkins' 2018 thesis but seldom studied since, of low degree polynomials (LDP), a class widely used in recent literature as a proxy for all efficient algorithms for tasks in statistics and optimization. Instead of the orthogonal polynomial decompositions used in LDP calculations, our analysis of LCDF is based on the Efron-Stein or ANOVA decomposition, making it much more broadly applicable. By way of illustration, we prove channel universality for the success of LCDF in testing for the presence of sufficiently "dilute" random signals through noisy channels: the efficacy of LCDF depends on the channel only through the scalar Fisher information for a class of channels including nearly arbitrary additive i.i.d. noise and nearly arbitrary exponential families. As applications, we extend lower bounds against LDP for spiked matrix and tensor models under additive Gaussian noise to lower bounds against LCDF under general noisy channels. We also give a simple and unified treatment of the effect of censoring models by erasing observations at random and of quantizing models by taking the sign of the observations. These results are the first computational lower bounds against any large class of algorithms for all of these models when the channel is not one of a few special cases, and thereby give the first substantial evidence for the universality of several statistical-to-computational gaps.
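To make the notion of coordinate degree concrete, here is a minimal sketch that computes the Efron-Stein (ANOVA) decomposition by brute force on a small finite product space. It is an illustration under stated assumptions, not the paper's method: the uniform product measure, the grid `sizes`, and the helper names `efron_stein` and `project_low_coordinate_degree` are all hypothetical choices made for this example. Each component is f_S = sum over T ⊆ S of (-1)^(|S|-|T|) E[f | x_T], and a function has coordinate degree at most D when its components with |S| > D vanish.

```python
import itertools
import math


def efron_stein(f, sizes):
    """Efron-Stein components of f on a finite grid.

    Assumption: x[i] ranges over range(sizes[i]) and the coordinates are
    independent and uniform. Returns {S: {x: f_S(x)}} for every subset S.
    """
    n = len(sizes)
    points = list(itertools.product(*(range(s) for s in sizes)))

    def cond_exp(T, x):
        # E[f(y) | y_T = x_T]: average f over the coordinates outside T.
        free = [i for i in range(n) if i not in T]
        total, count = 0.0, 0
        for vals in itertools.product(*(range(sizes[i]) for i in free)):
            y = list(x)
            for i, v in zip(free, vals):
                y[i] = v
            total += f(tuple(y))
            count += 1
        return total / count

    comps = {}
    for r in range(n + 1):
        for S in itertools.combinations(range(n), r):
            # Inclusion-exclusion over the subsets T of S.
            comps[S] = {
                x: sum(
                    (-1) ** (len(S) - len(T)) * cond_exp(T, x)
                    for k in range(len(S) + 1)
                    for T in itertools.combinations(S, k)
                )
                for x in points
            }
    return comps


def project_low_coordinate_degree(f, sizes, D):
    """Sum of the Efron-Stein components f_S with |S| <= D."""
    comps = efron_stein(f, sizes)
    points = next(iter(comps.values())).keys()
    return {x: sum(c[x] for S, c in comps.items() if len(S) <= D) for x in points}
```

For instance, on the grid `sizes = (3, 2, 4)`, the function `x[0]**5 + math.sin(x[1])` is unchanged by `project_low_coordinate_degree(..., D=1)`: it has coordinate degree 1 even though it is not a low degree polynomial. The product `x[0] * x[1] * x[2]` is changed by the projection with `D=2`.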
- Emmanuel Abbe. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research, 18(1):6446–6531, 2017.
- Algorithmic barriers from phase transitions. In 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2008), pages 793–802. IEEE, 2008.
- An introduction to random matrices. Cambridge University Press, 2010.
- The committee machine: Computational to statistical gaps in learning a two-layers neural network. Advances in Neural Information Processing Systems, 31, 2018.
- The Franz-Parisi criterion and computational trade-offs in high dimensional statistics. arXiv preprint arXiv:2205.09727, 2022.
- The landscape of the spiked tensor model. Communications on Pure and Applied Mathematics, 72(11):2282–2330, 2019.
- Optimal average-case reductions to sparse PCA: From weak assumptions to strong hardness. In 32nd Annual Conference on Learning Theory (COLT 2019), pages 469–470. PMLR, 2019.
- Reducibility and statistical-computational gaps from secret leakage. In 33rd Annual Conference on Learning Theory (COLT 2020), pages 648–847. PMLR, 2020.
- Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005.
- Reducibility and computational lower bounds for problems with planted sparse structure. In 31st Annual Conference On Learning Theory (COLT 2018), pages 48–166. PMLR, 2018.
- Universality of computational lower bounds for submatrix detection. In Conference on Learning Theory, pages 417–468. PMLR, 2019.
- Statistical query algorithms and low-degree tests are almost equivalent. arXiv preprint arXiv:2009.06107, 2020.
- Spectral planting and the hardness of refuting cuts, colorability, and communities in random graphs. In 34th Annual Conference on Learning Theory (COLT 2021), pages 410–473. PMLR, 2021.
- Algorithmic thresholds for tensor PCA. Annals of Probability, 48(4):2052–2087, 2020.
- The algorithmic phase transition of random $k$-SAT for low degree polynomials. In 2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS), pages 298–309. IEEE, 2022.
- A nearly tight sum-of-squares lower bound for the planted clique problem. SIAM Journal on Computing, 48(2):687–735, 2019.
- Computational hardness of certifying bounds on constrained PCA problems. In 11th Innovations in Theoretical Computer Science Conference (ITCS 2020), volume 151, pages 78:1–78:29, 2020.
- Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization. IEEE Transactions on Information Theory, 64(7):4872–4894, 2018.
- Complexity theoretic lower bounds for sparse principal component detection. In 26th Annual Conference on Learning Theory (COLT 2013), pages 1046–1066, 2013.
- The largest eigenvalues of finite rank deformation of large Wigner matrices: convergence and nonuniversality of the fluctuations. The Annals of Probability, 37(1):1–47, 2009.
- Inference in particle tracking experiments by passing messages between images. Proceedings of the National Academy of Sciences, 107(17):7663–7668, 2010.
- Almost-linear planted cliques elude the Metropolis process. In Proceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4504–4539. SIAM, 2023.
- Statistical and computational phase transitions in group testing. In Conference on Learning Theory, pages 4764–4781. PMLR, 2022.
- Ivan Corwin. The Kardar-Parisi-Zhang equation and universality class. Random matrices: Theory and applications, 1(01):1130001, 2012.
- A note on truncated polynomials. Applied Mathematics and Computation, 134(2-3):595–605, 2003.
- On counting independent sets in sparse graphs. SIAM Journal on Computing, 31(5):1527–1541, 2002.
- Non-gaussian component analysis via lattice basis reduction. In Conference on Learning Theory, pages 4535–4547. PMLR, 2022.
- Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6):066106, 2011.
- Inference and phase transitions in the detection of modules in sparse networks. Physical Review Letters, 107(6):065701, 2011.
- Subexponential-time algorithms for sparse PCA. Foundations of Computational Mathematics, pages 1–50, 2023.
- Universality of approximate message passing with semirandom matrices. The Annals of Probability, 51(5):1616–1683, 2023.
- Monroe David Donsker. An invariance principle for certain probability limit theorems. 1951.
- Contextual stochastic block models. Advances in Neural Information Processing Systems, 31, 2018.
- Universality in numerical computation with random data: Case studies and analytical results. Journal of Mathematical Physics, 60(10), 2019.
- The planted matching problem: Sharp threshold and infinite-order phase transition. arXiv preprint arXiv:2103.09383, 2021.
- Fundamental limits of detection in the spiked Wigner model. Annals of Statistics, 48(2):863–885, 2020.
- Universality of random matrices and local relaxation flow. Inventiones mathematicae, 185(1):75–119, 2011.
- The eigenvalues of random symmetric matrices. Combinatorica, 1(3):233–241, 1981.
- The largest eigenvalue of rank one deformation of large Wigner matrices. Communications in Mathematical Physics, 272(1):185–228, 2007.
- Low-degree hardness of random optimization problems. In 61st Annual Symposium on Foundations of Computer Science (FOCS 2020), pages 131–140, 2020.
- Finding planted cliques using Markov chain Monte Carlo. arXiv preprint arXiv:2311.07540, 2023.
- Spectral phase transitions in non-linear Wigner spiked models. arXiv preprint arXiv:2310.14055, 2023.
- On the optimization landscape of tensor decompositions. In Advances in Neural Information Processing Systems, pages 3653–3663, 2017.
- Limits of local algorithms over sparse random graphs. In 5th Conference on Innovations in Theoretical Computer Science (ITCS 2014), pages 369–376. ACM, 2014.
- The landscape of the planted clique problem: Dense subgraphs and the overlap gap property. arXiv preprint arXiv:1904.07174, 2019.
- The power of sum-of-squares for detecting hidden structures. In 58th Annual Symposium on Foundations of Computer Science (FOCS 2017), pages 720–731, 2017.
- Community detection in the labelled stochastic block model. arXiv preprint arXiv:1209.2910, 2012.
- Low degree hardness for broadcasting on trees. arXiv preprint arXiv:2402.13359, 2024.
- Samuel B Hopkins. Statistical inference and the sum of squares method. PhD thesis, Cornell University, 2018.
- Efficient Bayesian estimation from few samples: community detection and related problems. In 58th Annual Symposium on Foundations of Computer Science (FOCS 2017), pages 379–390. IEEE, 2017.
- Counterexamples to the low-degree conjecture. In 12th Innovations in Theoretical Computer Science Conference (ITCS 2021), 2021.
- The set of solutions of random XORSAT formulae. In 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2012), pages 760–779. SIAM, 2012.
- Mark Jerrum. Large cliques elude the Metropolis process. Random Structures & Algorithms, 3(4):347–359, 1992.
- Statistical thresholds for tensor PCA. Annals of Applied Probability, 30(4):1910–1933, 2020.
- Iain M Johnstone. On the distribution of the largest eigenvalue in principal components analysis. Annals of Statistics, pages 295–327, 2001.
- Computational-statistical gap in reinforcement learning. In Conference on Learning Theory, pages 1282–1302. PMLR, 2022.
- Reconstruction on trees and low-degree polynomials. arXiv preprint arXiv:2109.06915, 2021.
- Stochastic blockmodels and community structure in networks. Physical Review E, 83(1):016107, 2011.
- Dmitriy Kunisky. Hypothesis testing with low-degree polynomials in the Morris class of exponential families. In 34th Annual Conference on Learning Theory (COLT 2021), pages 2822–2848. PMLR, 2021.
- Dmitriy Kunisky. Spectral Barriers in Certification Problems. PhD thesis, New York University, 2021.
- Notes on computational hardness of hypothesis testing: Predictions using the low-degree likelihood ratio. In Paula Cerejeiras and Michael Reissig, editors, Mathematical Analysis, its Applications and Computation, pages 1–50, Cham, 2022. Springer International Publishing.
- Mutual information in rank-one matrix estimation. In 2016 IEEE Information Theory Workshop (ITW), pages 71–75. IEEE, 2016.
- MMSE of probabilistic low-rank matrix estimation: Universality with respect to the output channel. In 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton 2015), pages 680–687. IEEE, 2015.
- Phase transitions in sparse PCA. In IEEE International Symposium on Information Theory (ISIT 2015), pages 1635–1639. IEEE, 2015.
- Statistical and computational phase transitions in spiked tensor estimation. In IEEE International Symposium on Information Theory (ISIT 2017), pages 511–515. IEEE, 2017.
- Reconstruction in the labelled stochastic block model. IEEE Transactions on Network Science and Engineering, 2(4):152–163, 2015.
- High dimensional model representations. The Journal of Physical Chemistry A, 105(33):7765–7777, 2001.
- The planted matching problem: phase transitions and exact results. arXiv preprint arXiv:1912.08880, 2019.
- Cristopher Moore. The computer science and physics of community detection: landscapes, phase transitions, and hardness. arXiv preprint arXiv:1702.00467, 2017.
- Sum-of-squares lower bounds for planted clique. In 47th Annual ACM Symposium on Theory of Computing (STOC 2015), pages 87–96. ACM, 2015.
- Adapting to unknown noise distribution in matrix denoising. arXiv preprint arXiv:1810.02954, 2018.
- On the limitation of spectral methods: From the gaussian hidden clique problem to rank-one perturbations of gaussian tensors. In Advances in Neural Information Processing Systems, pages 217–225, 2015.
- Computational barriers in minimax submatrix detection. 2015.
- Equivalence of approximate message passing and low-degree polynomials in rank-one matrix estimation. arXiv preprint arXiv:2212.06996, 2022.
- Precise error rates for computationally efficient testing. arXiv preprint arXiv:2311.00289, 2023.
- Ryan O’Donnell. Analysis of boolean functions. Cambridge University Press, 2014.
- Edwin JG Pitman. Some basic theory for statistical inference. Chapman and Hall, 1979.
- Optimality and sub-optimality of PCA I: Spiked random matrix models. Annals of Statistics, 46(5):2416–2451, 2018.
- General foundations of high-dimensional model representations. Journal of Mathematical Chemistry, 25(2-3):197–233, 1999.
- A statistical model for tensor PCA. In Advances in Neural Information Processing Systems, pages 2897–2905, 2014.
- High-dimensional estimation via sum-of-squares proofs. arXiv preprint arXiv:1807.11419, 2018.
- Is it easier to count communities than find them? In 14th Innovations in Theoretical Computer Science Conference (ITCS 2023), 2023.
- Spectral detection in the censored block model. In 2015 IEEE International Symposium on Information Theory (ISIT), pages 1184–1188. IEEE, 2015.
- Community detection with side information: Exact recovery under the stochastic block model. IEEE Journal of Selected Topics in Signal Processing, 12(5):944–958, 2018.
- Computational barriers to estimation from low-degree polynomials. The Annals of Statistics, 50(3):1833–1858, 2022.
- SR Srinivasa Varadhan. Probability theory. Number 7. American Mathematical Soc., 2001.
- Alexander S Wein. Optimal low-degree hardness of maximum independent set. Mathematical Statistics and Learning, 4(3):221–251, 2022.
- Universality of approximate message passing algorithms and tensor networks. arXiv preprint arXiv:2206.13037, 2022.
- Ram Zamir. A proof of the Fisher information inequality via a data processing argument. IEEE Transactions on Information Theory, 44(3):1246–1250, 1998.
- Statistical physics of inference: Thresholds and algorithms. Advances in Physics, 65(5):453–552, 2016.
- Lattice-based methods surpass sum-of-squares in clustering. In Conference on Learning Theory, pages 1247–1248. PMLR, 2022.