Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing (2403.10547v1)
Abstract: Finding an approximate second-order stationary point (SOSP) is a well-studied and fundamental problem in stochastic nonconvex optimization with many applications in machine learning. However, this problem is poorly understood in the presence of outliers, limiting the use of existing nonconvex algorithms in adversarial settings. In this paper, we study the problem of finding SOSPs in the strong contamination model, where a constant fraction of datapoints are arbitrarily corrupted. We introduce a general framework for efficiently finding an approximate SOSP with *dimension-independent* accuracy guarantees, using $\widetilde{O}(D^2/\epsilon)$ samples, where $D$ is the ambient dimension and $\epsilon$ is the fraction of corrupted datapoints. As a concrete application of our framework, we apply it to the problem of low rank matrix sensing, developing efficient and provably robust algorithms that can tolerate corruptions in both the sensing matrices and the measurements. In addition, we establish a Statistical Query lower bound providing evidence that the quadratic dependence on $D$ in the sample complexity is necessary for computationally efficient algorithms.
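The framework described in the abstract combines outlier-robust estimation with nonconvex local search. As a rough, hedged illustration (not the paper's actual algorithm), the sketch below runs factored gradient descent for low rank matrix sensing while aggregating per-sample gradients with a coordinate-wise trimmed mean instead of a plain average; the names (`trimmed_mean`, `robust_gd`), hyperparameters, and the trimmed-mean estimator itself are illustrative stand-ins for the stronger, dimension-independent robust estimators such a framework would rely on.

```python
# Illustrative sketch only: robust factored gradient descent for low rank matrix
# sensing when an eps-fraction of the measurements is arbitrarily corrupted.
# This is NOT the paper's algorithm; a coordinate-wise trimmed mean stands in
# for the dimension-independent robust gradient estimation the abstract refers to.
import numpy as np


def trimmed_mean(G, eps):
    """Coordinate-wise trimmed mean of the rows of G (shape (n, d)):
    drop the largest and smallest eps-fraction in each coordinate, then average."""
    n = G.shape[0]
    k = int(np.ceil(eps * n))
    if n - 2 * k <= 0:
        return G.mean(axis=0)
    return np.sort(G, axis=0)[k:n - k].mean(axis=0)


def robust_gd(A, y, r, eps, step=0.01, iters=800, seed=0):
    """Gradient descent on f(U) = (1/2) * mean_i (<A_i, U U^T> - y_i)^2,
    aggregating the per-sample gradients robustly instead of averaging them."""
    rng = np.random.default_rng(seed)
    n, D, _ = A.shape
    U = 0.1 * rng.standard_normal((D, r))  # small random initialization (toy choice)
    for _ in range(iters):
        resid = np.einsum('nij,ij->n', A, U @ U.T) - y                 # <A_i, UU^T> - y_i
        G = (resid[:, None, None] * (A + A.transpose(0, 2, 1))) @ U    # per-sample grad wrt U
        g = trimmed_mean(G.reshape(n, -1), eps).reshape(D, r)
        U = U - step * g
    return U


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, D, r, eps = 2000, 10, 2, 0.1
    U_star = rng.standard_normal((D, r))
    X_star = U_star @ U_star.T
    A = rng.standard_normal((n, D, D))              # Gaussian sensing matrices
    y = np.einsum('nij,ij->n', A, X_star)
    bad = rng.choice(n, int(eps * n), replace=False)
    y[bad] = 100.0 * rng.standard_normal(len(bad))  # arbitrary corruptions of the measurements
    U_hat = robust_gd(A, y, r, eps)
    err = np.linalg.norm(U_hat @ U_hat.T - X_star) / np.linalg.norm(X_star)
    print(f"relative recovery error: {err:.3f}")
```

In this toy setup the trimmed mean keeps the corrupted residuals from dominating the averaged gradient; the guarantees in the paper are considerably stronger, handling corruptions in the sensing matrices as well as the measurements and giving dimension-independent error bounds.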