A randomized algorithm for nonconvex minimization with inexact evaluations and complexity guarantees (2310.18841v2)

Published 28 Oct 2023 in math.OC and cs.LG

Abstract: We consider minimization of a smooth nonconvex function with inexact oracle access to gradient and Hessian (without assuming access to the function value) to achieve approximate second-order optimality. A novel feature of our method is that if an approximate direction of negative curvature is chosen as the step, we choose its sense to be positive or negative with equal probability. We allow gradients to be inexact in a relative sense and relax the coupling between inexactness thresholds for the first- and second-order optimality conditions. Our convergence analysis includes both an expectation bound based on martingale analysis and a high-probability bound based on concentration inequalities. We apply our algorithm to empirical risk minimization problems and obtain improved gradient sample complexity over existing works.
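The abstract's central algorithmic idea, taking an approximate negative-curvature step whose sign is chosen uniformly at random, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `randomized_nc_step`, the tolerances `eps_g` and `eps_H`, and the step scalings are hypothetical placeholders chosen only to show the randomized sign choice on inexact gradient and Hessian estimates.

```python
import numpy as np

def randomized_nc_step(grad_est, hess_est, eps_g=1e-3, eps_H=1e-2, rng=None):
    """Illustrative step selection toward approximate second-order optimality.

    grad_est and hess_est are inexact gradient / Hessian estimates; eps_g and
    eps_H are first- and second-order tolerances. All scalings are placeholders,
    not the constants analyzed in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Smallest eigenpair of the Hessian estimate (eigh returns ascending order).
    eigvals, eigvecs = np.linalg.eigh(hess_est)
    lam, v = eigvals[0], eigvecs[:, 0]

    if lam < -eps_H:
        # Approximate negative-curvature direction: choose its sense
        # (+v or -v) with equal probability, as described in the abstract.
        sign = 1.0 if rng.random() < 0.5 else -1.0
        return sign * abs(lam) * v          # scaled negative-curvature step
    elif np.linalg.norm(grad_est) > eps_g:
        return -grad_est                    # plain gradient step; step length omitted
    return np.zeros_like(grad_est)          # approximate second-order stationary point
```

Because no function values are assumed to be available, the sketch selects steps from derivative estimates alone; the sign randomization is what allows the paper's martingale-based expectation bound and concentration-based high-probability bound to go through despite inexact curvature information.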
