Sample Complexity Analysis for Adaptive Optimization Algorithms with Stochastic Oracles (2303.06838v3)

Published 13 Mar 2023 in math.OC

Abstract: Several classical adaptive optimization algorithms, such as line search and trust region methods, have recently been extended to stochastic settings where function values, gradients, and, in some cases, Hessians are estimated via stochastic oracles. Unlike the majority of stochastic methods, these methods do not use a pre-specified sequence of step size parameters; instead, they adapt the step size parameter according to the estimated progress of the algorithm and use it to dictate the accuracy required from the stochastic approximations. The requirements on the stochastic approximations are thus also adaptive, and the oracle costs can vary from iteration to iteration. The step size parameters in these methods can increase and decrease based on the perceived progress, but, unlike in the deterministic case, they are not bounded away from zero due to possible oracle failures, and bounds on the step size parameter have not previously been derived. This creates obstacles in the total complexity analysis of such methods, because the oracle costs are typically decreasing in the step size parameter and can be arbitrarily large as the step size parameter goes to 0. Thus, until now, only the total iteration complexity of these methods has been analyzed. In this paper, we derive a lower bound on the step size parameter that holds with high probability for a large class of adaptive stochastic methods. We then use this lower bound to derive a framework for analyzing the expected and high-probability total oracle complexity of any method in this class. Finally, we apply this framework to analyze the total sample complexity of two particular algorithms, STORM and SASS, in the expected risk minimization problem.
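
To make the mechanism described in the abstract concrete, the sketch below implements an adaptive step-search loop with stochastic oracles on a synthetic expected risk minimization problem. It is a minimal illustration in the spirit of SASS-style methods, not the paper's algorithm: the O(1/alpha^2) batch-size rule, the sufficient-decrease test, the constants (gamma, theta, c), and the helper names (draw_batch, oracle, adaptive_step_search) are all assumptions made for the example. Its only purpose is to show why the per-iteration oracle cost grows as the step size parameter alpha shrinks, which is the obstacle to total complexity analysis that the paper's high-probability lower bound on alpha addresses.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic expected-risk-minimization problem: f(x) = E[(a^T x - b)^2] / 2,
# with fresh samples drawn on demand (the stochastic oracle).
d = 10
x_true = rng.normal(size=d)

def draw_batch(n):
    """Draw n fresh samples (a_i, b_i) with noisy labels."""
    A = rng.normal(size=(n, d))
    b = A @ x_true + 0.1 * rng.normal(size=n)
    return A, b

def oracle(x, alpha, c=4.0):
    """Stochastic function-value and gradient estimates whose accuracy is tied
    to the current step size parameter alpha: the (assumed) O(1/alpha^2) batch
    size means the per-iteration oracle cost grows as alpha shrinks."""
    n = max(1, int(np.ceil(c / alpha**2)))
    A, b = draw_batch(n)
    r = A @ x - b
    f_est = 0.5 * np.mean(r**2)
    g_est = A.T @ r / n
    return f_est, g_est, n           # n is the per-iteration oracle cost

def adaptive_step_search(x0, alpha0=1.0, gamma=2.0, theta=0.1,
                         alpha_max=10.0, max_iters=200):
    """Illustrative SASS-flavoured adaptive step search: accept the trial point
    and enlarge alpha when the estimated sufficient decrease holds, otherwise
    reject it and shrink alpha.  The total oracle cost is the sum of the
    per-iteration sample sizes."""
    x, alpha, total_samples = x0.copy(), alpha0, 0
    for _ in range(max_iters):
        f_x, g, n1 = oracle(x, alpha)
        x_trial = x - alpha * g
        f_trial, _, n2 = oracle(x_trial, alpha)
        total_samples += n1 + n2
        if f_trial <= f_x - theta * alpha * np.dot(g, g):       # estimated decrease
            x, alpha = x_trial, min(gamma * alpha, alpha_max)   # successful step
        else:
            alpha = alpha / gamma                               # unsuccessful step
    return x, alpha, total_samples

x_final, alpha_final, cost = adaptive_step_search(np.zeros(d))
print(f"final alpha = {alpha_final:.3g}, total samples drawn = {cost}")

Because the per-iteration cost in this sketch is (by assumption) on the order of 1/alpha^2, the total sample complexity is the sum of these batch sizes over all iterations; a high-probability lower bound on alpha of the kind derived in the paper caps each term in that sum and, combined with the known iteration complexity, yields a bound on the total oracle complexity.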
