Local Quadratic Convergence of Stochastic Gradient Descent with Adaptive Step Size (2112.14872v1)

Published 30 Dec 2021 in math.OC and cs.LG

Abstract: Establishing a fast rate of convergence for optimization methods is crucial to their applicability in practice. With the increasing popularity of deep learning over the past decade, stochastic gradient descent and its adaptive variants (e.g., Adagrad, Adam) have become prominent methods of choice for machine learning practitioners. While a large number of works have demonstrated that these first-order optimization methods can achieve sub-linear or linear convergence, we establish local quadratic convergence for stochastic gradient descent with adaptive step size for problems such as matrix inversion.
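
The setting can be made concrete with a small sketch: stochastic gradient descent on the matrix-inversion objective f(X) = ||AX - I||_F^2 / 2, sampling one column of the identity per step and using a Polyak-type adaptive step size. This is an illustration under assumed choices only; the objective, the column sampling, the Polyak rule, and the test matrix below are assumptions for the sketch, not the paper's specific method.

```python
import numpy as np

# Illustrative sketch only: the adaptive step-size rule analyzed in the paper is
# not reproduced here. We minimize f(X) = 0.5 * ||A X - I||_F^2 by sampling one
# column of the identity per iteration and taking a Polyak-type adaptive step on
# that column's sub-objective (all choices below are assumptions).

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n)) + n * np.eye(n)  # hypothetical well-conditioned matrix
X = np.zeros((n, n))                             # current estimate of A^{-1}

for _ in range(5000):
    i = rng.integers(n)              # sample column i: f_i(X) = 0.5 * ||A X e_i - e_i||^2
    e_i = np.zeros(n)
    e_i[i] = 1.0
    r = A @ X[:, i] - e_i            # residual of the sampled column
    g = A.T @ r                      # stochastic gradient with respect to X[:, i]
    gnorm2 = g @ g
    if gnorm2 < 1e-30:               # column already solved to numerical precision
        continue
    eta = 0.5 * (r @ r) / gnorm2     # Polyak step: (f_i(X) - f_i*) / ||grad f_i||^2 with f_i* = 0
    X[:, i] -= eta * g               # SGD update on the sampled column

err = np.linalg.norm(A @ X - np.eye(n)) / np.sqrt(n)
print(f"normalized residual ||AX - I||_F / sqrt(n): {err:.2e}")
```

Because each column's sub-objective has optimal value zero, the Polyak rule needs no hand-tuned learning rate; whether this particular scheme attains the local quadratic rate established in the paper depends on the paper's assumptions and its specific adaptive step-size choice.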

Authors (3)
  1. Adityanarayanan Radhakrishnan (22 papers)
  2. Mikhail Belkin (76 papers)
  3. Caroline Uhler (91 papers)
