Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule (2309.07879v1)

Published 14 Sep 2023 in math.OC and cs.DS

Abstract: Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_{\rho} 2} \approx k^{0.7864}$ iterations, where $\rho = 1+\sqrt{2}$ is the silver ratio and $k$ is the condition number. This is intermediate between the textbook unaccelerated rate $k$ and the accelerated rate $\sqrt{k}$ due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate $\varepsilon^{-\log_{\rho} 2} \approx \varepsilon^{-0.7864}$. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic of period $k^{\log_{\rho} 2}$. This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime).
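
The abstract does not spell out the schedule itself, so the snippet below is only a minimal sketch. It assumes the concatenate-and-insert recursion reported for the smooth convex setting (doubling the previous schedule around one long step of size $1+\rho^{i-1}$) and plugs the resulting stepsizes into otherwise unmodified gradient descent. The names `silver_schedule` and `gradient_descent` and the toy quadratic are illustrative choices, not the paper's code; the strongly convex schedule studied in the paper additionally depends on the condition number $k$.

```python
import numpy as np

def silver_schedule(k):
    """Recursively build a stepsize schedule of length 2**k - 1.

    Assumption: this follows the concatenate-and-insert pattern reported for
    the smooth convex setting; the abstract only states that the schedule is
    recursive, non-monotonic, and fractal-like.
    """
    rho = 1.0 + np.sqrt(2.0)           # the silver ratio
    schedule = [np.sqrt(2.0)]          # base case: a single step
    for i in range(1, k):
        # doubling step: [previous schedule, one long step, previous schedule]
        schedule = schedule + [1.0 + rho ** (i - 1)] + schedule
    return np.array(schedule)

def gradient_descent(grad, x0, L, stepsizes):
    """Plain gradient descent; the only change is the stepsize schedule."""
    x = np.array(x0, dtype=float)
    for h in stepsizes:
        x = x - (h / L) * grad(x)      # x_{t+1} = x_t - (h_t / L) * grad f(x_t)
    return x

if __name__ == "__main__":
    # Toy quadratic f(x) = 0.5 * x^T A x with condition number 100 (L = 100).
    A = np.diag([1.0, 100.0])
    grad = lambda x: A @ x
    x = gradient_descent(grad, x0=[1.0, 1.0], L=100.0,
                         stepsizes=silver_schedule(6))
    print("final iterate:", x)
```

Note that individual long steps can transiently increase the error; the non-monotonic schedule is designed so that the composed sequence of steps still contracts overall.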

Authors (2)
  1. Jason M. Altschuler (27 papers)
  2. Pablo A. Parrilo (66 papers)
Citations (15)
