Acceleration by Stepsize Hedging I: Multi-Step Descent and the Silver Stepsize Schedule (2309.07879v1)
Abstract: Can we accelerate convergence of gradient descent without changing the algorithm -- just by carefully choosing stepsizes? Surprisingly, we show that the answer is yes. Our proposed Silver Stepsize Schedule optimizes strongly convex functions in $k^{\log_{\rho} 2} \approx k^{0.7864}$ iterations, where $\rho = 1+\sqrt{2}$ is the silver ratio and $k$ is the condition number. This is intermediate between the textbook unaccelerated rate $k$ and the accelerated rate $\sqrt{k}$ due to Nesterov in 1983. The non-strongly convex setting is conceptually identical, and standard black-box reductions imply an analogous accelerated rate $\varepsilon^{-\log_{\rho} 2} \approx \varepsilon^{-0.7864}$. We conjecture and provide partial evidence that these rates are optimal among all possible stepsize schedules. The Silver Stepsize Schedule is constructed recursively in a fully explicit way. It is non-monotonic, fractal-like, and approximately periodic of period $k^{\log_{\rho} 2}$. This leads to a phase transition in the convergence rate: initially super-exponential (acceleration regime), then exponential (saturation regime).
- Jason M. Altschuler (27 papers)
- Pablo A. Parrilo (66 papers)
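The recursive, fractal-like structure described in the abstract can be illustrated concretely. Below is a minimal Python sketch, assuming the companion convex-case recursion $h^{(1)} = [\sqrt{2}]$, $h^{(j+1)} = [h^{(j)},\, 1+\rho^{\,j-1},\, h^{(j)}]$ (the strongly convex schedule in this paper additionally depends on the condition number $k$, which this sketch does not model); it also verifies the rate exponent $\log_{\rho} 2 \approx 0.7864$ quoted above.

```python
import math

# Silver ratio and the rate exponent from the abstract:
# rho = 1 + sqrt(2), log_rho(2) = ln 2 / ln rho ~= 0.7864.
rho = 1 + math.sqrt(2)
exponent = math.log(2) / math.log(rho)
print(f"rho = {rho:.6f}, log_rho(2) = {exponent:.4f}")  # ~0.7864

def silver_schedule(j: int) -> list[float]:
    """Length-(2^j - 1) silver stepsize schedule, assuming the
    convex-case doubling recursion (an assumption for illustration;
    the paper's strongly convex schedule also depends on k)."""
    if j == 1:
        return [math.sqrt(2)]
    prev = silver_schedule(j - 1)
    # Concatenate two copies of the previous schedule around a new,
    # larger middle stepsize 1 + rho^(j-2); this doubling produces
    # the non-monotonic, fractal-like pattern noted in the abstract.
    return prev + [1 + rho ** (j - 2)] + prev

print(silver_schedule(3))
# [1.414..., 2.0, 1.414..., 3.414..., 1.414..., 2.0, 1.414...]
```

Note how occasional long steps (here $1+\rho \approx 3.414$) are interleaved with short ones: no single stepsize is safe and fast at once, so the schedule hedges across iterations.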