Optimality of the Silver Stepsize Schedule among deterministic schedules

Prove that the Silver Convergence Rate O(kappa^{log_{1+sqrt{2}} 2} log(1/eps)), achieved by the Silver Stepsize Schedule, is optimal among all deterministic stepsize schedules for Gradient Descent on the class of m-strongly convex, M-smooth functions, thereby ruling out any deterministic schedule with a strictly better asymptotic dependence on the condition number kappa = M/m.
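One way to state the target precisely (the notation F_{m,M} and N_h below is ours, not the paper's): writing N_h(kappa, eps) for the number of iterations Gradient Descent with a fixed deterministic schedule h needs to guarantee eps-accuracy over the whole function class, the conjecture is a lower bound matching the Silver Schedule's proven upper bound.

```latex
% F_{m,M}: m-strongly convex, M-smooth functions; kappa = M/m.
% N_h(kappa, eps): iterations needed by GD with deterministic schedule
% h = (h_1, h_2, ...) so that ||x_N - x^*|| <= eps * ||x_0 - x^*||
% for every f in F_{m,M}. The conjectured lower bound is
\forall\ \text{deterministic } h:\qquad
N_h(\kappa,\varepsilon) \;=\; \Omega\!\bigl(\kappa^{\log_{1+\sqrt{2}} 2}\,\log(1/\varepsilon)\bigr),
% which, combined with the Silver Schedule's upper bound, would give
\inf_{h}\, N_h(\kappa,\varepsilon) \;=\; \Theta\!\bigl(\kappa^{\log_{1+\sqrt{2}} 2}\,\log(1/\varepsilon)\bigr).
```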

Background

Recent work introduces the Silver Stepsize Schedule, a deterministic, fractal-like stepsize sequence for Gradient Descent that attains a partially accelerated rate O(kappa^{log_{1+sqrt{2}} 2} log(1/eps)), where log_{1+sqrt{2}} 2 ≈ 0.786, on smooth, strongly convex functions. This improves substantially over the classic O(kappa log(1/eps)) rate of constant-stepsize Gradient Descent but does not match the optimal O(kappa^{1/2} log(1/eps)) rate attained by momentum-based methods.
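For intuition about the fractal structure, the convex-case version of the schedule (from the companion hedging paper) admits a closed form: in units of 1/M, the t-th stepsize is 1 + rho^{nu2(t)-1}, where rho = 1 + sqrt(2) is the silver ratio and nu2(t) is the 2-adic valuation of t. The sketch below is illustrative only and assumes that closed form; the function names and the toy quadratic are ours, not the paper's.

```python
import numpy as np

RHO = 1 + np.sqrt(2)  # silver ratio

def nu2(t: int) -> int:
    """2-adic valuation: the largest k such that 2^k divides t."""
    return (t & -t).bit_length() - 1

def silver_stepsizes(n: int, M: float = 1.0):
    """First n convex-case silver stepsizes, in units of 1/M.

    h_t = (1 + RHO**(nu2(t) - 1)) / M, so the schedule begins
    sqrt(2), 2, sqrt(2), 2 + sqrt(2), sqrt(2), 2, sqrt(2), ...
    """
    return [(1 + RHO ** (nu2(t) - 1)) / M for t in range(1, n + 1)]

def gradient_descent(grad, x0, n_steps: int, M: float = 1.0):
    """Plain GD driven by the silver schedule (illustrative only)."""
    x = np.asarray(x0, dtype=float)
    for h in silver_stepsizes(n_steps, M):
        x = x - h * grad(x)
    return x

if __name__ == "__main__":
    # Toy quadratic f(x) = 0.5 * x^T A x, with smoothness M = 10.
    # n_steps = 2^4 - 1 matches the schedule's natural lengths.
    A = np.diag([1.0, 10.0])
    x = gradient_descent(lambda x: A @ x, x0=[1.0, 1.0], n_steps=15, M=10.0)
    print(silver_stepsizes(7))
    print(x)
```

Note the non-monotone pattern: most steps are short, but occasional long steps (growing like powers of rho at indices divisible by high powers of 2) are what produce the acceleration.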

The paper notes that this Silver rate is conjectured to be optimal among deterministic stepsize schedules. Confirming this conjecture would establish a fundamental limitation for purely deterministic stepsize tuning in general convex optimization and would highlight a genuine advantage of randomization or additional algorithmic structure (e.g., momentum).

References

This rate is conjecturally optimal among all possible deterministic stepsize schedules \citep{alt23hedging1} and naturally extends to non-strongly convex optimization \citep{alt23hedging2}.

Acceleration by Random Stepsizes: Hedging, Equalization, and the Arcsine Stepsize Schedule (2412.05790 - Altschuler et al., 8 Dec 2024) in Section 1.2 (Contribution and discussion)