
Derandomize accelerated GD on separable convex functions

Determine whether there exists a deterministic stepsize schedule for Gradient Descent that achieves the fully accelerated iteration complexity O(kappa^{1/2} log(1/eps)), where kappa = M/m is the condition number, on the class of separable, m-strongly convex and M-smooth functions; in particular, ascertain whether some ordering of the Chebyshev stepsize sequence attains this rate.


Background

The paper proves that running Gradient Descent with i.i.d. inverse stepsizes drawn from the Arcsine distribution fully accelerates it to the optimal O(kappa^{1/2} log(1/eps)) rate on separable convex functions. This establishes that acceleration can be achieved without momentum, purely through randomization of the stepsizes.
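For concreteness, the following is a minimal illustrative sketch (not the authors' code) of this randomized scheme on a separable quadratic, assuming the Arcsine distribution is supported on [m, M] and that the inverse of each stepsize is drawn i.i.d. from it; the parameters m, M, d, T are chosen only for illustration.

```python
# Sketch: vanilla GD on a separable quadratic f(x) = 0.5 * sum_i lam_i * x_i^2,
# with i.i.d. stepsizes whose inverses follow the Arcsine distribution on [m, M]
# (an assumption about the support, made for illustration).
import numpy as np

rng = np.random.default_rng(0)
m, M, d, T = 1.0, 100.0, 50, 200       # strong convexity, smoothness, dimension, iterations
lam = np.linspace(m, M, d)             # separable quadratic: grad f(x) = lam * x
x = np.ones(d)

for _ in range(T):
    # If U ~ Uniform(0, 1), then (m+M)/2 + (M-m)/2 * cos(pi*U) is Arcsine-distributed on [m, M].
    h = (m + M) / 2 + (M - m) / 2 * np.cos(np.pi * rng.uniform())
    x = x - (1.0 / h) * lam * x        # GD step with random stepsize eta = 1/h

print(0.5 * np.sum(lam * x**2))        # final suboptimality f(x_T) - f(x*)
```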

The authors ask whether this acceleration can be achieved deterministically, i.e., by a fixed stepsize schedule. For quadratic objectives, Chebyshev stepsizes are optimal, and the authors suggest that an appropriate ordering of these stepsizes might de-randomize the accelerated behavior for separable functions (see the sketch below). Resolving this question would clarify whether randomness is essential for full acceleration in this setting.
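For reference, the Chebyshev stepsizes referred to in the conjecture are, in their standard form for quadratics, the reciprocals of the Chebyshev nodes of [m, M]; the sketch below computes one such (unordered) set of candidate stepsizes, with m, M, n chosen only for illustration. The open question concerns whether some fixed ordering of these stepsizes accelerates GD on all separable strongly convex and smooth functions, not just quadratics.

```python
# Sketch: the standard Chebyshev stepsize schedule on [m, M] for an n-step horizon.
import numpy as np

def chebyshev_inverse_stepsizes(m, M, n):
    # Chebyshev nodes of [m, M]; the stepsizes are their reciprocals, eta_k = 1/h_k.
    k = np.arange(1, n + 1)
    return (m + M) / 2 + (M - m) / 2 * np.cos((2 * k - 1) * np.pi / (2 * n))

h = chebyshev_inverse_stepsizes(1.0, 100.0, 16)
etas = 1.0 / h                         # one unordered set of candidate stepsizes
print(np.sort(etas))
```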

References

Is it possible to de-randomize Theorem~\ref{thm:sep:main}, i.e., construct a deterministic stepsize schedule which achieves the fully accelerated rate for separable functions? If so, a natural conjecture would be some ordering of the Chebyshev stepsize schedule.

Acceleration by Random Stepsizes: Hedging, Equalization, and the Arcsine Stepsize Schedule (2412.05790 - Altschuler et al., 8 Dec 2024) in Section 7 (Conclusion and future work)