
Derandomize accelerated GD on separable convex functions

Determine whether there exists a deterministic stepsize schedule for Gradient Descent that achieves the fully accelerated iteration complexity O(kappa^{1/2} log(1/eps)), where kappa = M/m is the condition number, on the class of separable, m-strongly convex and M-smooth functions; in particular, ascertain whether some ordering of the Chebyshev stepsize sequence attains this rate.


Background

The paper proves that running Gradient Descent with i.i.d. inverse stepsizes drawn from the Arcsine distribution fully accelerates it to the optimal O(kappa^{1/2} log(1/eps)) rate on separable convex functions. This establishes that acceleration can be achieved without momentum, purely through randomization of the stepsizes.
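For concreteness, the following is a minimal illustrative sketch (not the authors' code) of this randomized scheme on a separable quadratic, assuming the Arcsine distribution is supported on [m, M] and that the inverse of each stepsize is drawn i.i.d. from it; the parameters m, M, d, T are chosen only for illustration.

```python
# Sketch: vanilla GD on a separable quadratic f(x) = 0.5 * sum_i lam_i * x_i^2,
# with i.i.d. stepsizes whose inverses follow the Arcsine distribution on [m, M]
# (an assumption about the support, made for illustration).
import numpy as np

rng = np.random.default_rng(0)
m, M, d, T = 1.0, 100.0, 50, 200       # strong convexity, smoothness, dimension, iterations
lam = np.linspace(m, M, d)             # separable quadratic: grad f(x) = lam * x
x = np.ones(d)

for _ in range(T):
    # If U ~ Uniform(0, 1), then (m+M)/2 + (M-m)/2 * cos(pi*U) is Arcsine-distributed on [m, M].
    h = (m + M) / 2 + (M - m) / 2 * np.cos(np.pi * rng.uniform())
    x = x - (1.0 / h) * lam * x        # GD step with random stepsize eta = 1/h

print(0.5 * np.sum(lam * x**2))        # final suboptimality f(x_T) - f(x*)
```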

The authors ask whether this acceleration can be achieved deterministically, i.e., by a fixed stepsize schedule. For quadratic objectives, Chebyshev stepsizes are optimal, and the authors suggest that an appropriate ordering of these stepsizes might de-randomize the accelerated behavior for separable functions (see the sketch below). Resolving this question would clarify whether randomness is essential for full acceleration in this setting.
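For reference, the Chebyshev stepsizes referred to in the conjecture are, in their standard form for quadratics, the reciprocals of the Chebyshev nodes of [m, M]; the sketch below computes one such (unordered) set of candidate stepsizes, with m, M, n chosen only for illustration. The open question concerns whether some fixed ordering of these stepsizes accelerates GD on all separable strongly convex and smooth functions, not just quadratics.

```python
# Sketch: the standard Chebyshev stepsize schedule on [m, M] for an n-step horizon.
import numpy as np

def chebyshev_inverse_stepsizes(m, M, n):
    # Chebyshev nodes of [m, M]; the stepsizes are their reciprocals, eta_k = 1/h_k.
    k = np.arange(1, n + 1)
    return (m + M) / 2 + (M - m) / 2 * np.cos((2 * k - 1) * np.pi / (2 * n))

h = chebyshev_inverse_stepsizes(1.0, 100.0, 16)
etas = 1.0 / h                         # one unordered set of candidate stepsizes
print(np.sort(etas))
```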

References

Is it possible to de-randomize Theorem~\ref{thm:sep:main}, i.e., construct a deterministic stepsize schedule which achieves the fully accelerated rate for separable functions? If so, a natural conjecture would be some ordering of the Chebyshev stepsize schedule.

Acceleration by Random Stepsizes: Hedging, Equalization, and the Arcsine Stepsize Schedule (2412.05790 - Altschuler et al., 8 Dec 2024) in Section 7 (Conclusion and future work)