Drori–Teboulle conjecture on the minimax-optimal constant stepsize for gradient descent

Prove that, for any N, L, and D, the unique α(N) ≥ 1 solving 1/(2(2Nα + 1)) = (1/2)(1 − α)^{2N} is, over all constant stepsizes α, the unique minimizer of the worst-case final objective gap f(x_N) − inf f, where the worst case is taken over all L-smooth convex functions f and initial points x_0 within distance D of a minimizer, and x_N is produced by N steps of gradient descent x_{k+1} = x_k − (α/L) ∇f(x_k). Prove further that the resulting minimax value equals r(N) L D^2, where r(N) denotes the common value of the two sides of the defining equality.
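
For concreteness, the following is a minimal numerical sketch (not from the paper) for computing α(N) and r(N) directly from the defining equality by bisection; the function names and the bracket [1, 2] are illustrative choices. Since the left side exceeds the right side at α = 1 and falls below it at α = 2 for every N ≥ 1, the unique root lies in (1, 2); for N = 1 this should recover α(1) = 1.5 and r(1) = 0.125.

def alpha_and_r(N, tol=1e-14):
    """Solve 1/(2(2N*a + 1)) = (1/2)(1 - a)^(2N) for a >= 1 by bisection."""
    def gap(a):
        # Difference between the left and right sides of the defining equality.
        return 1.0 / (2.0 * (2.0 * N * a + 1.0)) - 0.5 * (1.0 - a) ** (2 * N)

    lo, hi = 1.0, 2.0  # gap(1) > 0 and gap(2) < 0, so the root lies in (1, 2)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    a = 0.5 * (lo + hi)
    r = 1.0 / (2.0 * (2.0 * N * a + 1.0))  # common value of the two sides
    return a, r

for N in (1, 10, 100):
    a, r = alpha_and_r(N)
    print(N, a, r)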

Background

Drori and Teboulle (2012) conjectured that the minimax-optimal constant stepsize is precisely the one that equalizes the worst-case final objective gap on the one-dimensional quadratic Q(x) = (L/2) x^2 and the Huber function H_δ(x), with δ = D/(2Nα + 1). This yields an implicit equation in α whose unique solution α(N) is conjectured to be optimal.
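
For intuition, here is a short derivation sketch of where the implicit equation comes from, assuming the standard L-smooth Huber form $H_\delta(x) = \tfrac{L}{2}x^2$ for $|x|\le\delta$ and $L\delta(|x|-\tfrac{\delta}{2})$ otherwise, and taking $x_0 = D$ in both cases (these normalizations are illustrative choices, not quoted from the paper). On $Q(x) = \tfrac{L}{2}x^2$, each step gives $x_{k+1} = (1-\alpha)x_k$, so
\begin{equation*}
Q(x_N) - \inf Q = \tfrac{L}{2}(1-\alpha)^{2N} D^2 .
\end{equation*}
On $H_\delta$ with $\delta = D/(2N\alpha+1)$, every iterate stays in the linear region (the gradient there is $L\delta$, each step subtracts $\alpha\delta$, and $x_N = \delta(N\alpha+1) > \delta$), so
\begin{equation*}
H_\delta(x_N) - \inf H_\delta = L\delta\Bigl(x_N - \tfrac{\delta}{2}\Bigr) = \frac{L D^2}{2(2N\alpha+1)} .
\end{equation*}
Equating the two gaps and dividing by $L D^2$ recovers the defining equality $\frac{1}{2(2N\alpha+1)} = \frac{1}{2}(1-\alpha)^{2N}$.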

The present paper restates this conjecture and provides extensive numerical evidence via structured certificates up to N = 20160, but a formal proof is not given. Establishing this conjecture would resolve the open minimax design problem for constant stepsizes.

References

\begin{conjecture}[{Drori and Teboulle 2012, Conjecture 3.1}]\label{conj:main}
For any $N, L, D$, let $\alpha(N) \ge 1$ denote the unique solution of $\frac{1}{2(2N\alpha+1)} = \frac{1}{2}(1-\alpha)^{2N}$ and let $r(N)$ denote their common value. Then, $\alpha(N)$ is the unique minimizer of~\eqref{eq:minimax-design} and achieves the value
\begin{equation*}
\min_{\tilde{\alpha}\in\mathbb{R}} \ \max_{(f,x_0)\in\mathcal{F}_{L,D}} f(x_N) - \inf f = r(N) L D^2 .
\end{equation*}
\end{conjecture}

A Strengthened Conjecture on the Minimax Optimal Constant Stepsize for Gradient Descent (2407.11739 - Grimmer et al., 16 Jul 2024) in Conjecture (cited as [Drori and Teboulle 2012, Conjecture 3.1]), Section 1 (Introduction)