Determine the minimax-optimal constant stepsize for gradient descent
For a given number of iterations N and parameters L > 0 and D > 0, determine the constant stepsize α(N) that minimizes the worst-case final objective gap of gradient descent over the class of L-smooth convex functions. Starting from an initial point x_0 with ∥x_0 − x_⋆∥ ≤ D (where x_⋆ is a minimizer of f), run x_{k+1} = x_k − (α/L) ∇f(x_k) for k = 0,…,N−1, and find the α(N) that minimizes the supremum over all such (f, x_0) of f(x_N) − inf f.
This note addresses the open problem of determining the minimax-optimal constant stepsize for gradient descent: given N, identify the stepsize α(N) ∈ \mathbb{R} minimizing the worst-case final objective gap
\begin{equation}
\label{eq:minimax-design}
\min_{\tilde\alpha\in\mathbb{R}} \; \max_{(f,x_0)\in\mathcal{F}_{L,D}} \; f(x_N) - \inf f,
\end{equation}
where \mathcal{F}_{L,D} denotes the set of pairs (f,x_0) such that f is an L-smooth convex function with a minimizer x_\star satisfying \|x_0-x_\star\|\leq D, and x_N is the output of N steps of gradient descent with constant stepsize h_k=\tilde\alpha.
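The inner maximization above is over all L-smooth convex instances and is what makes the problem hard; for any single instance, however, the objective gap f(x_N) − inf f as a function of the stepsize is easy to evaluate numerically. The sketch below (an illustration, not a worst-case computation) runs the iteration x_{k+1} = x_k − (α/L)∇f(x_k) on the simple L-smooth convex quadratic f(x) = (L/2)‖x‖², whose minimizer is x_⋆ = 0; the function names and the choice of test instance are illustrative assumptions, not part of the problem statement.

```python
import numpy as np

def gd_constant_step(grad, x0, L, alpha, N):
    """Run N steps of gradient descent x_{k+1} = x_k - (alpha/L) * grad(x_k).

    alpha is the normalized stepsize (the problem's tilde-alpha);
    the effective stepsize is alpha / L.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(N):
        x = x - (alpha / L) * grad(x)
    return x

if __name__ == "__main__":
    # Illustrative instance: f(x) = (L/2)||x||^2, an L-smooth convex
    # function with minimizer x_* = 0, so the gap is just f(x_N).
    L, D, N = 1.0, 1.0, 10
    f = lambda x: 0.5 * L * np.dot(x, x)
    grad = lambda x: L * x

    x0 = np.array([D])  # ||x0 - x_*|| = D
    for alpha in (0.5, 1.0, 1.5):
        xN = gd_constant_step(grad, x0, L, alpha, N)
        print(f"alpha = {alpha}: gap f(x_N) - inf f = {f(xN):.3e}")
```

On this quadratic, α = 1 reaches the minimizer in one step; the open problem asks which α(N) is best once the adversary may pick any pair (f, x_0) from \mathcal{F}_{L,D}, where quadratics are far from the worst case.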