
Strengthened conjecture: existence of a low-rank PEP certificate for the conjectured GD rate

Establish, for each N, the existence of nonnegative multipliers λ_{ij} defining a performance-estimation certificate of the following structured form: λ_{⋆,j} = c_j for j = 0,…,N; λ_{i,i+1} = a_i for i = 0,…,N−1; λ_{i+1,i} = b_i for i = 0,…,N−2; and λ_{i,j} = d_i c_j for 0 ≤ i ≤ N−2 and j ≥ i+2, with all other entries zero, where a ∈ R^{N}_{>0}, b ∈ R^{N−1}_{>0}, c ∈ R^{N+1}_{>0}, and d ∈ R^{N−1}_{>0}. The certificate must verify the identity ∑_{i,j} λ_{ij} Q_{ij} = f(x_⋆) − f(x_N) + r(N) (∥x_0 − x_⋆∥^2 − ∥(x_0 − (1/(2 r(N))) ∑_{i=0}^N c_i ∇f(x_i)) − x_⋆∥^2), where Q_{ij} = f(x_i) − f(x_j) − ⟨∇f(x_j), x_i − x_j⟩ − (1/2)∥∇f(x_i) − ∇f(x_j)∥^2 and r(N) = 1/(2(2Nα + 1)) is defined through the Drori–Teboulle equation 1/(2(2Nα + 1)) = (1/2)(1−α)^{2N}, with α = α(N) its nontrivial solution. Such a certificate would have rank-one slack and would prove the conjectured worst-case rate for gradient descent with constant stepsize.
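A minimal numerical sketch (not from the paper) of how α(N) and r(N) can be computed: the two sides of the Drori–Teboulle equation balance at the nontrivial root α(N) > 1, which can be bracketed and located by bisection. The function name and the bracket [1, 16] are illustrative assumptions.

```python
# Numerical sketch (not from the paper): solve the Drori-Teboulle equation
#   1/(2(2N*alpha + 1)) = (1/2) (1 - alpha)^(2N)
# for the nontrivial root alpha = alpha(N) > 1 by bisection, then recover
# the rate r(N) = 1/(2(2N*alpha(N) + 1)).

def drori_teboulle_rate(N, lo=1.0, hi=16.0, iters=200):
    """Return (alpha(N), r(N)); assumes alpha(N) lies in (lo, hi)."""
    def g(a):
        # Difference of the two sides; the exponent is even, so the
        # right-hand side stays nonnegative for a > 1.
        return 1.0 / (2.0 * (2.0 * N * a + 1.0)) - 0.5 * (1.0 - a) ** (2 * N)

    # g(lo) > 0 and g(hi) < 0 for hi large enough, so bisection applies.
    assert g(lo) > 0 and g(hi) < 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    alpha = 0.5 * (lo + hi)
    return alpha, 1.0 / (2.0 * (2.0 * N * alpha + 1.0))
```

For N = 1 the equation reduces to a cubic with nontrivial root α(1) = 3/2, giving r(1) = 1/8, which matches the classical one-step result and serves as a sanity check for the solver.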


Background

To prove the minimax-optimality conjecture via performance estimation (PEP), one needs to construct multipliers λ_{ij} ≥ 0 that certify f(x_N) − f(x_⋆) ≤ r(N) ∥x_0 − x_⋆∥^2 by expressing the bound as a nonnegative linear combination of the interpolation inequalities Q_{ij} ≥ 0. The authors propose a highly structured, low-rank form for the multipliers that depends on only O(N) parameters and induces a rank-one slack term.

This strengthened conjecture identifies explicit vectors a, b, c, d describing all nonzero entries of the certificate matrix λ and an identity that would complete the PEP proof of the Drori–Teboulle minimax stepsize conjecture. The paper provides extensive numerical evidence supporting the existence of such certificates up to N = 20160, but does not supply a formal proof.
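To make the O(N) parameterization concrete, the sparsity pattern of the multiplier matrix can be sketched as follows. The values of a, b, c, d below are arbitrary positive placeholders, not actual certificate multipliers; only the structure (first row c, superdiagonal a, subdiagonal b, and the rank-one d_i c_j fill above the superdiagonal) reflects the conjecture.

```python
import numpy as np

# Illustration only: assemble the structured multiplier matrix from the
# four parameter vectors a, b, c, d of the strengthened conjecture.
# Row/column 0 stands for the index star (x_star), and row/column k+1
# stands for iterate x_k, k = 0, ..., N.

def build_lambda(a, b, c, d):
    N = len(a)  # a holds a_0, ..., a_{N-1}
    assert len(b) == N - 1 and len(c) == N + 1 and len(d) == N - 1
    lam = np.zeros((N + 2, N + 2))
    lam[0, 1:] = c                            # lambda_{star,j} = c_j
    for i in range(N):
        lam[i + 1, i + 2] = a[i]              # lambda_{i,i+1} = a_i
    for i in range(N - 1):
        lam[i + 2, i + 1] = b[i]              # lambda_{i+1,i} = b_i
        for j in range(i + 2, N + 1):
            lam[i + 1, j + 1] = d[i] * c[j]   # lambda_{i,j} = d_i c_j, j >= i+2
    return lam
```

All nonzero entries come from the 4N − 1 scalar parameters in a, b, c, d, and the block above the superdiagonal is the masked outer product d c^T, which is what produces the rank-one slack.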

References

\begin{conjecture}\label{conj:strengthened}
There exists a collection of multipliers $\lambda_{i,j} \ge 0$ of the form
\begin{align*}
\lambda = \begin{pmatrix}
0 & c_0 & c_1 & c_2 & c_3 & \dots & c_N \\
0 & 0 & a_0 & d_0 c_2 & d_0 c_3 & \dots & d_0 c_N \\
 & b_0 & 0 & a_1 & d_1 c_3 & \dots & d_1 c_N \\
 & & b_1 & \ddots & \ddots & \ddots & \vdots \\
 & & & \ddots & \ddots & \ddots & d_{N-2} c_N \\
 & & & & b_{N-2} & 0 & a_{N-1} \\
 & & & & & 0 & 0
\end{pmatrix},
\end{align*}
such that
\begin{align}\label{eq:target_identity}
\sum_{i,j} \lambda_{ij} Q_{ij} = f_\star - f_N + r \left( \|x_0 - x_\star\|^2 - \left\| \left( x_0 - \frac{1}{2r} \sum_{i=0}^{N} c_i g_i \right) - x_\star \right\|^2 \right).
\end{align}
Here, $a$, $b$, $c$, $d$ are positive vectors of the appropriate dimensions.
\end{conjecture}

A Strengthened Conjecture on the Minimax Optimal Constant Stepsize for Gradient Descent (2407.11739 - Grimmer et al., 16 Jul 2024) in Conjecture, Section 3 (A Strengthened Conjecture)