Optimal gradient-norm rate for gradient descent stepsizes
Determine whether O(N^{-log_2(1+√2)}) is the optimal worst-case convergence rate for the terminal gradient norm ∥∇f(x_N)∥ (or, equivalently for the optimality question, its square) achievable by gradient descent on L-smooth convex functions over all possible stepsize schedules.
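To make the object of the question concrete, below is a minimal sketch (not the authors' code) of gradient descent run with the silver stepsize schedule of Altschuler and Parrilo, which the referenced paper builds on: h^{(1)} = [√2] and h^{(k+1)} = [h^{(k)}, 1 + ρ^{k-1}, h^{(k)}] with ρ = 1 + √2, giving N = 2^k - 1 steps. The quadratic test function, dimension, and seed are illustrative assumptions, and the paper's gradient-norm guarantee uses a related variant schedule, so the printed comparison against N^{-log_2(1+√2)} is only a sanity check of the trend, not a verification of the rate.

```python
# Sketch: gradient descent with the silver stepsize schedule
# (Altschuler-Parrilo recursion), tracking the terminal gradient norm.
# The test problem below is an illustrative choice, not from the paper.
import numpy as np

RHO = 1 + np.sqrt(2)  # silver ratio, rho = 1 + sqrt(2)

def silver_schedule(k):
    """Silver stepsizes h^{(k)} of length 2^k - 1 (normalized by L):
    h^{(1)} = [sqrt(2)], h^{(k+1)} = [h^{(k)}, 1 + rho^{k-1}, h^{(k)}]."""
    h = [np.sqrt(2.0)]
    for j in range(1, k):
        h = h + [1 + RHO ** (j - 1)] + h
    return np.array(h)

def gd(grad, x0, L, steps):
    """Plain gradient descent: x_{i+1} = x_i - (h_i / L) * grad(x_i)."""
    x = x0.copy()
    for h in steps:
        x = x - (h / L) * grad(x)
    return x

# Illustrative L-smooth convex test problem: a diagonal quadratic
# f(x) = (1/2) * sum_i eigs_i * x_i^2 with eigenvalues in (0, L].
rng = np.random.default_rng(0)
d, L = 200, 1.0
eigs = L * rng.uniform(0.0, 1.0, size=d) ** 3  # cluster near 0 (slow modes)
grad = lambda x: eigs * x
x0 = rng.standard_normal(d)

print(f"{'N':>6} {'||grad f(x_N)||^2':>18} {'N^(-log2(1+sqrt2))':>20}")
for k in range(1, 8):
    steps = silver_schedule(k)
    N = len(steps)  # N = 2^k - 1
    xN = gd(grad, x0, L, steps)
    gnorm_sq = np.linalg.norm(grad(xN)) ** 2
    print(f"{N:6d} {gnorm_sq:18.3e} {N ** (-np.log2(1 + np.sqrt(2))):20.3e}")
```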
References
If such a symmetry is fundamental, conjecturing O(1/N^{log_2(1+√2)}) is also the optimal rate for gradient norm convergence is well motivated.
                — Accelerated Objective Gap and Gradient Norm Convergence for Gradient Descent via Long Steps
                
                (arXiv:2403.14045, Grimmer et al., 20 Mar 2024), Remark (On Optimality of Rates)