Simple adaptive first-order method with optimal preconditioned complexity

Determine whether a simple adaptive first-order method can achieve iteration complexity O(κ⋆ log(1/ε)) or better for smooth strongly convex optimization, where κ⋆ denotes the condition number of the problem under the optimal preconditioner.

Background

The paper reviews prior work showing that the hypergradient approach of Kunstner et al. achieves O(√n κ⋆ log(1/ε)) complexity by using a cutting-plane subroutine to update a diagonal preconditioner; the subroutine introduces both the explicit √n dimension dependence and extra per-iteration computational overhead.
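
As a concrete point of reference, the sketch below shows the simplest form of the hypergradient idea: a multiplicative, sign-based (Rprop-style) update of a diagonal preconditioner driven by one-step gradient feedback. It only illustrates what "adapting the preconditioner" means; it is not the cutting-plane method of Kunstner et al., and the step sizes, initialization, and test problem are all assumptions made for this example.

```python
import numpy as np

def hypergrad_diag_gd(grad, x0, p0=1e-4, beta=0.1, iters=400):
    """Preconditioned gradient descent x+ = x - p * grad(x), where the
    diagonal preconditioner p is adapted by a sign-based hypergradient
    rule (an illustrative simplification, not Kunstner et al.'s method)."""
    x = x0.astype(float)
    p = np.full_like(x, p0)            # diagonal preconditioner
    g = grad(x)
    for _ in range(iters):
        x_new = x - p * g              # preconditioned gradient step
        g_new = grad(x_new)
        # d f(x - p*g) / d p_i = -g_i * grad(x_new)_i, so one-step
        # progress improves by growing p_i when g_i * g_new_i > 0
        # and shrinking it otherwise.
        p *= 1.0 + beta * np.sign(g * g_new)
        x, g = x_new, g_new
    return x

# Toy usage: ill-conditioned diagonal quadratic f(x) = 0.5 x^T A x,
# where kappa = 1e4 but kappa_star = 1 for the diagonal class.
A = np.diag(np.logspace(0, 4, 10))
x = hypergrad_diag_gd(lambda z: A @ z, np.ones(10))
print(np.linalg.norm(x))               # near 0: the minimizer is x* = 0
```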

In this context, κ⋆ represents the condition number of the problem after applying the best preconditioner from a chosen class (e.g., diagonal or positive semidefinite matrices). The authors note that it remained unknown in the literature whether a simpler adaptive first-order method, one without heavy subroutines such as cutting-plane steps, could reach O(κ⋆ log(1/ε)) complexity or better.
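
To make the gap between κ and κ⋆ concrete: for a quadratic with Hessian A = D M D, where D is a wild diagonal scaling and M is well conditioned, κ(A) is huge while κ⋆ over the diagonal class is close to κ(M). Computing κ⋆ exactly requires solving a semidefinite program, so the sketch below (on synthetic data chosen for this example) only evaluates the Jacobi scaling diag(A)^{-1/2} A diag(A)^{-1/2}, whose condition number upper-bounds κ⋆.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
B = rng.standard_normal((n, n)) / np.sqrt(n)
M = np.eye(n) + 0.1 * (B + B.T)        # well conditioned, kappa(M) ~ 2
D = np.diag(np.logspace(0, 3, n))      # badly scaled coordinates
A = D @ M @ D                          # SPD Hessian, kappa(A) ~ 1e6

# Jacobi scaling is one feasible diagonal preconditioner, so its
# condition number is an upper bound on kappa_star.
d = np.sqrt(np.diag(A))
A_scaled = A / np.outer(d, d)          # diag(A)^{-1/2} A diag(A)^{-1/2}

print(f"kappa(A):      {np.linalg.cond(A):.3e}")        # huge, ~1e6
print(f"kappa(Jacobi): {np.linalg.cond(A_scaled):.3f}") # near kappa(M)
```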

References

Whether a simple adaptive first-order method can achieve O(κ⋆ log(1/ε)) complexity or even stronger guarantees remains open.

Gradient Methods with Online Scaling (2411.01803 - Gao et al., 4 Nov 2024) in Section 1 (Introduction)