Small Gradient Norm Regret for Online Convex Optimization
Published 20 Jan 2026 in stat.ML, cs.LG, and math.OC | (2601.13519v1)
Abstract: This paper introduces a new problem-dependent regret measure for online convex optimization with smooth losses. The notion, which we call the $G^\star$ regret, depends on the cumulative squared gradient norm evaluated at the decision in hindsight, $\sum_{t=1}^{T} \|\nabla \ell_t(x^\star)\|^2$. We show that the $G^\star$ regret strictly refines the existing $L^\star$ (small-loss) regret, and that it can be arbitrarily sharper when the losses have vanishing curvature around the hindsight decision. We establish upper and lower bounds on the $G^\star$ regret and extend our results to dynamic regret and bandit settings. As a byproduct, we refine the existing convergence analysis of stochastic optimization algorithms in the interpolation regime. Experiments validate our theoretical findings.
The paper introduces G* regret as a novel metric that measures the cumulative squared gradient norms at the hindsight minimizer, offering a translation invariance that traditional L* regret lacks.
The methodology leverages strengthened smoothness conditions and theoretical analysis to establish O(√G*) upper and lower regret bounds in static, dynamic, and bandit online convex optimization frameworks.
The findings imply that adaptive algorithms like AdaGrad-Norm and AdaFTRL achieve improved convergence rates, inspiring further research into regret bounds for nonsmooth and dynamic loss scenarios.
Small Gradient Norm Regret in Online Convex Optimization
Introduction and Context
The paper "Small Gradient Norm Regret for Online Convex Optimization" (2601.13519) introduces the concept of G* regret as a novel, problem-dependent regret metric for online convex optimization (OCO) when losses are smooth. In OCO, a learner sequentially chooses decisions from a convex set, striving to minimize cumulative loss against adversarially chosen convex loss functions. Traditional regret analyses typically seek sublinear bounds in the worst case, but recent work has explored problem-dependent bounds that adapt to the specifics of the loss sequence.
The L* regret (small-loss regret) and the gradient-variation bound are two established measures that provide sharper performance guarantees under specific structural assumptions about the loss functions. However, L* regret depends on the absolute scale of the losses and typically requires them to be non-negative, a constraint that rules out certain applications (e.g., linear losses) and breaks translation invariance.
G* Regret: Definition and Advantages
G* regret is defined as the cumulative squared gradient norm of the loss functions at the hindsight minimizer:
G_T(x∗) = ∑_{t=1}^{T} ∥∇ℓ_t(x∗)∥²
where x∗ minimizes the aggregate loss ∑_t ℓ_t(x) over the feasible set. Unlike L* regret, G* regret is translation invariant and does not require non-negativity or lower boundedness of the losses. This property makes G* regret applicable in broader settings, particularly with linear losses—where L* regret is undefined but G* regret reduces to the sum of squared subgradient norms.
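To make the two quantities concrete, the following sketch computes both ∑_t ℓ_t(x∗) (the L* quantity) and ∑_t ∥∇ℓ_t(x∗)∥² (the G* quantity) for a toy one-dimensional loss family with vanishing curvature. The loss family, targets, and grid search here are illustrative choices, not the paper's setup.

```python
import numpy as np

# Illustrative losses ell_t(x) = |x - a_t|^p with p = 4: curvature
# vanishes near the minimizer, so gradients decay faster than losses.
rng = np.random.default_rng(0)
p = 4
a = rng.normal(0.0, 0.1, size=100)  # per-round targets (toy data)

# Hindsight minimizer of the aggregate loss, found by grid search.
grid = np.linspace(-1.0, 1.0, 20001)
total = np.abs(grid[:, None] - a[None, :]) ** p
x_star = grid[np.argmin(total.sum(axis=1))]

loss = np.abs(x_star - a) ** p                                   # ell_t(x*)
grad = p * np.sign(x_star - a) * np.abs(x_star - a) ** (p - 1)   # ell_t'(x*)

L_star = loss.sum()          # small-loss quantity  sum_t ell_t(x*)
G_star = (grad ** 2).sum()   # small-gradient quantity  sum_t ||grad||^2
print(f"L* = {L_star:.4g}, G* = {G_star:.4g}")
```

With targets clustered near the minimizer, the fourth-power losses flatten out and the squared gradients shrink even faster, which is exactly the regime in which a G*-based bound can beat an L*-based one.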
The paper rigorously demonstrates that G* regret can be strictly tighter than L* regret, particularly for loss functions exhibiting vanishing curvature near the optimum. In such cases (e.g., high-order regression losses, cross-entropy, exponential losses near the optimum), the gradient norms decay faster than the loss values, yielding provably sharper regret bounds.
Theoretical Results: Upper and Lower Bounds
The authors present upper and lower bounds for G* regret in static, dynamic, and bandit OCO frameworks:
Upper Bounds: Existing algorithms that adapt to L* regret also yield O(√G∗) bounds for G* regret. Specifically, algorithms like AdaGrad-Norm and AdaFTRL, when deployed with constant or adaptive learning rates, achieve regret of order O(D√G∗), matching explicit lower bounds for appropriate loss sequences. The analysis leverages an improved lower bound on smooth convex functions (strong convexity in the dual space), yielding tighter regret guarantees than standard convexity-based proofs.
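For readers unfamiliar with AdaGrad-Norm, here is a minimal sketch of the style of update involved: online gradient descent whose scalar step size shrinks with the running sum of squared gradient norms. The constants and the projection radius are illustrative, not the paper's exact tuning.

```python
import numpy as np

def adagrad_norm(grads, D=1.0, eps=1e-12):
    """AdaGrad-Norm-style online gradient descent (minimal sketch).

    Step size eta_t = D / sqrt(sum_{s<=t} ||g_s||^2), with iterates
    projected onto the Euclidean ball of radius D/2 (diameter D).
    """
    x = np.zeros(len(grads[0]))
    G2 = 0.0  # running sum of squared gradient norms
    iterates = []
    for g in grads:
        iterates.append(x.copy())          # play x_t, then observe g_t
        G2 += float(np.dot(g, g))
        eta = D / np.sqrt(G2 + eps)        # adaptive scalar step size
        x = x - eta * g
        norm = np.linalg.norm(x)
        if norm > D / 2:                   # project back into the ball
            x = x * (D / 2) / norm
    return iterates

xs = adagrad_norm([np.array([0.3, -0.1])] * 50, D=1.0)
```

The key point for the G* analysis is that the step-size denominator accumulates ∑_s ∥g_s∥², so when gradients near the comparator are small, the effective learning rate stays large and the standard analysis can be sharpened.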
Lower Bounds: The paper constructs loss sequences (notably linear losses) that establish a matching Ω(√G∗) regret lower bound, certifying the optimality of the presented algorithms under smoothness and convexity assumptions.
Additionally, the theory extends to dynamic regret settings, where the comparator can change across rounds, and to the bandit setting, where only limited (function-value) feedback is available. Here, the regret bounds incorporate path length terms and additional terms arising from gradient estimation noise.
Methodological Contributions
The derivation of the G* regret bounds hinges on a strengthened smoothness condition: ℓ_t(x) − ℓ_t(y) − ⟨∇ℓ_t(y), x − y⟩ ≥ (1/(2L)) ∥∇ℓ_t(x) − ∇ℓ_t(y)∥², which enhances the regret analysis by directly relating gradient norms to suboptimality gaps. The paper gives a full treatment for both the static and dynamic OCO regimes, as well as bandit feedback settings using smoothing via Moreau envelopes.
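This strengthened smoothness condition is a standard consequence of L-smoothness for convex functions, and it can be checked numerically. The sketch below verifies it for the logistic loss f(x) = log(1 + eˣ), which is convex and 1/4-smooth (the choice of test function and sample points is ours, for illustration).

```python
import numpy as np

# Check  f(x) - f(y) - f'(y)(x - y) >= (1/(2L)) * (f'(x) - f'(y))^2
# for f(x) = log(1 + e^x), whose second derivative is at most L = 1/4.
f = lambda x: np.logaddexp(0.0, x)        # log(1 + e^x), numerically stable
df = lambda x: 1.0 / (1.0 + np.exp(-x))   # sigmoid = f'
L = 0.25

rng = np.random.default_rng(1)
xs, ys = rng.normal(size=1000), rng.normal(size=1000)
lhs = f(xs) - f(ys) - df(ys) * (xs - ys)    # Bregman gap of f
rhs = (df(xs) - df(ys)) ** 2 / (2 * L)      # squared gradient difference
assert np.all(lhs >= rhs - 1e-12)
```

Compared with plain convexity (which only lower-bounds the left side by zero), this inequality keeps a squared gradient-difference term, which is what lets the analysis convert regret into the cumulative squared gradient norms at the comparator.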
Importantly, the adaptive nature of G* regret—its independence from absolute loss scale and its applicability across loss types—allows online algorithms to retain the worst-case sublinear regret O(√T) while sharply tightening the bounds on structured instances.
Experimental Evidence
Numerical experiments on polynomial regression (with large exponent p) and cross-entropy classification validate the theoretical predictions. The regret incurred by AdaGrad-Norm, compared against theoretical upper bounds based on L* and G*, shows that the G*-based upper bound tracks the observed regret with significantly less slack. This empirical confirmation reinforces the practical advantages of adopting G* regret, especially in interpolation regimes where the loss landscape flattens near the optimum.
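A toy version of this kind of experiment can be reproduced in a few lines: run an adaptive-step-size online learner on quartic losses with targets clustered near zero, then compare the observed regret with the raw L* and G* quantities. The loss family, constants, and targets below are illustrative, not the paper's experimental configuration, so no claim is made about which bound is tighter here.

```python
import numpy as np

# Online quartic losses ell_t(x) = (x - a_t)^4, AdaGrad-Norm-style update.
rng = np.random.default_rng(2)
T, D = 2000, 2.0
a = rng.normal(0.0, 0.05, size=T)   # targets clustered near zero
x_star = a.mean()                   # close to the hindsight minimizer

x, G2, regret = 0.5, 0.0, 0.0
for t in range(T):
    g = 4 * (x - a[t]) ** 3                          # gradient at x_t
    regret += (x - a[t]) ** 4 - (x_star - a[t]) ** 4
    G2 += g * g
    eta = D / np.sqrt(G2 + 1e-12)                    # adaptive step size
    x = float(np.clip(x - eta * g, -D / 2, D / 2))   # project to [-D/2, D/2]

L_star = ((x_star - a) ** 4).sum()             # small-loss quantity
G_star = ((4 * (x_star - a) ** 3) ** 2).sum()  # small-gradient quantity
print(f"regret = {regret:.4g}, sqrt(L*) = {np.sqrt(L_star):.4g}, "
      f"sqrt(G*) = {np.sqrt(G_star):.4g}")
```

Since the fourth-power losses flatten near the clustered targets, the squared gradients at the comparator decay faster than the losses themselves, which is the mechanism behind the reduced slack reported for the G*-based bound.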
Implications and Future Directions
Practical Impact:
The refined G* regret provides sharper performance guarantees for OCO algorithms in machine learning tasks characterized by vanishing curvature, interpolation, or translation-invariant loss architectures. It facilitates the deployment of adaptive online methods (AdaGrad-Norm, AdaFTRL, etc.) in bandit and constrained settings, and improves convergence rates in stochastic optimization and average-iterate analysis.
Theoretical Relevance:
The introduction of G* regret shifts the focus from absolute loss minimization to stationarity at the comparator, harmonizing regret analysis with dual-space smoothness properties and gradient-based optimality conditions. This approach resolves several conceptual shortcomings in prior small-loss bounds and provides a unified framework for treating smoothness, constraint geometry, and non-negativity in regret analysis.
Potential Future Research:
Extension of G* regret analysis to nonsmooth losses via generalized self-boundedness.
Fully unconstrained OCO regimes and their relationship to G* regret.
Tight problem-dependent dynamic regret bounds in highly nonstationary or nonconvex environments.
Automated learning rate adaptation for unknown (future-dependent) G* values via meta-expert algorithms.
Practical deployment in large-scale online learning and adversarial optimization, including real-world bandit feedback scenarios.
Conclusion
The G* regret represents a substantive advance in problem-dependent OCO analysis, remedying several deficiencies of existing small-loss regret bounds and broadening the applicability of online learning algorithms. The paper provides a comprehensive suite of theoretical guarantees, algorithmic prescriptions, and experimental validations, substantiating the value of G* regret as a guiding metric for adaptive OCO in both offline and online machine learning applications. This framework opens pathways for more refined analyses and practical algorithms in sequential decision-making under smooth, adversarial, or stochastic loss regimes.