Optimal momentum selection for MomPS_max in the general convex setting

Determine explicit optimal choices for the momentum coefficient β in the deterministic Heavy Ball method using the MomPS_max step-size γ_t = (1 − β) · min{ (f(x^t) − f(x^*)) / ||∇f(x^t)||^2, γ_b } when minimizing general convex L-smooth objectives (not necessarily strongly convex). The goal is to identify principled, data-independent selection rules or formulas for β in this setting, in contrast to the known optimal β for strongly convex quadratic objectives.
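To make the setting concrete, the following is a minimal sketch of deterministic Heavy Ball with the MomPS_max step-size. It assumes the standard update x^{t+1} = x^t − γ_t ∇f(x^t) + β (x^t − x^{t−1}) with x^{−1} = x^0. The function name, the toy objective, the stopping tolerance, and the values β = 0.5 and γ_b = 1.0 are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, List

Vector = List[float]


def heavy_ball_momps_max(
    f: Callable[[Vector], float],
    grad: Callable[[Vector], Vector],
    x0: Vector,
    f_star: float,
    beta: float,
    gamma_b: float,
    n_iters: int = 200,
    tol: float = 1e-24,
) -> Vector:
    """Heavy Ball with gamma_t = (1 - beta) * min((f(x) - f_star) / ||grad||^2, gamma_b)."""
    x_prev = list(x0)  # assumed convention: x^{-1} = x^0
    x = list(x0)
    for _ in range(n_iters):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq < tol:  # (near-)stationary point: stop to avoid dividing by ~0
            break
        gamma = (1.0 - beta) * min((f(x) - f_star) / g_sq, gamma_b)
        # Gradient step plus momentum term beta * (x^t - x^{t-1}).
        x_next = [xi - gamma * gi + beta * (xi - pi) for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return x


# Toy objective (hypothetical): f(x) = 0.5 * (x1 + 2*x2 - 3)^2.
# Convex and L-smooth with L = 5, not strongly convex, and f(x^*) = 0.
def f(x: Vector) -> float:
    return 0.5 * (x[0] + 2.0 * x[1] - 3.0) ** 2


def grad(x: Vector) -> Vector:
    u = x[0] + 2.0 * x[1] - 3.0
    return [u, 2.0 * u]


x_final = heavy_ball_momps_max(f, grad, x0=[0.0, 0.0], f_star=0.0, beta=0.5, gamma_b=1.0)
print(f(x_final))
```

The sketch requires f(x^*) to be known, as the step-size formula does. On this toy problem the value is 0 by construction; in general it would have to be supplied or estimated separately.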

Background

The paper proposes Polyak-type adaptive step-sizes for the Stochastic Heavy Ball method and provides convergence guarantees in various regimes. In the deterministic setting, for strongly convex quadratic objectives, classical Heavy Ball has known optimal momentum and step-size choices, which yield acceleration.
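For reference, the classical optimal choices in the strongly convex quadratic case can be written as follows. This is the standard textbook statement for constant-step Heavy Ball, recalled here as background rather than quoted from the paper; the symbols α (constant step-size) and μ (smallest Hessian eigenvalue) are introduced for this block only.

```latex
% Classical Heavy Ball: x^{t+1} = x^t - \alpha \nabla f(x^t) + \beta (x^t - x^{t-1}),
% where f is a quadratic whose Hessian has eigenvalues in [\mu, L], 0 < \mu \le L.
\alpha^\star = \frac{4}{\left(\sqrt{L} + \sqrt{\mu}\right)^2},
\qquad
\beta^\star = \left( \frac{\sqrt{L} - \sqrt{\mu}}{\sqrt{L} + \sqrt{\mu}} \right)^2 .
```

As μ → 0 the formula gives β^⋆ → 1, so it does not by itself supply a usable rule when strong convexity is absent.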

In the supplemental 'Extra Deterministic Experiment' on logistic regression (a general convex problem), the authors note that they resorted to grid search for the momentum parameter because no optimal choice is available for this broader convex class when using the MomPS_max step-size. This highlights a concrete unresolved question about principled momentum tuning beyond the strongly convex quadratic case.
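A grid search of the kind described can be sketched as below. This is not the authors' experimental code: it swaps logistic regression for a small quadratic with a flat direction (convex and smooth but not strongly convex) so that f(x^*) = 0 is known exactly, and the grid, iteration budget, and γ_b are arbitrary illustrative choices.

```python
from typing import Dict, List, Sequence, Tuple


# Toy objective (hypothetical): f(x) = 0.5 * ((x1 - 1)^2 + 10 * (x2 + 2)^2) on R^3.
# x3 does not appear, so f is convex and 10-smooth but not strongly convex; f(x^*) = 0.
def f(x: List[float]) -> float:
    return 0.5 * ((x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2)


def grad(x: List[float]) -> List[float]:
    return [x[0] - 1.0, 10.0 * (x[1] + 2.0), 0.0]


def final_gap(beta: float, gamma_b: float = 1.0, n_iters: int = 100,
              f_star: float = 0.0) -> float:
    """Run Heavy Ball with the MomPS_max step-size and return f(x^T) - f(x^*)."""
    x_prev = [0.0, 0.0, 0.0]  # assumed convention: x^{-1} = x^0
    x = [0.0, 0.0, 0.0]
    for _ in range(n_iters):
        g = grad(x)
        g_sq = sum(gi * gi for gi in g)
        if g_sq == 0.0:  # exact stationary point: nothing left to do
            break
        gamma = (1.0 - beta) * min((f(x) - f_star) / g_sq, gamma_b)
        x_next = [xi - gamma * gi + beta * (xi - pi) for xi, gi, pi in zip(x, g, x_prev)]
        x_prev, x = x, x_next
    return f(x) - f_star


def grid_search_beta(betas: Sequence[float]) -> Tuple[float, Dict[float, float]]:
    """Return the beta with the smallest final gap, plus all recorded gaps."""
    gaps = {beta: final_gap(beta) for beta in betas}
    best_beta = min(gaps, key=gaps.get)
    return best_beta, gaps


betas = [0.0, 0.2, 0.4, 0.6, 0.8]
best_beta, gaps = grid_search_beta(betas)
for beta in betas:
    print(f"beta={beta:.1f}  final gap={gaps[beta]:.3e}")
print("best beta on this toy problem:", best_beta)
```

The selected β depends on the objective, the starting point, and the iteration budget, which is exactly the data dependence that a principled selection rule would remove.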

References

We have performed a grid search to find the best β for MomPS_max since no optimal choices for β are known for the general convex case.

Stochastic Polyak Step-sizes and Momentum: Convergence Guarantees and Practical Performance (2406.04142 - Oikonomou et al., 6 Jun 2024) in Supplementary Material, Section “Additional Experiments”, Subsection “Extra Deterministic Experiment”