Optimal Halpern Method (OHM) Overview

Updated 20 November 2025
  • Optimal Halpern Method (OHM) is a family of parameter-free iterative algorithms for nonexpansive and monotone operator problems with provably near-optimal convergence rates.
  • It leverages Halpern fixed-point theory with prescribed and adaptive weight schedules to achieve tight rates, matching lower bounds up to logarithmic factors.
  • OHM extends to variational inequalities, operator equations, and optimal transport, offering robust performance in high-dimensional optimization and accelerated numerical schemes.

The Optimal Halpern Method (OHM) is a family of parameter-free, theoretically optimal iterative algorithms for monotone inclusion, variational inequalities, operator equations, saddle-point problems, and optimal transport, unified by their origins in Halpern fixed-point theory. OHM achieves near-optimal rates in operator norm reduction or fixed-point residual, matching known lower bounds up to logarithmic factors. It is positioned at the intersection of nonexpansive operator theory, monotone operator splitting, and modern optimization, operating in Hilbert (and occasionally Banach) spaces, with explicit convergence guarantees, robustness to inexactness, and deep ties to accelerated first-order methods such as Nesterov acceleration.

1. Foundations: Halpern Iteration and Its Parameter Schedules

The classical Halpern iteration for a nonexpansive mapping $T: \mathcal{H} \to \mathcal{H}$ on a (real) Hilbert space $\mathcal{H}$ is given by

$$x_{k+1} = \lambda_k u + (1 - \lambda_k)\, T(x_k),$$

where $u$ is a fixed anchor and $(\lambda_k)$ is a sequence in $(0,1)$ with $\lambda_k \to 0$.

OHM distinguishes itself by prescribing the weight schedule $\lambda_k = 1/(k+2)$ (or variants), which is optimal for the worst-case decay of the fixed-point residual $\|x_k - T(x_k)\|$ (He et al., 16 May 2025, Tran-Dinh, 2022). The method can be further generalized using adaptive weights computed at each step,

$$\alpha_k = \frac{1}{\varphi_k + 1}, \qquad \varphi_k = 1 + \frac{2\langle x^{k-1} - T x^{k-1},\, u - x^{k-1}\rangle}{\|x^{k-1} - T x^{k-1}\|^2},$$

yielding potentially faster convergence in practice (He et al., 16 May 2025).
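
A minimal sketch of both schedules for a generic nonexpansive map on $\mathbb{R}^n$; the operator `T`, the anchor, and all numerical values in the example are illustrative choices, not taken from the cited papers:

```python
import numpy as np

def halpern(T, u, num_iters=1000, adaptive=False):
    """Halpern iteration x_{k+1} = a_k * u + (1 - a_k) * T(x_k).

    With adaptive=False, uses the prescribed optimal schedule a_k = 1/(k+2);
    with adaptive=True, uses the inner-product rule above (no safeguard on
    the adaptive weight is attempted in this sketch).
    """
    x = u.copy()
    for k in range(num_iters):
        Tx = T(x)
        r = x - Tx                         # fixed-point residual x_k - T(x_k)
        res_sq = float(np.dot(r, r))
        if res_sq == 0.0:                  # exact fixed point reached
            return x
        if adaptive:
            phi = 1.0 + 2.0 * np.dot(r, u - x) / res_sq
            a = 1.0 / (phi + 1.0)
        else:
            a = 1.0 / (k + 2)
        x = a * u + (1.0 - a) * Tx
    return x

# Example: T averages two projections (onto {x <= 1} and {x >= -1}),
# hence is nonexpansive with Fix(T) = [-1, 1]^n.
T = lambda x: 0.5 * (np.minimum(x, 1.0) + np.maximum(x, -1.0))
x_fix = halpern(T, u=np.array([5.0, -7.0]))
```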

OHM achieves the tight rate

$$\|x_k - T x_k\| \leq \frac{2}{k+1}\, \|u - x^*\|$$

for any fixed point $x^* \in \operatorname{Fix}(T)$, which is unimprovable for general nonexpansive $T$ (He et al., 16 May 2025, Tran-Dinh, 2022, Cheval et al., 2023).

2. OHM for Monotone Operator Equations and Variational Inequalities

OHM extends naturally to monotone inclusion problems

$$0 \in F(u) + \partial I_U(u)$$

for monotone, Lipschitz $F: U \to E$, with $U$ convex and closed in a Hilbert space. Utilizing the fact that the proximal residual $P(u) = u - J_{F+\partial I_U}(u)$ is $1/2$-cocoercive, OHM applies the update

$$u_{k+1} = \lambda_{k+1} u_0 + (1 - \lambda_{k+1})\,(u_k - P(u_k))$$

with $\lambda_k = 1/(k+1)$ (Diakonikolas, 2020).

When $F$ is $1/L$-cocoercive, the explicit update simplifies to

$$u_{k+1} = \lambda_{k+1} u_0 + (1 - \lambda_{k+1}) \left( u_k - \frac{2}{L} F(u_k) \right),$$

which produces the operator residual bound $\|F(u_k)\| = O(L \|u_0 - u^*\|/k)$, with parameter-freeness ensured by an online Lipschitz estimation/doubling procedure (Diakonikolas, 2020).
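
A minimal sketch of this explicit update, assuming $L$ is known in advance (the cited work removes this assumption with the online doubling procedure); the quadratic test operator is an illustrative choice:

```python
import numpy as np

def ohm_cocoercive(F, u0, L, num_iters=1000):
    """OHM for a (1/L)-cocoercive operator F, driving F(u_k) toward 0."""
    u = u0.copy()
    for k in range(num_iters):
        lam = 1.0 / (k + 2)                # lambda_{k+1} = 1/(k+2)
        u = lam * u0 + (1.0 - lam) * (u - (2.0 / L) * F(u))
    return u

# Example: F is the gradient of a smooth convex quadratic; by the
# Baillon-Haddad theorem it is (1/L)-cocoercive with L = lambda_max(A).
A = np.diag([1.0, 10.0])
u = ohm_cocoercive(lambda v: A @ v, u0=np.array([3.0, -2.0]), L=10.0)
```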

OHM, when combined with extragradient or resolvent-based approximations, provides guarantees for strong Stampacchia solutions to variational inequalities, as small residuals imply primal-dual optimality gaps (Diakonikolas, 2020).

3. Rate Guarantees, Oracle Complexity, and Acceleration

OHM achieves convergence rates that are (up to logarithmic factors) optimal in the black-box model for operator equations:

  • For $1/L$-cocoercive $F$: $\|F(u_k)\| = O(L \|u_0 - u^*\|/k)$, with $O(L \|u_0 - u^*\|/\epsilon)$ total oracle calls.
  • In the monotone + Lipschitz case: $\widetilde{O}((L+1)\|u_0 - u^*\|/\epsilon)$ total calls.
  • In the strongly monotone case: complexity $\widetilde{O}((L/m)\log(1/\epsilon))$ using a logarithmic-restart schedule (see the sketch after this list).
  • For stochastic monotone problems, variance-reduced OHM variants attain $O(1/\epsilon^3)$ stochastic oracle calls in general and $O(\log(1/\epsilon)/\epsilon^2)$ under strong monotonicity (Cai et al., 2022).
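
To illustrate the restart idea, the following hypothetical sketch re-anchors whenever the residual norm halves; for simplicity it assumes $F$ is also $1/L$-cocoercive so the explicit forward update applies, and the halving rule stands in for the logarithmic schedule of the cited analysis:

```python
import numpy as np

def ohm_restarted(F, u0, L, eps, max_epochs=100):
    """Restarted OHM sketch for strongly monotone F (assumed here to be
    (1/L)-cocoercive as well). Each epoch re-anchors at the current
    iterate and runs anchored steps until the residual halves, giving
    O(log(1/eps)) epochs under strong monotonicity."""
    u = u0.copy()
    for _ in range(max_epochs):
        res = np.linalg.norm(F(u))
        if res <= eps:
            break
        anchor, goal, k = u.copy(), 0.5 * res, 0
        while np.linalg.norm(F(u)) > max(goal, eps):
            lam = 1.0 / (k + 2)
            u = lam * anchor + (1.0 - lam) * (u - (2.0 / L) * F(u))
            k += 1
    return u

# Example: F(u) = A u with A positive definite, hence strongly monotone.
A = np.array([[2.0, 0.3], [0.3, 1.0]])
u = ohm_restarted(lambda v: A @ v, np.array([4.0, -3.0]), L=2.2, eps=1e-8)
```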

OHM's Lyapunov analysis shows that, for $1/L$-cocoercive maps $G$,

$$\|G y_k\|^2 \leq \frac{4L^2\, \|y_0 - y^*\|^2}{(k+1)(k+3)},$$

which corresponds exactly to the $O(1/k)$ rate in the residual and $O(1/k^2)$ decay in the squared norm (Tran-Dinh, 2022).
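
The bound is easy to check numerically; the diagonal quadratic below (with $y^* = 0$) is an illustrative choice of ours, not an example from the cited paper:

```python
import numpy as np

# G = A y is the gradient of a convex quadratic, hence (1/L)-cocoercive
# with L = lambda_max(A); the unique zero is y* = 0.
A = np.diag([0.5, 3.0])
G = lambda y: A @ y
L, y0 = 3.0, np.array([2.0, -1.0])

y = y0.copy()
for k in range(200):
    beta = 1.0 / (k + 2)                              # OHM anchor weight
    y = beta * y0 + (1.0 - beta) * (y - (2.0 / L) * G(y))
    # Lyapunov bound for iterate y_{k+1}: 4 L^2 ||y0||^2 / ((k+2)(k+4)).
    bound = 4.0 * L**2 * np.dot(y0, y0) / ((k + 2) * (k + 4))
    assert np.dot(G(y), G(y)) <= bound + 1e-12
```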

Optimality is certified by lower-bound constructions in variational inequality and saddle-point settings (Diakonikolas, 2020, Tran-Dinh, 2022).

4. Methodological Connections and Extensions

The Optimal Halpern Method admits a direct equivalence to Nesterov's acceleration for monotone operator problems when the underlying operator is cocoercive; through a change of variables, the Halpern iteration becomes a momentum-based scheme with calibrated parameter choices,

$$y_{k+1} = \beta_k y_0 + (1-\beta_k)\, y_k - \eta_k G y_k, \qquad \beta_k = \frac{1}{k+2}, \quad \eta_k = \frac{2(1-\beta_k)}{L},$$

and recovers the sharp $O(1/k^2)$ Lyapunov rate for the squared residual (Tran-Dinh, 2022, Tran-Dinh et al., 2021).
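
The change of variables can be made explicit: eliminating the anchor $y_0$ from two consecutive anchored steps gives a momentum recursion with coefficient $k/(k+2)$ plus a past-gradient correction. The rewriting below is our own algebra from the update above, and the numerical check confirms the two forms generate identical iterates on an illustrative quadratic:

```python
import numpy as np

def anchored(G, y0, L, n):
    """Anchored form: y_{k+1} = beta_k y0 + (1-beta_k) y_k - eta_k G(y_k)."""
    y = y0.copy()
    for k in range(n):
        beta = 1.0 / (k + 2)
        eta = 2.0 * (1.0 - beta) / L
        y = beta * y0 + (1.0 - beta) * y - eta * G(y)
    return y

def momentum(G, y0, L, n):
    """Equivalent momentum form (derived by eliminating y0):
    y_{k+1} = y_k + k/(k+2) (y_k - y_{k-1})
              - eta_k G(y_k) + (k+1)/(k+2) eta_{k-1} G(y_{k-1})."""
    y_prev, y = y0.copy(), y0 - (1.0 / L) * G(y0)     # first step, k = 0
    for k in range(1, n):
        eta = 2.0 * (1.0 - 1.0 / (k + 2)) / L
        eta_prev = 2.0 * (1.0 - 1.0 / (k + 1)) / L
        y_next = (y + (k / (k + 2)) * (y - y_prev)
                  - eta * G(y) + ((k + 1) / (k + 2)) * eta_prev * G(y_prev))
        y_prev, y = y, y_next
    return y

A = np.diag([1.0, 4.0])            # G = A y is (1/4)-cocoercive, L = 4
G = lambda y: A @ y
y0 = np.array([1.0, -1.0])
assert np.allclose(anchored(G, y0, 4.0, 50), momentum(G, y0, 4.0, 50))
```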

OHM generalizes to composite and splitting scenarios, e.g., the forward–backward or Douglas–Rachford algorithms for sums of maximally monotone operators and $L$-Lipschitz maps, via the residual operator

$$G_\gamma(x) = \frac{1}{\gamma}\left(x - J_{\gamma A}(x - \gamma B(x))\right).$$

Accelerated Halpern-anchored splits achieve $O(1/k)$ last-iterate residual decay under only maximal monotonicity, with Popov-like and ADR variants reducing oracle or resolvent calls per iteration (Tran-Dinh et al., 2021).
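
A minimal sketch of a Halpern-anchored forward–backward iteration: each anchored update is applied to the map $x \mapsto x - \gamma G_\gamma(x)$, which is exactly a forward–backward step $J_{\gamma A}(x - \gamma B(x))$. The LASSO-style example, step size, and ℓ₁ weight are illustrative assumptions:

```python
import numpy as np

def halpern_fb(prox, B, x0, gamma, num_iters=2000):
    """Halpern-anchored forward-backward splitting (sketch).

    prox implements J_{gamma A}; each iteration anchors a plain
    forward-backward step with weight lambda_k = 1/(k+2).
    """
    x = x0.copy()
    for k in range(num_iters):
        lam = 1.0 / (k + 2)
        x = lam * x0 + (1.0 - lam) * prox(x - gamma * B(x))
    return x

# Example: 0 in A(x) + B(x) with A = tau * subdiff(||.||_1) (prox = soft
# threshold) and B = grad of 0.5 ||M x - b||^2; gamma <= 1/||M^T M|| assumed.
M = np.array([[1.0, 0.5], [0.2, 2.0]])
b = np.array([1.0, -1.0])
gamma, tau = 0.2, 0.1
prox = lambda z: np.sign(z) * np.maximum(np.abs(z) - gamma * tau, 0.0)
B = lambda x: M.T @ (M @ x - b)
x = halpern_fb(prox, B, np.zeros(2), gamma)
```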

Inexactness is handled robustly by Halpern-accelerated inexact Proximal Point Methods (HiPPM), allowing summable error tolerances and retaining optimal sublinear or linear rates under strong monotonicity (Zhang et al., 13 Nov 2025).
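
In skeleton form, the HiPPM idea reads as below; `approx_resolvent`, its signature, and the $1/k^2$ tolerance schedule are hypothetical placeholders for exposition, not the interface of the cited work:

```python
def hippm(approx_resolvent, z0, num_iters):
    """Halpern-anchored inexact proximal point method (sketch).

    approx_resolvent(z, tol) is assumed to return an approximation of
    the resolvent J_{cF}(z) with error at most tol; summable tolerances
    (illustratively tol_k ~ 1/k^2) preserve the optimal rates.
    """
    z = z0
    for k in range(num_iters):
        lam = 1.0 / (k + 2)
        z_bar = approx_resolvent(z, tol=1.0 / (k + 1) ** 2)  # inexact step
        z = lam * z0 + (1.0 - lam) * z_bar
    return z
```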

5. Practical Implementations and Complexity in Applications

OHM has been implemented and extensively tested across several large-scale optimization contexts:

  • In high-dimensional LASSO, adaptive anchoring reduces total iterations and compute time over standard Halpern iteration by a factor of 5 or more (He et al., 16 May 2025).
  • In discrete optimal transport with squared-ℓ₂ ground cost on m×n grids, the HOT algorithm combines Halpern-accelerated ADMM with direct $O(M)$ linear system solvers per iteration, achieving ε-accuracy in $O(M^{1.5}/\varepsilon)$ flops. Key steps include block Gaussian elimination, Sherman-Morrison-Woodbury inversion for reduced LPs, and greedy recovery of the primal transport plan (Zhang et al., 1 Aug 2024). This improves the best known complexity bounds for regularized or unregularized OT solvers in this setting.

OHM's insensitivity to parameter specification and robustness to inexact subproblem solutions (including extragradient and mini-batch stochastic settings) have been repeatedly emphasized as central to its practical performance and theoretical guarantees (Diakonikolas, 2020, Zhang et al., 13 Nov 2025, Zhang et al., 1 Aug 2024).

6. Summary Table: OHM Algorithmic Core and Rates

| Algorithmic Scenario | Update Form | Rate / Complexity |
| --- | --- | --- |
| Nonexpansive fixed point | $x_{k+1} = \lambda_k u + (1-\lambda_k)T(x_k)$ | $\|x_k - T x_k\| = O(1/k)$ [tight] |
| Cocoercive ($1/L$) | $x_{k+1} = \lambda_k u + (1-\lambda_k)(x_k - \frac{2}{L}F(x_k))$ | $\|F(x_k)\| = O(L\|u-u^*\|/k)$; $\widetilde O(L/\epsilon)$ calls |
| Monotone + Lipschitz | $x_{k+1} = \lambda_k u + (1-\lambda_k)(x_k - P(x_k))$ | $\widetilde O((L+1)/\epsilon)$ oracle calls |
| Stochastic, variance-reduced | As above + PAGE estimator, restarts | $O(1/\epsilon^3)$ calls; $O(\log(1/\epsilon)/\epsilon^2)$ under sharpness |
| Inexact PPM / augmented Lagrangian | $z^{k+1} = \lambda_k u + (1-\lambda_k)\bar z^k$ | $O(1/k^2)$ (squared residual); linear under strong monotonicity |
| Discrete OT ("HOT") | Halpern–ADMM splitting on reduced-dual model | $O(M^{1.5}/\varepsilon)$ flops (Zhang et al., 1 Aug 2024) |

Parameter choice for the weights: typically $\lambda_k = 1/(k+2)$, or adaptive via the inner-product-dependent rule of Section 1.

7. Theoretical Significance and Future Prospects

The OHM captures the best possible (i.e., tight) rates for fixed-point residuals or operator norm decay in monotone inclusion, saddle-point, and variational inequality settings, with or without strong monotonicity, regularity, or stochasticity. Its equivalence to one-step acceleration, in contrast to momentum-based approaches, offers new perspectives for first-order optimization, monotone operator theory, and splitting schemes.

Current research extends OHM to adaptive anchoring, stochastic frameworks, inexact oracles, variable-metric spaces, and application-specific structure (e.g., optimal transport, regularized learning). Open questions include extending adaptive variants to broader Banach or hyperbolic settings, exploiting finer local regularity, and systematically deriving accelerated splitting algorithms beyond the Hilbert setting (He et al., 16 May 2025, Cheval et al., 2023, Zhang et al., 13 Nov 2025).

OHM thus serves both as a universal meta-algorithm for nonexpansive and monotone operator problems and as a concrete tool for designing optimal, parameter-free iterative solvers in advanced convex optimization and variational analysis (Diakonikolas, 2020, Tran-Dinh, 2022).
