
Halpern's Method with Optimal Parameters

Updated 20 November 2025
  • Halpern's method with optimal parameters is a fixed-point iteration technique achieving O(1/k²) minimax convergence for nonexpansive mappings in Hilbert and Banach spaces.
  • It employs explicit parameter schedules, both deterministic (e.g., 1/(k+1)) and adaptive variants, to rigorously optimize convergence rates across various applications.
  • Empirical tests in convex optimization, random LASSO, and image deblurring validate its theoretical guarantees and showcase significant acceleration in practical scenarios.

Halpern's method with optimal parameters refers to a class of first-order fixed-point algorithms for nonexpansive mappings that achieve the sharp or minimax convergence rate for finding fixed points in Hilbert and Banach spaces. The method’s optimality has been rigorously established through both explicit parameter choices and a general algebraic framework that characterizes all step-size schedules yielding the same best-possible rate. Adaptive and deterministic (predetermined) variants have found application in convex optimization, Markov decision processes (MDPs), signal recovery, and image processing.

1. Classical Halpern Iteration and Motivation

Given a real Hilbert space $\mathcal{H}$, consider a nonexpansive mapping $T:\mathcal{H}\to\mathcal{H}$, i.e., $\|Tx-Ty\| \le \|x-y\|$ for all $x, y \in \mathcal{H}$. The fixed-point set is $\operatorname{Fix}(T) = \{x : Tx = x\}$, assumed nonempty. The original Halpern iteration, introduced in 1967, defines the sequence
$$x^{k+1} = \alpha_k u + (1 - \alpha_k)\, T x^k, \qquad k=0,1,2,\dots$$
where $u\in\mathcal{H}$ is a fixed “anchor”, $x^0$ is arbitrary, and the weights $\{\alpha_k\}$, referred to as anchoring parameters, satisfy $\alpha_k\to 0$, $\sum_k \alpha_k = \infty$, and certain regularity conditions for strong convergence. For the “open-loop” case, a canonical optimal deterministic schedule is $\alpha_k = 1/(k+1)$ (He et al., 16 May 2025; Yoon et al., 18 Nov 2025).
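As a concrete reference point, the following is a minimal numerical sketch of the classical Halpern iteration. The rotation map, anchor, starting point, and iteration count are hypothetical illustrative choices, not taken from the cited papers.

```python
import numpy as np

def halpern(T, u, x0, num_iters, alpha):
    """Classical Halpern iteration: x^{k+1} = alpha_k * u + (1 - alpha_k) * T(x^k)."""
    x = x0.copy()
    for k in range(num_iters):
        a = alpha(k)
        x = a * u + (1.0 - a) * T(x)
    return x

# Hypothetical nonexpansive map: a 90-degree rotation, whose only fixed point is the origin.
R = np.array([[0.0, -1.0], [1.0, 0.0]])
T = lambda x: R @ x

# The canonical 1/(k+1) schedule, written as 1/(k+2) because the loop index starts at k = 0.
x = halpern(T, u=np.array([1.0, 1.0]), x0=np.array([5.0, -3.0]),
            num_iters=5000, alpha=lambda k: 1.0 / (k + 2))
print(np.linalg.norm(x - T(x)))   # fixed-point residual ||x - Tx|| decreases toward 0
```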

2. Halpern's Method with Optimally Tuned Parameters

The optimal deterministic instance of Halpern's method, sometimes called the Optimally-tuned Halpern Method (OHM, Editor's term), selects

$$y_{k+1} = \frac{1}{k+2}\, y_0 + \frac{k+1}{k+2}\, T y_k, \qquad k = 0,1,2,\dots$$

This schedule achieves the exact worst-case rate for nonexpansive $T$: for any initial $y_0$ and fixed point $y_\star$,

$$\|y_{N-1} - T y_{N-1}\|^2 \le \frac{4 \|y_0 - y_\star\|^2}{N^2},$$

i.e., $O(1/k^2)$ nonasymptotically. This rate is minimax optimal over all fixed-step Halpern-type iterations (Yoon et al., 18 Nov 2025). The so-called H-dual algorithm, defined by

$$y_{k+1} = y_k + \frac{N-(k+1)}{N-k}\big(Ty_k - Ty_{k-1}\big), \qquad y_{-1} = y_0,$$

is the anti-diagonal transpose of OHM in the underlying algebraic representation and achieves the same rate.
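The OHM recursion and the bound above can be checked numerically. Below is a minimal sketch of OHM applied to a hypothetical nonexpansive map (a plane rotation, whose unique fixed point is the origin); the map, dimensions, and horizon $N$ are illustrative choices, not taken from the cited papers.

```python
import numpy as np

def ohm(T, y0, N):
    """OHM: y_{k+1} = y_0/(k+2) + (k+1)/(k+2) * T(y_k), run until y_{N-1}."""
    y = y0.copy()
    for k in range(N - 1):
        y = y0 / (k + 2) + (k + 1) / (k + 2) * T(y)
    return y

# Hypothetical nonexpansive map: a plane rotation (an isometry) with Fix(T) = {0}.
theta = 1.0
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
T = lambda x: R @ x

y0, N = np.array([3.0, 4.0]), 200
y = ohm(T, y0, N)
res_sq = np.linalg.norm(y - T(y)) ** 2
bound = 4 * np.linalg.norm(y0) ** 2 / N ** 2   # here y_star = 0
print(res_sq, bound, res_sq <= bound)          # the residual should satisfy the bound
```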

3. H-matrix Formalism, Invariants, and the Complete Optimal Family

All such optimal fixed-point methods can be embedded in a lower-triangular matrix algebra. Any fixed-step first-order iteration with $N-1$ oracle calls can be written as
$$y_{k+1} = y_k - \sum_{j=0}^{k} h_{k+1,j+1}\,(y_j - T y_j)$$
for explicit scalars $h_{k,j}$ forming a lower-triangular $H$-matrix. The H-invariants are specific symmetric homogeneous polynomials in the entries of $H$:
$$P(N-1, m; H) = \frac{1}{N}\binom{N}{m+1}, \qquad m=1,\dots,N-1.$$
Any $H$-matrix satisfying these values, together with nonnegativity of certain H-certificates (dual multipliers arising from a sum-of-squares characterization), defines a method with the exact $O(1/k^2)$ minimax convergence. Both OHM and its H-dual arise as extreme points, and every choice of (top vs. bottom) certificate sparsity yields an explicit, optimal algorithm. Hence, Halpern's method with optimal parameters constitutes a particular member of an infinite family of extremal rate-attaining algorithms (Yoon et al., 18 Nov 2025).
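To make the embedding concrete, the following sketch numerically recovers the $H$-matrix of OHM by unrolling its recursion into the residual form above (substituting $Ty_k = y_k - (y_k - Ty_k)$ and tracking the coefficient of each residual in $y_k$). The coefficient recursion is derived here for illustration and is not quoted from the paper.

```python
import numpy as np

def ohm_h_matrix(N):
    """Recover H for OHM from y_{k+1} = y_k - sum_j H[k, j] * (y_j - T y_j).

    We track y_k = y_0 - sum_j c[k, j] * r_j with r_j = y_j - T y_j; substituting
    T y_k = y_k - r_k into the OHM update y_{k+1} = y_0/(k+2) + (k+1)/(k+2) * T y_k
    gives the coefficient recursion used below.
    """
    H = np.zeros((N - 1, N - 1))   # row k holds h_{k+1, j+1} for j = 0..k
    c = np.zeros((N, N - 1))       # c[k, j]: weight of r_j in the expansion of y_0 - y_k
    for k in range(N - 1):
        c[k + 1, :k] = (k + 1) / (k + 2) * c[k, :k]
        c[k + 1, k] = (k + 1) / (k + 2)
        H[k, : k + 1] = c[k + 1, : k + 1] - c[k, : k + 1]
    return H

print(np.round(ohm_h_matrix(5), 4))   # lower-triangular H-matrix of OHM for N = 5
```

Under this unrolling the diagonal entries come out as $(k+1)/(k+2)$ and the strictly lower-triangular entries are small negative corrections; these values are produced by the derivation above, not quoted from the paper.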

4. Adaptive Parameter Selection and Enhanced Convergence

An adaptive rule for the anchoring parameters was introduced by He–Xu–Dong–Mei, where, for $k \ge 1$,
$$x^k = \frac{1}{\varphi_k + 1}\, x^0 + \frac{\varphi_k}{\varphi_k + 1}\, T x^{k-1},\qquad \varphi_k = 1 + \frac{2\langle x^{k-1} - T x^{k-1},\, x^0 - x^{k-1}\rangle}{\|x^{k-1}-T x^{k-1}\|^2},$$
and $\alpha_k = 1/(\varphi_k+1)$ satisfies $\alpha_k \le 1/(k+1)$. This construction enforces the key identity

$$\|x^{k-1} - T x^{k-1}\|^2 = \frac{2}{\varphi_k+1}\,\langle x^{k-1} - T x^{k-1},\, x^0 - T x^{k-1}\rangle,$$

which underpins the convergence proof. This adaptive scheme converges strongly under both $\sum_k \alpha_k = \infty$ and $\sum_k \alpha_k < \infty$, in contrast to the open-loop case, and often achieves faster decay: empirically $\varphi_k = O(k^2)$, yielding practical rates significantly better than $O(1/k)$ (He et al., 16 May 2025).
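A direct transcription of this adaptive rule, with a hypothetical nonexpansive map as the test case and a small guard against division by a vanishing residual, might look as follows.

```python
import numpy as np

def adaptive_halpern(T, x0, num_iters, tol=1e-12):
    """Adaptive Halpern: x^k = x^0/(phi_k+1) + phi_k/(phi_k+1) * T(x^{k-1}),
    phi_k = 1 + 2 <x^{k-1} - T x^{k-1}, x^0 - x^{k-1}> / ||x^{k-1} - T x^{k-1}||^2."""
    x = x0.copy()
    for _ in range(num_iters):
        Tx = T(x)
        r = x - Tx
        r2 = np.dot(r, r)
        if r2 < tol ** 2:                 # numerically at a fixed point; stop
            break
        phi = 1.0 + 2.0 * np.dot(r, x0 - x) / r2
        x = x0 / (phi + 1.0) + phi / (phi + 1.0) * Tx
    return x

# Hypothetical nonexpansive map: shrink the second coordinate; Fix(T) = {(t, 0)}.
T = lambda x: np.array([x[0], 0.5 * x[1]])
x = adaptive_halpern(T, x0=np.array([2.0, 7.0]), num_iters=200)
print(x, np.linalg.norm(x - T(x)))        # residual ||x - Tx|| at termination
```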

5. Applications and Numerical Performance

The optimal and adaptive Halpern variants have seen application in convex optimization, random LASSO, and $\ell_1$-regularized signal recovery. In image deblurring and random LASSO, the adaptive Halpern iteration outpaces the classical schedule by a significant factor. For instance, with stopping tolerance $\|x^k-Tx^k\|<10^{-4}$ on LASSO, the adaptive method completes in an order of magnitude fewer iterations and CPU seconds than the classical choice. Empirically, the adaptive parameter $\varphi_k$ grows super-linearly, often quadratically, leading to practical acceleration (He et al., 16 May 2025).

| Problem | Adaptive (Alg. 3.1) | Halpern $(1/(k+1))$ |
| --- | --- | --- |
| LASSO (120, 512, 20) | 4,245 iters, 1.62 s | 48,256 iters, 17.82 s |

This suggests substantial gains in practical convergence arising from adaptivity.
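As an illustration of how such problems fit the fixed-point template, the sketch below builds a forward-backward (proximal-gradient) operator for LASSO, whose fixed points are LASSO solutions and which is nonexpansive for a sufficiently small step size. The problem sizes merely echo the (120, 512, 20) instance; all data, the regularization weight, and the step size are hypothetical.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def lasso_operator(A, b, lam, t):
    """Forward-backward map T(x) = prox_{t*lam*||.||_1}(x - t * A^T (A x - b)).
    Fixed points of T solve the LASSO problem; T is nonexpansive for t <= 2/||A||_2^2."""
    return lambda x: soft_threshold(x - t * A.T @ (A @ x - b), t * lam)

rng = np.random.default_rng(0)
m, n = 120, 512
A, b, lam = rng.standard_normal((m, n)), rng.standard_normal(m), 0.1
t = 1.0 / np.linalg.norm(A, 2) ** 2
T = lasso_operator(A, b, lam, t)
# e.g. run adaptive_halpern(T, x0=np.zeros(n), num_iters=5000) from the sketch above
```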

Beyond optimization, Halpern's optimally-anchored iteration has recently been leveraged in model-free average-reward MDPs. There, the iteration

$$Q^{k+1} = (1-\beta_{k+1})\, Q^0 + \beta_{k+1}\Big[r + P\big(\textstyle\max_a Q^k\big)\Big],\qquad \beta_k = \frac{k}{k+2},$$

yields, through recursive sampling and careful residual control, a sample complexity of $\widetilde O\big(|S||A|\,\|h^*\|_{\mathrm{sp}}^2/\epsilon^2\big)$, matching information-theoretic lower bounds up to a factor of $\|h^*\|_{\mathrm{sp}}$ and guaranteeing finite termination without requiring prior knowledge of problem-dependent parameters (Lee et al., 6 Feb 2025).
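The sketch below runs the anchored recursion itself with a known transition kernel on a tiny random MDP. The cited work uses recursive sampling rather than the exact kernel, so this model-based version, the problem sizes, and the error measure printed at the end are illustrative assumptions.

```python
import numpy as np

def anchored_q_iteration(P, r, num_iters):
    """Anchored (Halpern-type) Q-iteration with the k/(k+2) anchoring schedule.

    P: transition tensor of shape (S, A, S); r: reward table of shape (S, A).
    Known-model illustration only; the sample-based algorithm replaces P with
    recursively sampled transitions.
    """
    Q0 = np.zeros_like(r)
    Q = Q0.copy()
    for k in range(1, num_iters + 1):
        beta = k / (k + 2)
        V = Q.max(axis=1)                       # greedy values max_a Q(s, a)
        Q = (1.0 - beta) * Q0 + beta * (r + P @ V)
    return Q

# Tiny random MDP as a hypothetical test case.
rng = np.random.default_rng(1)
S, A = 4, 2
P = rng.random((S, A, S)); P /= P.sum(axis=2, keepdims=True)
r = rng.random((S, A))
Q = anchored_q_iteration(P, r, 2000)
resid = Q - (r + P @ Q.max(axis=1))
print(resid.max() - resid.min())   # span seminorm of the Bellman residual, the natural
                                   # error measure in the average-reward setting
```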

6. Core Theoretical Guarantees and Proof Structure

For nonexpansive $T$ in Hilbert space:

  • Strong convergence: For predetermined or adaptive optimal parameters, $x^k \to x^*$ strongly for some $x^* \in \operatorname{Fix}(T)$.
  • Minimax optimal rate: For deterministic optimal parameters, $\|y_{N-1} - Ty_{N-1}\|^2 \le 4 \|y_0 - y_\star\|^2 / N^2$, i.e., $O(1/k^2)$.
  • Asymptotic regularity: For the adaptive scheme, $\|x^k - T x^k\| \le 2\|x^0 - x^*\|/(\varphi_k+1)$, with potential superlinear speed due to the fast growth of $\varphi_k$.
  • Sum-of-squares certificate: The sharp rate is certified by H-invariants and nonnegativity of explicit dual variables (H-certificates). Only such algorithms with these algebraic properties attain the rate, providing a complete characterization (Yoon et al., 18 Nov 2025).

The proof in both cases exploits Fejér monotonicity, careful induction on auxiliary quantities, telescoping series, demiclosedness arguments, and, in the algebraic setting, an SOS (sum-of-squares) decomposition encoding optimality.

7. Significance, Limitations, and Extensions

Halpern's method with optimal parameters represents the extremal convergence regime for fixed-point iterations under nonexpansiveness, generalizing to a rich family of methods classified via the H-invariant/SOS formalism. The adaptive scheme removes the requirement that the parameter sum diverge and often achieves superlinear practical decay, especially in ill-conditioned or highly structured problems.

A plausible implication is that any further improvement in nonexpansive fixed-point problems must come from moving beyond the first-order framework or by incorporating stronger problem structure. Recent advances show Halpern-type iterations forming the “rate backbone” for high-performance algorithms in MDPs and composite optimization (Lee et al., 6 Feb 2025, Yoon et al., 18 Nov 2025, He et al., 16 May 2025). The theory elucidates both the limitations and full expressive power of step-size scheduling in the nonexpansive regime.
