Fixed-Point Acceleration Algorithms
- Fixed-Point Acceleration Algorithms are methods that enhance the convergence speed of iterative fixed-point computations by leveraging residuals and historical iterates.
- They employ techniques such as Anderson acceleration, Aitken’s Δ², and quasi-Newton methods to reduce convergence time across linear and nonlinear problems.
- These methods are applied in areas like convex optimization, PDE discretizations, and neural meta-learning, markedly improving computational efficiency.
Fixed-point acceleration algorithms are a class of methodological enhancements aimed at expediting the convergence of fixed-point iterations, which are central to nonlinear and convex optimization, scientific computing, and numerous signal processing and control tasks. These algorithms, ranging from classical techniques such as Anderson, Aitken, and quasi-Newton methods to contemporary neural meta-learned accelerators and polynomial extrapolation, exploit the structure, history, or statistics of the fixed-point map to drive residuals down more rapidly than the base iteration. The field has expanded to encompass both algorithmically optimal schemes for nonexpansive or contractive operators and practical, adaptively tuned strategies capable of handling nonsmoothness or application-specific problem distributions.
1. The Fixed-Point Problem and Need for Acceleration
The fixed-point problem consists of finding a vector x* such that F(x*) = x* for a given map F. This abstraction encompasses primal and primal-dual methods (e.g., forward–backward splitting, ADMM), stationary iterative linear solvers, and operator-splitting-based optimization schemes (Saad, 15 Jul 2025). Although Banach's fixed-point theorem ensures geometric convergence for contractive operators, many real-world scenarios involve maps with contraction factors close to unity, or with only local contractiveness, resulting in slow convergence or stagnation. The classic fixed-point iteration and simple global relaxations (Picard, Mann, Krasnosel'skiĭ) are often insufficient. Acceleration techniques seek to exploit information in the residuals, past iterates, or operator structure to circumvent these intrinsic limitations.
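As a baseline for the accelerated methods that follow, the plain Picard iteration can be sketched as below. This is a minimal NumPy illustration, not taken from any of the cited papers; the map cos(x) is a standard contractive example with fixed point x* ≈ 0.739085.

```python
import numpy as np

def picard(F, x0, tol=1e-10, max_iter=1000):
    """Plain fixed-point (Picard) iteration x_{k+1} = F(x_k),
    stopped when the step size drops below tol."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        x_new = F(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter

# Example: F(x) = cos(x) is contractive near its fixed point x* ≈ 0.739085
x_star, iters = picard(np.cos, np.array([1.0]))
```

With a contraction factor around 0.67, this takes dozens of iterations to reach a tight tolerance, which is exactly the behavior acceleration methods target.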
2. Classical Acceleration Algorithms
Anderson Acceleration and Multisecant Methods
Anderson acceleration (AA) constructs each update as a linear combination of recent iterates, with coefficients chosen to minimize the residual norm over an affine combination of past step directions (Saad, 15 Jul 2025). The standard finite-memory AA(m) requires, at each iteration, solving a small least-squares problem over the last m residual differences, typically resulting in "windowed" acceleration whose per-step cost is dominated by an m-column least-squares solve. In the linear setting (F affine), AA(m) is formally equivalent to applying one Richardson step to the GMRES iterate (Tang et al., 2024), and converges r-linearly with a factor no worse than that of the base iteration. In symmetric cases, truncated Gram–Schmidt (AATGS) can be used to orthogonalize residuals, yielding a robust, low-cost three-term recurrence and further reducing memory and arithmetic requirements (Tang et al., 2024). In nonlinear contexts, global convergence can be obtained for nonexpansive or contractive maps when combined with stabilizing devices (Powell regularization, safeguarding, and restarts) (Zhang et al., 2018).
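A minimal sketch of finite-memory AA(m) in its unconstrained least-squares form, assuming NumPy and a generic map F; regularization, safeguarding, and restarts are deliberately omitted:

```python
import numpy as np

def anderson(F, x0, m=5, tol=1e-10, max_iter=500):
    """Finite-memory AA(m): gamma solves the small least-squares problem
    min ||f_k - dF @ gamma|| over the last m residual differences, and the
    update is x_{k+1} = g_k - dG @ gamma (no regularization or safeguarding)."""
    x = np.asarray(x0, dtype=float)
    gs, fs = [], []                      # histories of g_i = F(x_i), f_i = g_i - x_i
    for k in range(max_iter):
        g = F(x)
        f = g - x
        gs.append(g)
        fs.append(f)
        if np.linalg.norm(f) < tol:
            return x, k
        mk = min(m, k)
        if mk == 0:
            x = g                        # first step is a plain Picard step
        else:
            dF = np.column_stack([fs[-j] - fs[-j - 1] for j in range(1, mk + 1)])
            dG = np.column_stack([gs[-j] - gs[-j - 1] for j in range(1, mk + 1)])
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = g - dG @ gamma
    return x, max_iter

# Example: a contractive affine map F(x) = A x + b
A = np.array([[0.7, 0.2], [0.1, 0.6]])
b = np.array([1.0, 2.0])
x_fp, iters = anderson(lambda x: A @ x + b, np.zeros(2))
```

On an affine map this reduces to a Krylov method, so convergence is essentially finite; on nonlinear maps the same code applies but the stabilizing devices mentioned above become important.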
Aitken's Δ² and Vector Generalizations
Aitken's Δ² process eliminates the dominant geometric error term in scalar fixed-point sequences with a closed-form formula built from three consecutive iterates. Vector extensions apply secant or projection principles, typically constructing a direction-weighted update to improve convergence. Aitken and Anderson accelerations are contained within the broader spectrum of extrapolation-to-the-limit techniques (Aksenov et al., 2020).
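For the scalar case, applying the Δ² formula at every step (Steffensen-style restarting) can be sketched as follows; this is a generic textbook construction, not tied to a specific cited paper:

```python
import math

def aitken(F, x0, tol=1e-12, max_iter=100):
    """Steffensen-style iteration: apply Aitken's Delta^2 formula to
    (x, F(x), F(F(x))) at every step to cancel the dominant geometric error."""
    x = x0
    for k in range(max_iter):
        x1 = F(x)
        x2 = F(x1)
        denom = x2 - 2.0 * x1 + x           # second difference, Delta^2 x
        if denom == 0.0:                    # already (numerically) converged
            return x2, k
        x_acc = x - (x1 - x) ** 2 / denom   # Aitken extrapolation
        if abs(x_acc - x) < tol:
            return x_acc, k + 1
        x = x_acc
    return x, max_iter

# Example: the contractive scalar map F(x) = cos(x)
root, iters = aitken(math.cos, 1.0)
```

Whereas plain Picard iteration on cos(x) needs dozens of steps, this restarted Δ² scheme converges in a handful, reflecting its locally quadratic behavior.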
Quasi-Newton, Preconditioned AA, and Related Algorithms
Quasi-Newton iterations build a (possibly low-rank) approximation to the inverse Jacobian using multisecant or secant conditions (e.g., Broyden's method), overlapping in interpretation with AA when viewed through the block-multisecant lens (Chen et al., 2023, Saad, 15 Jul 2025). Preconditioned Anderson acceleration (PAA) generalizes AA by inserting a user-supplied or learned preconditioner M, thereby interpolating between Picard, quasi-Newton, Newton, and AA methods and yielding rates ranging from linear (quasi-Newton) to quadratic (Newton) (Chen et al., 2023).
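A sketch of the quasi-Newton viewpoint, using Broyden's "good" rank-one inverse update on the residual f(x) = F(x) − x; this is a generic illustration under the stated initialization, not the specific method of the cited papers:

```python
import numpy as np

def broyden_fixed_point(F, x0, tol=1e-10, max_iter=200):
    """Quasi-Newton iteration on the residual f(x) = F(x) - x, maintaining an
    approximate inverse Jacobian H with Broyden's 'good' rank-one update.
    H is initialized to -I, so the first step coincides with Picard."""
    x = np.asarray(x0, dtype=float)
    H = -np.eye(x.size)                    # f'(x) ≈ -I when F is strongly contractive
    f = F(x) - x
    for k in range(max_iter):
        if np.linalg.norm(f) < tol:
            return x, k
        dx = -H @ f                        # quasi-Newton step
        x_new = x + dx
        f_new = F(x_new) - x_new
        df = f_new - f
        Hdf = H @ df
        denom = dx @ Hdf
        if abs(denom) > 1e-14:             # rank-one update enforcing H @ df = dx
            H += np.outer(dx - Hdf, dx @ H) / denom
        x, f = x_new, f_new
    return x, max_iter

# Example: a contractive affine map F(x) = A x + b
A = np.array([[0.7, 0.2], [0.1, 0.6]])
b = np.array([1.0, 2.0])
x_fp, iters = broyden_fixed_point(lambda x: A @ x + b, np.zeros(2))
```

The secant condition H·df = dx enforced by the update is exactly the single-column case of the multisecant conditions that connect this family to AA.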
3. Modern and Application-Specific Acceleration Schemes
Neural and Meta-learned Acceleration
Neural fixed-point acceleration combines meta-learning with classical acceleration, optimizing a parameterized, recurrent, unrolled update operator specialized to a problem distribution (Venkataraman et al., 2021). Given a fixed-point map parameterized by a problem context, the approach learns an initializer and an accelerator network that, at each step, consumes the prior iterate, the unaccelerated step, and an internal state, and outputs an improved iterate. Training minimizes the expected normalized residual over distributions of convex cone problems; empirical benchmarks on Lasso, robust PCA, and robust Kalman filtering demonstrate significant iteration reductions and improved residual stability compared to Anderson acceleration and baseline SCS solvers.
Adaptive and Polynomial Extrapolation Methods
Three-point polynomial accelerators (TPA) estimate the dominant contraction factor from the recent residuals and compute a quadratic blend of the last three iterates to cancel the most persistent error mode, particularly effective for linear systems or nonlinear problems with dominant slow eigenmodes (Alemanno, 12 Nov 2025). Alternating cyclic extrapolation (ACX) methods compute second- and third-order (Aitken-type) blends, cycling between squared and cubic extrapolations with dynamically determined step-lengths, yielding Q-linear convergence in the positive-definite linear case and outperforming other black-box methods in high-dimensional applications (Lepage-Saucier, 2021).
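The dominant-mode idea behind three-point accelerators can be illustrated with the following simplified geometric extrapolation, which estimates a signed contraction factor from two successive differences and sums the remaining geometric tail. This is a sketch in the spirit of TPA, not the exact scheme of (Alemanno, 12 Nov 2025):

```python
import numpy as np

def tpa(F, x0, tol=1e-10, max_iter=200):
    """Three-point geometric extrapolation: estimate the dominant (signed)
    contraction factor rho from successive differences, then cancel the
    remaining geometric error by adding the tail rho/(1-rho) * d2."""
    x0 = np.asarray(x0, dtype=float)
    x1 = F(x0)
    for k in range(max_iter):
        x2 = F(x1)
        d1, d2 = x1 - x0, x2 - x1
        if np.linalg.norm(d2) < tol:
            return x2, k
        rho = float(d1 @ d2) / float(d1 @ d1)    # signed dominant-mode estimate
        if abs(rho) < 1.0:
            x_acc = x2 + rho / (1.0 - rho) * d2  # sum the remaining geometric tail
        else:
            x_acc = x2                           # fall back to plain Picard
        x0, x1 = x_acc, F(x_acc)
    return x1, max_iter

# Example: the contractive map F(x) = cos(x), dominant factor ≈ -0.674
x_fp, iters = tpa(np.cos, np.array([1.0]))
```

For an error dominated by a single geometric mode with factor λ, the extrapolation is exact: the tail d2·(λ + λ² + …) equals d2·λ/(1 − λ).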
Smoothing and Nonsmooth Acceleration
For nonsmooth, contractive fixed-point mappings, smoothing Anderson algorithms replace the nonsmooth operator with a parameterized smoothed contraction, performing Anderson acceleration on the approximation and adaptively reducing the smoothing parameter as the residuals decrease. This ensures r-linear (or even q-linear for depth-1) convergence rates matching the contractive factor of the original map (Li et al., 2024). These methods empirically outperform non-accelerated and smoothing-free Anderson variants on elastic-net regularization, free-boundary problems, and nonnegative least squares.
4. Theoretical Guarantees and Complexity Optimality
Optimal complexity for fixed-point acceleration has been established in the form of the optimized Halpern iteration (OC-Halpern) (Park et al., 2022), which achieves the exact worst-case optimal rate on the fixed-point residual for q-contractive (q < 1) and nonexpansive (q = 1) operators, matching lower bounds for span-restricted and unrestricted deterministic methods. In the nonexpansive case, the optimal residual rate is ||x_N − T(x_N)|| ≤ 2||x_0 − x*||/(N + 1) (Park et al., 2022, Yoon et al., 2024). H-duality establishes that a whole family of anchoring- and momentum-based optimal accelerators exists, all with matching worst-case guarantees but different practical properties (Yoon et al., 2024).
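The anchored (Halpern) iteration underlying these optimal methods has the shape below. This sketch uses the classical anchoring schedule β_k = 1/(k+2), whereas OC-Halpern replaces it with coefficients optimized for the contraction factor q; the cos example is a convenient contractive (hence nonexpansive) operator:

```python
import numpy as np

def halpern(T, x0, steps=3000):
    """Halpern (anchored) iteration x_{k+1} = beta_k * x0 + (1 - beta_k) * T(x_k),
    with the classical schedule beta_k = 1/(k+2). The anchor term pulls every
    iterate back toward x0, which is what yields O(1/N) residual guarantees."""
    x0 = np.asarray(x0, dtype=float)
    x = x0.copy()
    for k in range(steps):
        beta = 1.0 / (k + 2)
        x = beta * x0 + (1.0 - beta) * T(x)
    return x

# Example: anchored iteration on the contractive map T(x) = cos(x)
x_h = halpern(np.cos, np.array([1.0]))
```

Note the trade-off visible even in this toy run: the anchor guarantees a worst-case rate for merely nonexpansive T, but for strongly contractive maps it converges more slowly than plain Picard, which is one reason adaptive and hybrid schemes are of interest.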
For Anderson acceleration, root-linear convergence is typical for windowed schemes, but the precise convergence factor depends on the initial condition and is not globally determined by the Jacobian spectrum (Sterck et al., 2021). Windowed AA improves the root-linear convergence factor for symmetric (or locally symmetric Jacobian) fixed-point maps compared to base Picard, with a provably strictly smaller rate except in symmetric degenerate cases (Garner et al., 2023). Smoothing Anderson(m) achieves r-linear convergence with factor at most the contraction factor of the original map, under minimal local regularity and boundedness of the mixing coefficients (Li et al., 2024).
5. Computational and Implementation Considerations
| Method/Class | Memory/Step Cost | Per-iteration Overhead | Locality/Globality |
|---|---|---|---|
| AA(m) (Type II) | O(mn) storage | m-column least-squares solve | local; r-linear, up to superlinear |
| Preconditioned AA | O(mn) + preconditioner storage | preconditioner solve | tunable (Picard through Newton) |
| AATGS (Gram–Schmidt) | O(mn), three-term recurrence | orthogonalization | robust to ill-conditioned LS |
| ACX | O(n) | 2–3 map calls/step | tuning-free, robust |
| Three-point TPA | O(n) | 2 map calls | best for a dominant slow mode |
| Neural Accelerator | model-dependent | training + inference cost | data-driven, high performance |
Key issues include the selection of the memory/order parameter m, storage and factorization of Jacobian approximations for quasi-Newton/preconditioned methods, and stabilization by regularization or restarts. For Anderson/multisecant schemes, regularization of the least-squares (or Gram–Schmidt) problem is critical. Adaptive relaxation factors (as in AA with adaptive relaxation (Lepage-Saucier, 2024)) and nonmonotone safeguard line searches (for global convergence in nonsmooth/nonconvex contexts (Li, 2024)) are vital in practice.
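The least-squares regularization mentioned above can be as simple as a Tikhonov term on the mixing coefficients; the following is a generic sketch (the value of lam and the normal-equations formulation are illustrative choices, not prescriptions from the cited papers):

```python
import numpy as np

def regularized_aa_coeffs(dF, f, lam=1e-8):
    """Tikhonov-regularized AA subproblem
        min_gamma ||f - dF @ gamma||^2 + lam * ||gamma||^2,
    solved via the (m x m) regularized normal equations. The lam * I term
    keeps the solve well-posed when the residual differences in dF become
    nearly linearly dependent close to convergence."""
    m = dF.shape[1]
    return np.linalg.solve(dF.T @ dF + lam * np.eye(m), dF.T @ f)
```

For well-conditioned dF this agrees with the plain least-squares solution; its value shows up precisely when the window of residual differences degenerates.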
6. Practical Applications and Performance
Accelerated fixed-point algorithms have demonstrated substantial empirical gains in applications such as:
- large-scale convex optimization (e.g., cone programming with neural accelerators (Venkataraman et al., 2021))
- hydrodynamic coupling of well/fracture/reservoir subsystems (Aksenov et al., 2020)
- imaging (PDHG for total-variation regularized CT (Park et al., 2022))
- sparse regression and signal recovery (IRL1 (Li, 2024), nonnegative least squares (Li et al., 2024))
- dense and sparse PDE discretizations (including Helmholtz and Poisson equations (Yang et al., 2020, Alemanno, 12 Nov 2025))
- Markov Decision Processes and value iteration (Akian et al., 2020)
- robust M-estimation (Tyler's estimator (Garner et al., 2023))
- high-dimensional optimization in machine learning and deep equilibrium models
Reported iteration and CPU-time reductions are substantial: windowed AA markedly outperforms plain Picard iteration in challenging hydro-fracture simulations (Aksenov et al., 2020), and neural/meta-learned accelerators can halve iteration counts compared to tuned Anderson acceleration on standardized convex problems (Venkataraman et al., 2021).
7. Limitations, Open Questions, and Extensions
While optimal theoretical rates exist for contractive and nonexpansive operators, practical convergence—especially for nonlinear, nonsmooth, or adaptive methods—depends on the detailed structure, residual behavior, and choice of acceleration parameters. Existing neural fixed-point accelerators lack global convergence guarantees and may fail out-of-distribution. High-memory Anderson and other multisecant methods face potential instability from ill-conditioned systems, motivating advances like Gram-Schmidt AA, adaptive regularization, and dynamic window selection (Tang et al., 2024, Lepage-Saucier, 2024).
Open issues include deriving explicit a priori convergence factors for windowed or restarted Anderson acceleration in the nonlinear regime, scalable GPU-based implementations for large-scale differentiable convex-cone solvers, and the systematic combination of classical and meta-learned acceleration schemes to simultaneously leverage data structure and operator-theoretic properties (Venkataraman et al., 2021, Park et al., 2022, Yoon et al., 2024). The presumed uniqueness of optimal acceleration has been disproved: multiple non-equivalent optimal schemes exist, each with distinct implementation trade-offs and early-stopping behaviors (Yoon et al., 2024).
Potential extensions span equilibrium models in deep learning, sophisticated preconditioning in large sparse systems, and hybridization of spectral/PDE-informed acceleration schemes (e.g., Sobolev-norm weighted AA for elliptic problems (Yang et al., 2020)) with learned or adaptive strategies.
References:
- (Venkataraman et al., 2021) Neural Fixed-Point Acceleration for Convex Optimization
- (Aksenov et al., 2020) Application of accelerated fixed-point algorithms to hydrodynamic well-fracture coupling
- (Park et al., 2022) Exact Optimal Accelerated Complexity for Fixed-Point Iterations
- (Yang et al., 2020) Anderson Acceleration Based on the Sobolev Norm for Contractive and Noncontractive Fixed-Point Operators
- (Chen et al., 2023) A short report on preconditioned Anderson acceleration method
- (Alemanno, 12 Nov 2025) A polynomially accelerated fixed-point iteration for vector problems
- (Zhu, 1 Nov 2025) Accelerated primal dual fixed point algorithm
- (Sterck et al., 2021) Linear Asymptotic Convergence of Anderson Acceleration: Fixed-Point Analysis
- (Garner et al., 2023) Improved Convergence Rates of Windowed Anderson Acceleration for Symmetric Fixed-Point Iterations
- (Li, 2024) Anderson acceleration for iteratively reweighted algorithm
- (Lepage-Saucier, 2021) Alternating cyclic extrapolation methods for optimization algorithms
- (Zhang et al., 2018) Globally Convergent Type-I Anderson Acceleration for Non-Smooth Fixed-Point Iterations
- (Lepage-Saucier, 2024) Anderson acceleration with adaptive relaxation for convergent fixed-point iterations
- (Tang et al., 2024) Anderson Acceleration with Truncated Gram-Schmidt
- (Li et al., 2024) A smoothing Anderson acceleration algorithm for nonsmooth fixed point problem with linear convergence
- (Yoon et al., 2024) Optimal Acceleration for Minimax and Fixed-Point Problems is Not Unique
- (Saad, 15 Jul 2025) Acceleration methods for fixed point iterations