
Polynomial Extrapolation Methods

Updated 7 February 2026
  • Polynomial-type extrapolation methods are deterministic techniques that improve approximations by fitting polynomials to cancel leading error components.
  • These methods, including MPE and RRE, leverage algebraic structures in iterates to achieve quadratic convergence and robust acceleration in various settings.
  • They are applied in scalar, vector, matrix, and tensor problems, offering efficient error reduction and reduced computational cost for large-scale nonlinear systems.

Polynomial-type extrapolation methods comprise a set of deterministic acceleration techniques that construct improved approximations to limits or function values by explicitly exploiting the algebraic structure of errors. These methods operate by fitting polynomials—often of minimal or reduced rank—to blocks of iterates, function samples, or solution approximations, so as to cancel or project out dominant error components. The approach yields unified algorithms for interpolation, extrapolation, and convergence acceleration in both scalar and high-dimensional contexts, including vector, matrix, and tensor settings. The foundational algorithms—such as minimal polynomial extrapolation (MPE) and reduced rank extrapolation (RRE)—have deep links to Krylov subspace theory and least-squares projection, and admit analysis showing sharp theoretical and practical gains over unaccelerated iterative methods or naive polynomial interpolation.

1. Theoretical Foundations and Core Algorithms

Polynomial-type extrapolation methods operate by imposing annihilation or orthogonality conditions on the forward differences of a sequence or on the error model of approximate solutions. For a vector sequence $\{s_k\}$ produced by a fixed-point map $s_{k+1}=G(s_k)$, the general extrapolated approximation of degree $q$ is given by

$$t_{k,q} = \sum_{j=0}^q \gamma_j^{(q)} s_{k+j}, \qquad \sum_{j=0}^q \gamma_j^{(q)} = 1$$

subject to the conditions

$$\sum_{j=0}^q \eta_{i,j}\, \gamma_j^{(q)} = 0, \quad i = 0,\ldots,q-1$$

where $\eta_{i,j}$ depends on the choice of method:

  • For RRE: $\eta_{i,j} = (\Delta^2 s_{k+i}, \Delta^2 s_{k+j})$
  • For MPE: $\eta_{i,j} = (\Delta s_{k+i}, \Delta s_{k+j})$

These conditions ensure that the resulting $t_{k,q}$ cancels or projects out $q$ leading error modes, leading to accelerated convergence or enhanced extrapolation accuracy (Mouhssine et al., 5 Jan 2025).
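As a concrete illustration, the annihilation conditions above reduce to a small least-squares problem for the weights. The following minimal sketch (illustrative names, not the authors' code) computes the MPE extrapolant for a linear fixed-point map; when $q$ equals the degree of the minimal polynomial of the iteration matrix, the limit is recovered exactly.

```python
import numpy as np

def mpe(S):
    """S: columns s_0, ..., s_{q+1}; returns the MPE extrapolant t_{0,q}."""
    dS = np.diff(S, axis=1)                  # forward differences Δs_0..Δs_q
    q = dS.shape[1] - 1
    # Least-squares solve for the annihilating-polynomial coefficients,
    # normalising the leading coefficient to 1.
    c, *_ = np.linalg.lstsq(dS[:, :q], -dS[:, q], rcond=None)
    c = np.append(c, 1.0)
    gamma = c / c.sum()                      # weights with sum(gamma) = 1
    return S[:, :q + 1] @ gamma

# Fixed-point iteration s_{k+1} = A s_k + b with a 2x2 diagonal A: the
# minimal polynomial of A has degree 2, so MPE with q = 2 is exact.
A = np.diag([0.5, -0.3])
b = np.array([1.0, 2.0])
s = np.zeros(2)
S = [s]
for _ in range(3):                           # generate s_1, s_2, s_3
    s = A @ s + b
    S.append(s)
t = mpe(np.column_stack(S))
x_star = np.linalg.solve(np.eye(2) - A, b)   # true fixed point
print(np.allclose(t, x_star))                # extrapolant matches the limit
```

The two error modes (eigenvalues $0.5$ and $-0.3$) are both annihilated by the degree-2 weight polynomial, which is the mechanism the conditions above formalise.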

A compact projector formula for both RRE and MPE is

$$t_{k,q} = s_k - \Delta S_{k,q} \big(Y_q^{T} \Delta^2 S_{k,q}\big)^{-1} Y_q^{T} \Delta s_k$$

where $\Delta S_{k,q}$ is a block of forward differences, and $Y_q$ represents either first or second differences according to the method. The backward-difference or "history-based" formulation is also common, especially in implementations using limited memory or restarting.

In the scalar case, Richardson extrapolation and related techniques solve analogous linear systems on error models of the form $S(h) = S^* + c_1 h^p + c_2 h^{p+1} + \dots$, with coefficients constructed to eliminate successive powers of the discretization parameter, achieving higher-order convergence (Jbilou, 1 Feb 2026).
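A single Richardson step illustrates the scalar elimination: combining $S(h)$ and $S(h/2)$ with weights $(2^p S(h/2) - S(h))/(2^p - 1)$ cancels the leading $h^p$ term. The function and step size below are illustrative choices; the central difference quotient has leading order $p = 2$.

```python
import math

def central_diff(f, x, h):
    """Second-order central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def richardson(f, x, h, p=2):
    """One Richardson step: cancel the h^p term of the error model."""
    s_h, s_h2 = central_diff(f, x, h), central_diff(f, x, h / 2)
    return (2**p * s_h2 - s_h) / (2**p - 1)

h = 0.1
exact = math.cos(1.0)                                 # d/dx sin(x) at x = 1
err_plain = abs(central_diff(math.sin, 1.0, h) - exact)
err_rich = abs(richardson(math.sin, 1.0, h) - exact)
print(err_rich < err_plain)   # extrapolation sharply reduces the error
```

For this example the plain quotient carries an $O(h^2)$ error while the extrapolated value is accurate to roughly $O(h^4)$.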

2. Convergence, Error Analysis, and Residual Norm Estimates

Under standard nonlinear fixed-point hypotheses ($G'$ Lipschitz near the fixed point $x^*$, and $J - I$ invertible, where $J = G'(x^*)$), restart strategies based on polynomial extrapolation (using RRE or MPE with window equal to the minimal polynomial degree of $J$) achieve quadratic local convergence,

$$\|x_{k+1} - x^*\| = O(\|x_k - x^*\|^2)$$

with no extra assumptions required on the extrapolation coefficients (Mouhssine et al., 5 Jan 2025). The approach leverages the minimal polynomial property and a local Taylor expansion to reduce the dominant linear error component, with the proof using a characterization of the generalized residual and its projection.

The residual norm and the fixed-point error norm are equivalent up to constants: $C_1 \|e_k\| \leq \|r_k\| \leq C_2 \|e_k\|$ for fixed $J$, with explicit constants given by the smallest and largest singular values of $J - I$. The extrapolation step ensures that the residual of the extrapolated iterate is $O(\|r_k\|^2)$, i.e., quadratic in the previous residual norm.
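For the linear error model $r = (J - I)e$, the two-sided bound is just the singular-value inequality for the matrix $J - I$, which the following quick check confirms (dimensions and data are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
J = 0.5 * rng.standard_normal((6, 6)) / np.sqrt(6)  # contractive-ish Jacobian
B = J - np.eye(6)                                    # error-to-residual map
svals = np.linalg.svd(B, compute_uv=False)
C2, C1 = svals[0], svals[-1]                         # largest / smallest sigma

e = rng.standard_normal(6)         # an arbitrary error vector
r = B @ e                          # residual of the linear fixed-point model
ok = C1 * np.linalg.norm(e) <= np.linalg.norm(r) <= C2 * np.linalg.norm(e)
print(ok)   # the residual norm is sandwiched by the error norm
```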

A computable a-posteriori estimate of the RRE generalized residual norm is available: $\|\tilde r(x_{k+1})\|^2 = \frac{1}{e^T (R^T R)^{-1} e}$, where $R$ arises from a thin QR factorization of the augmented difference history and $e$ is the vector of ones. This explicit formula does not require forming the large projector explicitly and is numerically stable (Mouhssine et al., 5 Jan 2025).
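Assuming the estimate takes the Gram-matrix form $1/(e^T(R^TR)^{-1}e)$ with $\Delta S = QR$ (a standard identity for RRE; an assumption relative to the text above), it can be verified against a direct evaluation of the minimized difference combination:

```python
import numpy as np

rng = np.random.default_rng(2)
n, q = 10, 3
dS = rng.standard_normal((n, q + 1))      # stand-in difference block

# RRE weights from the normal equations: minimise ||dS @ gamma||, sum = 1.
e = np.ones(q + 1)
d = np.linalg.solve(dS.T @ dS, e)
gamma = d / (e @ d)
direct = np.linalg.norm(dS @ gamma) ** 2  # ||generalized residual||^2

_, R = np.linalg.qr(dS)                   # thin QR of the difference block
estimate = 1.0 / (e @ np.linalg.solve(R.T @ R, e))
print(np.isclose(direct, estimate))       # closed form matches direct value
```

Because $R^TR = \Delta S^T \Delta S$, the closed form avoids touching the tall difference block again once the QR factors are in hand.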

3. Implementation Approaches and Algorithmic Structure

Polynomial extrapolation is implemented as an acceleration wrapper around an iterative solver, for instance within a restarted Picard or fixed-point loop. The procedure is as follows (Mouhssine et al., 5 Jan 2025):

  1. Starting from $x_k$, generate $q+1$ iterates $s_0 = x_k$, $s_{i+1} = G(s_i)$.
  2. Form the collection of differences $\Delta s_i = s_{i+1} - s_i$.
  3. Assemble a block difference matrix (and its QR factorization) for the extrapolation window.
  4. Solve a small linear system (upper-triangular or least-squares) to determine the weights $\gamma_j$.
  5. Form the backward-difference coefficients $\alpha_j$ and compute

$$x_{k+1} = s_0 + Q_{:,0:q}\,\big(R_{1:q,1:q}\, \alpha_{0:q-1}\big)$$

  6. Check for convergence and restart or continue as needed.
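The cycle above can be sketched as a restarted extrapolation wrapper around a fixed-point map. This is a minimal illustration, not the authors' implementation: the map $G$, window $q$, and tolerance are arbitrary choices, and for simplicity the weights $\gamma_j$ are applied directly rather than via the backward-difference coefficients $\alpha_j$.

```python
import numpy as np

def restarted_extrapolation(G, x0, q=3, tol=1e-10, max_cycles=50):
    """Restarted MPE-style acceleration of the fixed-point map G."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_cycles):
        S = [x]                               # step 1: q + 1 inner iterates
        for _ in range(q + 1):
            S.append(G(S[-1]))
        S = np.column_stack(S)
        dS = np.diff(S, axis=1)               # step 2: forward differences
        Q, R = np.linalg.qr(dS)               # step 3: thin QR of the window
        c, *_ = np.linalg.lstsq(R[:q, :q], -R[:q, q], rcond=None)  # step 4
        gamma = np.append(c, 1.0)
        gamma /= gamma.sum()
        x_new = S[:, :q + 1] @ gamma          # step 5: extrapolated iterate
        if np.linalg.norm(x_new - x) < tol:   # step 6: converged? else restart
            return x_new
        x = x_new
    return x

# Usage: a mildly nonlinear 3D contraction (illustrative).
M = np.array([[0.2, 0.1, 0.0],
              [0.0, 0.3, 0.1],
              [0.1, 0.0, 0.2]])
G = lambda x: M @ np.tanh(x) + np.ones(3)
x = restarted_extrapolation(G, np.zeros(3))
print(np.allclose(x, G(x)))                   # x is numerically a fixed point
```

The triangular solve in step 4 exploits the QR factors: least squares against the difference block reduces to a small upper-triangular system, which is the "small linear system" the procedure refers to.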

For both RRE and MPE, the extrapolation step adds only a small marginal computational cost relative to each batch of iterative steps, while delivering substantial gains in rate and robustness.

4. Numerical Performance, Practical Significance, and Comparative Analysis

Comprehensive numerical experiments confirm the practical advantages of applying polynomial-type extrapolation to large-scale nonlinear problems. For example, on challenging Bratu and Monge–Ampère problems discretized with isogeometric multigrid, the following outcomes are reported (Mouhssine et al., 5 Jan 2025):

  • Standard Picard–MG may stall or diverge for large nonlinearity parameter $\lambda$ or fine meshes.
  • Anderson acceleration, while more robust, attains only linear convergence and incurs higher memory cost per step as the depth increases.
  • RRE/MPE-accelerated Picard–MG achieves quadratic convergence, iteration counts independent of mesh or spline degree, and robust performance for $\lambda$ beyond known critical thresholds.

Representative timings (1D Bratu, $\lambda=7$, $p=5$, $N=64$):

| Method           | Iterations   | CPU (s) |
|------------------|--------------|---------|
| Picard–MG        | >1000 (fail) | --      |
| AA(5)–Picard–MG  | 55           | 1.36    |
| RRE(5)–Picard–MG | 25           | 0.72    |
| MPE(5)–Picard–MG | 25           | 0.69    |

Across 2D/3D problems and various parameter regimes, the pattern persists: Anderson acceleration may achieve convergence but at higher computational cost and lower local order, while polynomial extrapolation maintains quadratic convergence with consistently lower iteration counts and CPU time.

5. Generalizations and Connections to Higher-Dimensional, Krylov, and Tensor Methods

The mathematical structure of polynomial-type extrapolation methods is preserved and extended in high-dimensional and tensor frameworks. For example, tensor analogues such as Tensor Reduced Rank Extrapolation (TRRE) and Tensor Minimal Polynomial Extrapolation (TMPE) employ generalized difference and projection operations in the context of the T-product and tensor–Krylov subspaces (Beik et al., 2020). In such tensor settings, the auxiliary difference blocks and orthogonality constraints continue to yield subspace-projection methods that accelerate convergence or regularize solutions in ill-posed regimes. Moreover, polynomial extrapolation can be naturally embedded in large-scale matrix or tensor solvers, where it operates alongside truncated SVD or low-rank decompositions, to expedite convergence in applications ranging from PDE-constrained optimization to large-scale nonlinear least-squares.

The connection of polynomial extrapolation methods to Krylov subspace processes such as GMRES and FOM is explicit: both MPE and RRE can be viewed as black-box generalizations of these solvers to nonlinear or nonstationary vector sequences, matching their finite-termination and minimal-residual properties when specialized to linear systems (Sidi, 2015).
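The linear-system specialization can be checked numerically: for Richardson iteration $s_{k+1} = s_k + (b - A s_k)$, the differences $\Delta s_j$ equal the residuals $r_j$, so the RRE extrapolant minimizes the residual over the same Krylov subspace GMRES searches. An illustrative construction (not a production solver; sizes and data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 8, 4
A = np.eye(n) + 0.3 * rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

# Richardson iterates s_{k+1} = s_k + (b - A s_k), starting from zero.
S = [np.zeros(n)]
for _ in range(q + 1):
    S.append(S[-1] + b - A @ S[-1])
S = np.column_stack(S)

# RRE weights: minimise ||dS @ gamma|| subject to sum(gamma) = 1.
dS = np.diff(S, axis=1)           # here each column Δs_j equals residual r_j
e = np.ones(q + 1)
d = np.linalg.solve(dS.T @ dS, e)
gamma = d / (e @ d)
rre_res = np.linalg.norm(b - A @ (S[:, :q + 1] @ gamma))

# GMRES residual: minimise ||b - A x|| over the Krylov space K_q(A, b).
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(q)])
y, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
gmres_res = np.linalg.norm(b - A @ K @ y)
print(np.isclose(rre_res, gmres_res))   # the two minima coincide
```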

6. Significance and Best-Practice Recommendations

Polynomial-type extrapolation methods are distinguished by:

  • A requirement for only a fixed window ("restart length") of prior iterates, making them well-suited for memory-constrained settings and parallel-in-time or domain-decomposed solvers.
  • Reliance solely on the algebraic structure of history blocks, and not on further operator-specific information, elevating their applicability to complex, nonlinear, or even black-box iterative processes.
  • Quadratic convergence and robust error control under mild regularity and invertibility hypotheses.
  • Cheap and scalable implementation: the main per-cycle computational cost is that of a small linear system or least-squares problem, plus block difference arithmetic and QR updates.

These properties position polynomial extrapolation—especially RRE and MPE—as methods of choice for accelerating multigrid solvers, nonlinear fixed-point iterations, and preconditioned gradient schemes in high-accuracy and large-scale computational science (Mouhssine et al., 5 Jan 2025).

In broad summary, polynomial-type extrapolation provides a principled framework for acceleration, error reduction, and robust convergence across a wide spectrum of nonlinear, linear, and high-dimensional problems. It complements and often surpasses established acceleration schemes, while maintaining principled theoretical guarantees and straightforward numerical implementation.
