
Arnoldi-Based Minimisation Method

Updated 10 February 2026
  • The Arnoldi-based minimisation method is a Krylov subspace technique that constructs an orthonormal basis to transform complex minimisation problems into well-conditioned, smaller projected systems.
  • It leverages Arnoldi iterations to reduce high-dimensional linear and inverse problems, facilitating efficient and stable least-squares and regularised solutions.
  • The method enables robust optimisation under noisy and ill-posed conditions, with applications in data fitting, model reduction, and stochastic optimisation.

The Arnoldi-based minimisation method encompasses a spectrum of techniques exploiting Arnoldi's process within Krylov subspace frameworks for solving minimisation problems associated with linear systems, rational matrix functions, ill-posed inverse problems, high-dimensional optimisation, least-squares fitting, and model reduction. Central to all such methods is the use of the Arnoldi iteration to construct an orthonormal basis for Krylov subspaces, providing a well-conditioned coordinate system for efficient, reliable subspace-projected minimisation. These techniques have demonstrated significant efficacy for large-scale, high-dimensional, and/or ill-conditioned problems, especially where explicit operators (e.g., Hessians) are unavailable, or in the presence of noise and numerical instability.

1. The Arnoldi Process: Generation of Krylov Subspaces

The Arnoldi process generates an orthonormal basis $\{v_1,\ldots,v_m\}$ for the Krylov subspace $\mathcal{K}_m(A,b) = \mathrm{span}\{b, Ab, \ldots, A^{m-1}b\}$ associated with a matrix $A$ and a starting vector $b$ (Gazzola et al., 2018). At each iteration $j$,

$$w = A v_j,\quad h_{i,j} = v_i^* w,\quad w \leftarrow w - \sum_{i=1}^{j} v_i h_{i,j},\quad h_{j+1,j} = \|w\|,\quad v_{j+1} = w / h_{j+1,j}.$$

The process yields an orthonormal basis $V_{m+1} = [v_1, \ldots, v_{m+1}]$ and an upper Hessenberg matrix $H_{m+1,m}$ satisfying $A V_m = V_{m+1} H_{m+1,m}$.
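In floating point, the recurrence above is typically implemented with modified Gram–Schmidt. A minimal NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def arnoldi(A, b, m):
    """Build an orthonormal basis V of K_m(A, b) and the (m+1) x m
    upper Hessenberg matrix H satisfying A V[:, :m] = V H."""
    n = b.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # breakdown: invariant subspace found
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

The factorisation can be checked directly: the columns of `V` are orthonormal and `A @ V[:, :m]` equals `V @ H` to rounding error.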

Extensions exist for functions of matrices (e.g., rational functions, shifted-inverse forms), for approximating matrix exponentials, and for constructing rational Krylov spaces when denominator structure or regularisation is imposed (Chen et al., 2023, Brezinski et al., 2010).

2. Arnoldi-Based Minimisation Frameworks

2.1 Linear and Regularised Least-Squares

For linear systems $Ax = b$, the GMRES algorithm seeks $x_m \in \mathcal{K}_m(A,b)$ minimising $\|Ax - b\|_2$ (Gazzola et al., 2018), which reduces to a small least-squares problem over $y \in \mathbb{R}^m$ using the Arnoldi basis: $$y_m = \arg\min_{y} \|H_{m+1,m}\, y - \beta e_1\|_2,\quad x_m = V_m y_m,$$ where $\beta = \|b\|_2$. Arnoldi-based variants regularise the subspace problem, for example:

  • Arnoldi–Tikhonov: $\arg\min_y \|H_{m+1,m}\, y - \beta e_1\|_2^2 + \lambda \|y\|_2^2$.
  • Arnoldi–TSVD: truncates small singular values of $H_{m+1,m}$ to stabilise solutions of ill-posed problems.
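A minimal sketch of the projected solve, with the Arnoldi factorisation built inline; setting `lam > 0` appends the Tikhonov term to the small least-squares problem (names are illustrative, and breakdown handling is omitted):

```python
import numpy as np

def gmres_arnoldi(A, b, m, lam=0.0):
    """Minimise ||Ax - b||_2 over K_m(A, b); with lam > 0 this gives the
    Arnoldi-Tikhonov iterate (Tikhonov term on the projected problem)."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for j in range(m):                     # Arnoldi factorisation
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    rhs = beta * np.eye(m + 1)[:, 0]       # beta * e_1
    if lam > 0.0:                          # min ||Hy - beta e1||^2 + lam ||y||^2
        H = np.vstack([H, np.sqrt(lam) * np.eye(m)])
        rhs = np.concatenate([rhs, np.zeros(m)])
    y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
    return V[:, :m] @ y                    # x_m = V_m y_m
```

Because the residual is minimised over nested subspaces, $\|A x_m - b\|_2$ is non-increasing in $m$; the Tikhonov variant trades a larger residual for a smaller solution norm.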

2.2 Rational Function Approximations and Matrix Functions

Given a rational function $R(z) = D(z)^{-1} N(z)$ and a matrix $A$, one seeks $x_k \in \mathcal{K}_k(A,b)$ minimising $\|R(A)b - x_k\|_{D(A)} = \|N(A)b - D(A)x_k\|_2$ in the $D(A)^* D(A)$-norm (Chen et al., 2023). The Arnoldi-OR method extends the Arnoldi process to this setting, requiring $\nu = \max\{\deg D, \deg N\}$ extra steps and solving a $(k+\nu) \times k$ least-squares problem. The process uses only small blocks of Hessenberg matrices, avoiding explicit evaluation of high-degree matrix polynomials. For $R(z) = 1/z$, Arnoldi-OR reduces to GMRES.

Rational Arnoldi (RA) methods recast solutions of regularised systems or matrix equations as $f(Z)b$, with $Z$ a shifted-inverse or regularised operator and $f$ a rational or matrix function; the approximation is $x_m = \|b\|\, V_m f(H_m) e_1$ (Brezinski et al., 2010).

2.3 High-Dimensional Optimisation Under Noisy Data

The Stochastic Arnoldi Method (SAM) addresses minimisation of $f(x)$ in high-dimensional, noisy settings (Hicken et al., 2015). Arnoldi sampling, via finite-difference Hessian-vector approximations of noisy gradients, constructs a low-rank approximation of the Hessian in the subspace spanned by the dominant Ritz vectors. A quadratic trust-region model is built in this subspace, using either step-averaged or directional-derivative-based linear terms, and trust-region subproblem solvers (e.g., Moré–Sorensen) are employed to update the iterates.
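The Arnoldi-sampling ingredient can be sketched as follows: Hessian-vector products are approximated by central differences of the gradient, and the standard Arnoldi loop is run on that implicit operator to obtain a projected Hessian. This is a minimal, noise-free illustration with names of our choosing; SAM itself adds the trust-region model and noise safeguards described above.

```python
import numpy as np

def hessvec_fd(grad, x, v, alpha=1e-5):
    """Central-difference approximation of the Hessian-vector product
    H(x) v; alpha plays the role of the sampling radius."""
    return (grad(x + alpha * v) - grad(x - alpha * v)) / (2.0 * alpha)

def arnoldi_sample(grad, x, d0, m, alpha=1e-5):
    """Arnoldi loop on the implicit Hessian at x, started from direction d0.
    Returns the basis V_m and the projected m x m Hessian approximation."""
    n = d0.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = d0 / np.linalg.norm(d0)
    for j in range(m):
        w = hessvec_fd(grad, x, V[:, j], alpha)
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-10:           # invariant subspace reached
            break
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m]
```

On a noise-free quadratic the projected matrix reproduces the Hessian spectrum; with noisy gradients only the dominant Ritz pairs are trusted.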

2.4 Polynomial and Rational Least-Squares, Structured Fitting

In polynomial, rational, and Sobolev least-squares problems, Arnoldi orthogonalisation replaces the ill-conditioned Vandermonde (or Cauchy) coefficient matrix with a well-conditioned Krylov subspace basis, reducing the overdetermined system

$$B_{n+1} c \approx W f$$

to a small, well-conditioned upper-triangular (or block pentadiagonal) system (Faghih et al., 2024). For multivariate polynomial and Hermite least-squares, the G-Arnoldi method constructs a $G$-orthonormal basis appropriate for weighted and application-driven inner products (Zhang et al., 2024).
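For the discrete polynomial case, the idea can be illustrated by the Vandermonde-with-Arnoldi procedure: the monomial columns $1, x, x^2, \ldots$ are orthogonalised on the sample points as they are generated, and the stored recurrence coefficients are reused to evaluate the fit at new points (a minimal sketch with illustrative names):

```python
import numpy as np

def polyfit_arnoldi(x, f, n):
    """Degree-n least-squares polynomial fit via Vandermonde-with-Arnoldi:
    never forms the ill-conditioned Vandermonde matrix explicitly."""
    m = len(x)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = 1.0
    for k in range(n):
        w = x * Q[:, k]                   # "apply the operator": multiply by x
        for i in range(k + 1):
            H[i, k] = Q[:, i] @ w / m
            w = w - H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w) / np.sqrt(m)
        Q[:, k + 1] = w / H[k + 1, k]
    c, *_ = np.linalg.lstsq(Q, f, rcond=None)
    return c, H

def polyval_arnoldi(c, H, s):
    """Evaluate the fitted polynomial at new points s by re-running
    the recurrence stored in H."""
    n = H.shape[1]
    W = np.zeros((len(s), n + 1))
    W[:, 0] = 1.0
    for k in range(n):
        w = s * W[:, k]
        for i in range(k + 1):
            w = w - H[i, k] * W[:, i]
        W[:, k + 1] = w / H[k + 1, k]
    return W @ c
```

The short recurrence in `H` is exactly the discrete orthogonal-polynomial recurrence referred to above, so evaluation is both fast and stable.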

2.5 Aggregation and Model Reduction for Markov Chains

Arnoldi aggregation projects the evolution of a Markov chain, defined by $p(t)^\top = p(0)^\top \exp(Qt)$, onto the Krylov subspace generated by $p(0)^\top$ and $Q$, providing a low-dimensional matrix $H_j$ governing the projected dynamics. The approximation $p_{\mathrm{approx}}(t)^\top = \beta e_1^\top \exp(t H_j) V_j^\top$ is of minimal dimension, provably exact once an invariant subspace is reached, and allows for sharp, residual-based error estimation (Sonnentag et al., 4 Aug 2025).
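A minimal sketch of this projection, assuming SciPy's `expm` for the small matrix exponential (names are illustrative). The generator $Q$ enters only through transpose-vector products:

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_action(Q, p0, t, j):
    """Approximate p(t)^T = p(0)^T exp(Qt) via the Krylov subspace
    K_j(Q^T, p0): p_approx = beta * V_j exp(t H_j) e_1."""
    beta = np.linalg.norm(p0)
    n = p0.shape[0]
    V = np.zeros((n, j + 1))
    H = np.zeros((j + 1, j))
    V[:, 0] = p0 / beta
    for k in range(j):
        w = Q.T @ V[:, k]
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-12:           # invariant subspace: projection exact
            j = k + 1
            break
        V[:, k + 1] = w / H[k + 1, k]
    return beta * (V[:, :j] @ expm(t * H[:j, :j])[:, 0])
```

Once an invariant subspace is hit (or j reaches the full dimension) the projection is exact, matching the exactness property above; in practice j is kept small and grown until a residual-based estimate is satisfied.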

3. Key Algorithmic and Structural Elements

Arnoldi-based minimisation methods share several algorithmic features, often tailored for specific problem classes:

  • Orthonormal Basis Construction: Modified Gram–Schmidt, with application-aware inner products (e.g., $G$-inner products for PDEs or data weighting).
  • Small Projected Systems: The main minimisation problem in the full space is projected to a low-dimensional system (Hessenberg, block, or banded), exploiting the structure of $V_m$ and $H_m$ (or $(H,K)$ pencils in rational or confluent cases).
  • Implicit Function Action: Matrix functions $f(A)$, rational functions, or exponentials are applied in the reduced subspace via $f(H_m)$, circumventing the need for full-matrix computations.
  • Finite-Difference/Subsampled Approximations: For non-explicit operators (e.g., Hessian in optimisation), sampled Hessian-vector products or directional derivatives are used.
  • Short Recurrences: For least-squares and fitting, Arnoldi-generated discrete orthogonal basis functions satisfy explicit recurrences, accelerating evaluation and stability (Zhang et al., 2024).
  • Residual and Stopping Criteria: Residuals in the last Arnoldi direction or reductions in model error (e.g., trust-region model vs. actual function decrement) provide natural stopping rules and adaptivity (Sonnentag et al., 4 Aug 2025, Hicken et al., 2015).

4. Theoretical Properties and Numerical Behavior

Table: Principal Theoretical Features and Classes

| Class | Optimality/Exactness | Conditioning | Error Analysis |
|---|---|---|---|
| GMRES / Linear system (Gazzola et al., 2018) | $\ell_2$-minimal residual in subspace | Improved by preconditioning | Spectrum, normality, and projections |
| Arnoldi-OR / Rational function (Chen et al., 2023) | $D(A)^*D(A)$-minimal residual | Avoids explicit $R(A)$, uses residual norm | Crouzeix–Palencia, Ritz values |
| SAM / Stochastic optimisation (Hicken et al., 2015) | Trust-region quadratic in dominant subspace | Low-rank, robust to noise | Bounded/truncated noise, empirical |
| Polynomial/Rational LS (Faghih et al., 2024) | LS-optimal in polynomial/rational Krylov | Dramatically stabilised vs. Vandermonde | Orthogonal polynomials/rationals |
| Arnoldi aggregation / Markov (Sonnentag et al., 4 Aug 2025) | Provably exact on invariant subspace | Minimal dimension for accurate expansion | Residual bounds, superlinear decay |

For linear problems, minimisation in the Arnoldi Krylov subspace is classically optimal with respect to the chosen norm, with small projected systems advantageous for both accuracy and efficiency. Preconditioning and regularisation within the Arnoldi framework have proven effective at mitigating ill-posedness and semi-convergence (Gazzola et al., 2018, Brezinski et al., 2010). For rational approximations and matrix functions, sharp spectral-set bounds and the limitations of eigenvalue-based heuristics are established (Chen et al., 2023).

In stochastic or high-dimensional optimisation, the robustness of the model to noise is contingent on spectral decay of the Hessian and subspace parameter choices (Hicken et al., 2015). Empirically, Arnoldi-based minimisation demonstrates accelerated reduction of error and residual norm relative to naive approaches, particularly in state-space reduction and data-fitting settings (Sonnentag et al., 4 Aug 2025, Faghih et al., 2024, Zhang et al., 2024).

5. Parameter Selection, Limitations, and Practical Guidelines

Parameter choices are application-dependent, but the following guidelines emerge across classes:

  • Arnoldi subspace dimension ($m$, $k$): 4–10 for optimisation; 2–3 times larger for stable eigen-estimates (spectrum estimation for the Hessian); or grow until diminishing returns in model error/residual.
  • Sampling radius in optimisation ($\alpha$): Chosen to match the expected trust-region scale or the noise level (Hicken et al., 2015).
  • Preconditioning: Application-specific; circulant or low-rank operators can be derived from initial Arnoldi samples (Gazzola et al., 2018).
  • Orthogonality and Stability: Re-orthogonalisation (“twice is enough”) can be employed in polynomial bases at the cost of increased flops (Faghih et al., 2024).
  • Choice of weights and poles (rational/polynomial LS): In weighted least-squares, quadrature-based weights enhance accuracy; adaptive pole selection for rational Arnoldi remains an open research direction (Faghih et al., 2024).
  • Computational costs: Each Arnoldi step involves a matrix-vector product and orthogonalisation (dominating cost for large systems).

Limitations include finite-precision breakdown for very large degrees or insufficiently decaying spectra, potential loss of orthogonality, and (in rational settings) sensitivity to pole placement.

6. Impact and Applications Across Disciplines

Arnoldi-based minimisation methods are widely adopted in scientific computing, optimisation, statistical inference, inverse problems, signal processing, uncertainty quantification, and model reduction.

The unifying feature of Arnoldi-based minimisation is the reformation of diverse but challenging minimisation tasks into well-conditioned, low-dimensional subspace problems, enabled by the algebraic and numerical properties of Krylov sequences and Arnoldi-type recurrences.
