
Arnoldi-Based Minimisation Method

Updated 10 February 2026
  • The Arnoldi-based minimisation method is a Krylov subspace technique that constructs an orthonormal basis to transform complex minimisation problems into well-conditioned, smaller projected systems.
  • It leverages Arnoldi iterations to reduce high-dimensional linear and inverse problems, facilitating efficient and stable least-squares and regularised solutions.
  • The method enables robust optimisation under noisy and ill-posed conditions, with applications in data fitting, model reduction, and stochastic optimisation.

The Arnoldi-based minimisation method encompasses a spectrum of techniques exploiting Arnoldi's process within Krylov subspace frameworks for solving minimisation problems associated with linear systems, rational matrix functions, ill-posed inverse problems, high-dimensional optimisation, least-squares fitting, and model reduction. Central to all such methods is the use of the Arnoldi iteration to construct an orthonormal basis for Krylov subspaces, providing a well-conditioned coordinate system for efficient, reliable subspace-projected minimisation. These techniques have demonstrated significant efficacy for large-scale, high-dimensional, and/or ill-conditioned problems, especially where explicit operators (e.g., Hessians) are unavailable, or in the presence of noise and numerical instability.

1. The Arnoldi Process: Generation of Krylov Subspaces

The Arnoldi process generates an orthonormal basis $\{v_1,\ldots,v_m\}$ for the Krylov subspace $\mathcal{K}_m(A,b) = \mathrm{span}\{b, Ab, \ldots, A^{m-1}b\}$ associated with a matrix $A$ and a starting vector $b$ (Gazzola et al., 2018). At each iteration $j$,

$$w = A v_j,\quad h_{i,j} = v_i^* w,\quad w \leftarrow w - \sum_{i=1}^{j} v_i h_{i,j},\quad h_{j+1,j} = \|w\|,\quad v_{j+1} = w / h_{j+1,j}.$$

The process yields an orthonormal basis $V_{m+1} = [v_1, \ldots, v_{m+1}]$ and an upper Hessenberg matrix $H_{m+1,m}$ satisfying $A V_m = V_{m+1} H_{m+1,m}$.
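In floating point, the recurrence above is typically implemented with modified Gram–Schmidt. A minimal NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def arnoldi(A, b, m):
    """Build an orthonormal basis V of K_m(A, b) and the (m+1) x m
    upper Hessenberg matrix H satisfying A V[:, :m] = V H."""
    n = b.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):            # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:           # breakdown: invariant subspace found
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

The factorisation can be checked directly: the columns of `V` are orthonormal and `A @ V[:, :m]` equals `V @ H` to rounding error.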

Extensions exist for functions of matrices (e.g., rational functions, shifted-inverse forms), for approximating matrix exponentials, and for constructing rational Krylov spaces when denominator structure or regularisation is imposed (Chen et al., 2023, Brezinski et al., 2010).

2. Arnoldi-Based Minimisation Frameworks

2.1 Linear and Regularised Least-Squares

For linear systems $Ax = b$, the GMRES algorithm seeks $x_m \in \mathcal{K}_m(A,b)$ minimising $\|Ax - b\|_2$ (Gazzola et al., 2018), which reduces to a small least-squares problem over $y \in \mathbb{R}^m$ using the Arnoldi basis: $$y_m = \arg\min_{y} \|H_{m+1,m}\, y - \beta e_1\|_2,\quad x_m = V_m y_m,$$ where $\beta = \|b\|_2$. Arnoldi-based variants regularise the subspace problem, for example:

  • Arnoldi–Tikhonov: $\arg\min_y \|H_{m+1,m}\, y - \beta e_1\|_2^2 + \lambda \|y\|_2^2$.
  • Arnoldi–TSVD: truncates small singular values of $H_{m+1,m}$ to stabilise solutions of ill-posed problems.
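A minimal sketch of the projected solve, with the Arnoldi factorisation built inline; setting `lam > 0` appends the Tikhonov term to the small least-squares problem (names are illustrative, and breakdown handling is omitted):

```python
import numpy as np

def gmres_arnoldi(A, b, m, lam=0.0):
    """Minimise ||Ax - b||_2 over K_m(A, b); with lam > 0 this gives the
    Arnoldi-Tikhonov iterate (Tikhonov term on the projected problem)."""
    n = b.shape[0]
    beta = np.linalg.norm(b)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for j in range(m):                     # Arnoldi factorisation
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    rhs = beta * np.eye(m + 1)[:, 0]       # beta * e_1
    if lam > 0.0:                          # min ||Hy - beta e1||^2 + lam ||y||^2
        H = np.vstack([H, np.sqrt(lam) * np.eye(m)])
        rhs = np.concatenate([rhs, np.zeros(m)])
    y, *_ = np.linalg.lstsq(H, rhs, rcond=None)
    return V[:, :m] @ y                    # x_m = V_m y_m
```

Because the residual is minimised over nested subspaces, $\|A x_m - b\|_2$ is non-increasing in $m$; the Tikhonov variant trades a larger residual for a smaller solution norm.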

2.2 Rational Function Approximations and Matrix Functions

Given a rational function $R(z) = D(z)^{-1} N(z)$ and a matrix $A$, one seeks $x_k \in \mathcal{K}_k(A,b)$ minimising $\|R(A)b - x_k\|_{D(A)} = \|N(A)b - D(A)x_k\|_2$ in the $D(A)^* D(A)$-norm (Chen et al., 2023). The Arnoldi-OR method extends the Arnoldi process to this setting, requiring $\nu = \max\{\deg D, \deg N\}$ extra steps and solving a $(k+\nu) \times k$ least-squares problem. The process uses only small blocks of Hessenberg matrices, avoiding explicit evaluation of high-degree matrix polynomials. For $R(z) = 1/z$, Arnoldi-OR reduces to GMRES.

Rational Arnoldi (RA) methods recast solutions of regularised systems or matrix equations as $f(Z)b$, with $Z$ a shifted-inverse or regularised operator and $f$ a rational or matrix function; the approximation is $x_m = \|b\|\, V_m f(H_m) e_1$ (Brezinski et al., 2010).

2.3 High-Dimensional Optimisation Under Noisy Data

The Stochastic Arnoldi Method (SAM) addresses minimisation of $f(x)$ in high-dimensional, noisy settings (Hicken et al., 2015). Arnoldi sampling, via finite-difference Hessian-vector approximations of noisy gradients, constructs a low-rank approximation of the Hessian in the subspace spanned by the dominant Ritz vectors. A quadratic trust-region model is built in this subspace, using either step-averaged or directional-derivative-based linear terms, and trust-region subproblem solvers (e.g., Moré–Sorensen) are employed to update the iterates.
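The Arnoldi-sampling ingredient can be sketched as follows: Hessian-vector products are approximated by central differences of the gradient, and the standard Arnoldi loop is run on that implicit operator to obtain a projected Hessian. This is a minimal, noise-free illustration with names of our choosing; SAM itself adds the trust-region model and noise safeguards described above.

```python
import numpy as np

def hessvec_fd(grad, x, v, alpha=1e-5):
    """Central-difference approximation of the Hessian-vector product
    H(x) v; alpha plays the role of the sampling radius."""
    return (grad(x + alpha * v) - grad(x - alpha * v)) / (2.0 * alpha)

def arnoldi_sample(grad, x, d0, m, alpha=1e-5):
    """Arnoldi loop on the implicit Hessian at x, started from direction d0.
    Returns the basis V_m and the projected m x m Hessian approximation."""
    n = d0.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = d0 / np.linalg.norm(d0)
    for j in range(m):
        w = hessvec_fd(grad, x, V[:, j], alpha)
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-10:           # invariant subspace reached
            break
        V[:, j + 1] = w / H[j + 1, j]
    return V[:, :m], H[:m, :m]
```

On a noise-free quadratic the projected matrix reproduces the Hessian spectrum; with noisy gradients only the dominant Ritz pairs are trusted.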

2.4 Polynomial and Rational Least-Squares, Structured Fitting

In polynomial, rational, and Sobolev least-squares problems, Arnoldi orthogonalisation replaces the ill-conditioned Vandermonde (or Cauchy) coefficient matrix with a well-conditioned Krylov subspace basis, reducing the overdetermined system

$$B_{n+1} c \approx W f$$

to a small, well-conditioned upper-triangular (or block pentadiagonal) system (Faghih et al., 2024). For multivariate polynomial and Hermite least-squares, the G-Arnoldi method constructs a $G$-orthonormal basis appropriate for weighted and application-driven inner products (Zhang et al., 2024).
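For the discrete polynomial case, the idea can be illustrated by the Vandermonde-with-Arnoldi procedure: the monomial columns $1, x, x^2, \ldots$ are orthogonalised on the sample points as they are generated, and the stored recurrence coefficients are reused to evaluate the fit at new points (a minimal sketch with illustrative names):

```python
import numpy as np

def polyfit_arnoldi(x, f, n):
    """Degree-n least-squares polynomial fit via Vandermonde-with-Arnoldi:
    never forms the ill-conditioned Vandermonde matrix explicitly."""
    m = len(x)
    Q = np.zeros((m, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = 1.0
    for k in range(n):
        w = x * Q[:, k]                   # "apply the operator": multiply by x
        for i in range(k + 1):
            H[i, k] = Q[:, i] @ w / m
            w = w - H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w) / np.sqrt(m)
        Q[:, k + 1] = w / H[k + 1, k]
    c, *_ = np.linalg.lstsq(Q, f, rcond=None)
    return c, H

def polyval_arnoldi(c, H, s):
    """Evaluate the fitted polynomial at new points s by re-running
    the recurrence stored in H."""
    n = H.shape[1]
    W = np.zeros((len(s), n + 1))
    W[:, 0] = 1.0
    for k in range(n):
        w = s * W[:, k]
        for i in range(k + 1):
            w = w - H[i, k] * W[:, i]
        W[:, k + 1] = w / H[k + 1, k]
    return W @ c
```

The short recurrence in `H` is exactly the discrete orthogonal-polynomial recurrence referred to above, so evaluation is both fast and stable.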

2.5 Aggregation and Model Reduction for Markov Chains

Arnoldi aggregation projects the evolution of a Markov chain, defined by $p(t)^\top = p(0)^\top \exp(Qt)$, onto the Krylov subspace generated by $p(0)^\top$ and $Q$, providing a low-dimensional matrix $H_j$ governing the projected dynamics. The approximation $p_{\mathrm{approx}}(t)^\top = \beta e_1^\top \exp(t H_j) V_j^\top$ is of minimal dimension, provably exact once an invariant subspace is reached, and allows for sharp, residual-based error estimation (Sonnentag et al., 4 Aug 2025).
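A minimal sketch of this projection, assuming SciPy's `expm` for the small matrix exponential (names are illustrative). The generator $Q$ enters only through transpose-vector products:

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_action(Q, p0, t, j):
    """Approximate p(t)^T = p(0)^T exp(Qt) via the Krylov subspace
    K_j(Q^T, p0): p_approx = beta * V_j exp(t H_j) e_1."""
    beta = np.linalg.norm(p0)
    n = p0.shape[0]
    V = np.zeros((n, j + 1))
    H = np.zeros((j + 1, j))
    V[:, 0] = p0 / beta
    for k in range(j):
        w = Q.T @ V[:, k]
        for i in range(k + 1):
            H[i, k] = V[:, i] @ w
            w = w - H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-12:           # invariant subspace: projection exact
            j = k + 1
            break
        V[:, k + 1] = w / H[k + 1, k]
    return beta * (V[:, :j] @ expm(t * H[:j, :j])[:, 0])
```

Once an invariant subspace is hit (or j reaches the full dimension) the projection is exact, matching the exactness property above; in practice j is kept small and grown until a residual-based estimate is satisfied.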

3. Key Algorithmic and Structural Elements

Arnoldi-based minimisation methods share several algorithmic features, often tailored for specific problem classes:

  • Orthonormal Basis Construction: Modified Gram–Schmidt, with application-aware inner products (e.g., $G$-inner products for PDEs or data weighting).
  • Small Projected Systems: The main minimisation problem in the full space is projected to a low-dimensional system (Hessenberg, block, or banded), exploiting the structure of $V_m$ and $H_m$ (or $(H,K)$ pencils in rational or confluent cases).
  • Implicit Function Action: Matrix functions $f(A)$, rational functions, or exponentials are applied in the reduced subspace via $f(H_m)$, circumventing the need for full-matrix computations.
  • Finite-Difference/Subsampled Approximations: For non-explicit operators (e.g., Hessian in optimisation), sampled Hessian-vector products or directional derivatives are used.
  • Short Recurrences: For least-squares and fitting, Arnoldi-generated discrete orthogonal basis functions satisfy explicit recurrences, accelerating evaluation and stability (Zhang et al., 2024).
  • Residual and Stopping Criteria: Residuals in the last Arnoldi direction or reductions in model error (e.g., trust-region model vs. actual function decrement) provide natural stopping rules and adaptivity (Sonnentag et al., 4 Aug 2025, Hicken et al., 2015).

4. Theoretical Properties and Numerical Behavior

Table: Principal Theoretical Features and Classes

| Class | Optimality/Exactness | Conditioning | Error Analysis |
|---|---|---|---|
| GMRES / Linear system (Gazzola et al., 2018) | $\ell_2$-minimal residual in subspace | Improved by preconditioning | Spectrum, normality, and projections |
| Arnoldi-OR / Rational function (Chen et al., 2023) | $D(A)^*D(A)$-minimal residual | Avoids explicit $R(A)$, uses residual norm | Crouzeix–Palencia, Ritz values |
| SAM / Stochastic optimisation (Hicken et al., 2015) | Trust-region quadratic in dominant subspace | Low-rank, robust to noise | Bounded/truncated noise, empirical |
| Polynomial/Rational LS (Faghih et al., 2024) | LS-optimal in polynomial/rational Krylov | Dramatically stabilised vs. Vandermonde | Orthogonal polynomials/rationals |
| Arnoldi aggregation / Markov (Sonnentag et al., 4 Aug 2025) | Provably exact on invariant subspace | Minimal dimension for accurate expansion | Residual bounds, superlinear decay |

For linear problems, minimisation in the Arnoldi Krylov subspace is classically optimal with respect to the chosen norm, with small projected systems advantageous for both accuracy and efficiency. Preconditioning and regularisation within the Arnoldi framework have proven effective at mitigating ill-posedness and semi-convergence (Gazzola et al., 2018, Brezinski et al., 2010). For rational approximations and matrix functions, sharp spectral-set bounds and the limitations of eigenvalue-based heuristics are established (Chen et al., 2023).

In stochastic or high-dimensional optimisation, the robustness of the model to noise is contingent on spectral decay of the Hessian and subspace parameter choices (Hicken et al., 2015). Empirically, Arnoldi-based minimisation demonstrates accelerated reduction of error and residual norm relative to naive approaches, particularly in state-space reduction and data-fitting settings (Sonnentag et al., 4 Aug 2025, Faghih et al., 2024, Zhang et al., 2024).

5. Parameter Selection, Limitations, and Practical Guidelines

Parameter choices are application-dependent, but the following guidelines emerge across classes:

  • Arnoldi subspace dimension ($m$, $k$): 4–10 for optimisation; 2–3 times larger for stable eigen-estimates (spectrum estimation for the Hessian); or grow until diminishing returns in model error/residual.
  • Sampling radius in optimisation ($\alpha$): Chosen to match the expected trust-region scale or the noise level (Hicken et al., 2015).
  • Preconditioning: Application-specific; circulant or low-rank operators can be derived from initial Arnoldi samples (Gazzola et al., 2018).
  • Orthogonality and Stability: Re-orthogonalisation (“twice is enough”) can be employed in polynomial bases at the cost of increased flops (Faghih et al., 2024).
  • Choice of weights and poles (rational/polynomial LS): In weighted least-squares, quadrature-based weights enhance accuracy; adaptive pole selection for rational Arnoldi remains an open research direction (Faghih et al., 2024).
  • Computational costs: Each Arnoldi step involves a matrix-vector product and orthogonalisation (dominating cost for large systems).

Limitations include finite-precision breakdown for very large degrees or insufficiently decaying spectra, potential loss of orthogonality, and (in rational settings) sensitivity to pole placement.

6. Impact and Applications Across Disciplines

Arnoldi-based minimisation methods are widely adopted in scientific computing, optimisation, statistical inference, inverse problems, signal processing, uncertainty quantification, and model reduction.

The unifying feature of Arnoldi-based minimisation is the reformation of diverse but challenging minimisation tasks into well-conditioned, low-dimensional subspace problems, enabled by the algebraic and numerical properties of Krylov sequences and Arnoldi-type recurrences.
