Arnoldi-Based Minimisation Method
- Arnoldi-based minimisation method is a Krylov subspace technique that constructs an orthonormal basis to transform complex minimisation problems into well-conditioned, smaller projected systems.
- It leverages Arnoldi iterations to reduce high-dimensional linear and inverse problems, facilitating efficient and stable least-squares and regularised solutions.
- The method enables robust optimisation under noisy and ill-posed conditions, with applications in data fitting, model reduction, and stochastic optimisation.
The Arnoldi-based minimisation method encompasses a spectrum of techniques exploiting Arnoldi's process within Krylov subspace frameworks for solving minimisation problems associated with linear systems, rational matrix functions, ill-posed inverse problems, high-dimensional optimisation, least-squares fitting, and model reduction. Central to all such methods is the use of the Arnoldi iteration to construct an orthonormal basis for Krylov subspaces, providing a well-conditioned coordinate system for efficient, reliable subspace-projected minimisation. These techniques have demonstrated significant efficacy for large-scale, high-dimensional, and/or ill-conditioned problems, especially where explicit operators (e.g., Hessians) are unavailable, or in the presence of noise and numerical instability.
1. The Arnoldi Process: Generation of Krylov Subspaces
The Arnoldi process generates an orthonormal basis for the Krylov subspace $\mathcal{K}_m(A, b) = \operatorname{span}\{b, Ab, \dots, A^{m-1}b\}$ associated with a matrix $A$ and a starting vector $b$ (Gazzola et al., 2018). At each iteration $j$, the new direction $Av_j$ is orthogonalised against the current basis vectors and normalised.
The process yields an orthonormal basis $V_{m+1} = [v_1, \dots, v_{m+1}]$ and an upper Hessenberg matrix $\bar{H}_m \in \mathbb{R}^{(m+1)\times m}$ satisfying the Arnoldi relation $A V_m = V_{m+1} \bar{H}_m$.
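The process above can be sketched in a few lines of NumPy (a minimal illustration using the modified Gram–Schmidt variant; function and variable names are illustrative):

```python
import numpy as np

def arnoldi(A, b, m):
    """Orthonormal basis V of K_{m+1}(A, b) and (m+1) x m Hessenberg H
    satisfying A V[:, :m] = V @ H (modified Gram-Schmidt variant)."""
    n = b.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):           # orthogonalise against earlier basis
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:          # breakdown: invariant subspace reached
            return V[:, :j + 1], H[:j + 1, :j + 1]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

Each step costs one matrix-vector product plus orthogonalisation against all previous basis vectors, which is why the basis is usually kept small.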
Extensions exist for functionals of matrices (e.g., rational functions, shifted-inverse forms), for approximating matrix exponentials, and for constructing rational Krylov spaces when denominator structures or regularisation are imposed (Chen et al., 2023, Brezinski et al., 2010).
2. Arnoldi-Based Minimisation Frameworks
2.1 Linear and Regularised Least-Squares
For linear systems $Ax = b$, the GMRES algorithm seeks $x_m \in \mathcal{K}_m(A, b)$ minimising $\|b - A x_m\|_2$ (Gazzola et al., 2018), which reduces to a small least-squares problem over $y \in \mathbb{R}^m$ using the Arnoldi basis: $\min_y \|\beta e_1 - \bar{H}_m y\|_2$ with $x_m = V_m y$ and $\beta = \|b\|_2$. Arnoldi-based variants regularise the subspace problem, for example:
- Arnoldi–Tikhonov: $\min_y \|\beta e_1 - \bar{H}_m y\|_2^2 + \lambda \|y\|_2^2$.
- Arnoldi–TSVD: Truncates small singular values of $\bar{H}_m$ to stabilise solutions to ill-posed problems.
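The projected least-squares step, with an optional Tikhonov term, can be sketched as follows (a simplified illustration, not a production GMRES; setting `lam=0` recovers the plain GMRES projection):

```python
import numpy as np

def arnoldi_tikhonov(A, b, m, lam=0.0):
    """Project Ax = b onto K_m(A, b), solve the small least-squares problem
    min_y ||beta*e1 - Hbar y||^2 + lam*||y||^2, and return x = V_m y."""
    beta = np.linalg.norm(b)
    V = np.zeros((b.size, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    k = m
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):                    # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12 * beta:            # happy breakdown
            k = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    Hbar, Vk = H[:k + 1, :k], V[:, :k]
    rhs = np.zeros(k + 1)
    rhs[0] = beta
    if lam > 0.0:                                 # append sqrt(lam)*I rows
        Hbar = np.vstack([Hbar, np.sqrt(lam) * np.eye(k)])
        rhs = np.concatenate([rhs, np.zeros(k)])
    y, *_ = np.linalg.lstsq(Hbar, rhs, rcond=None)
    return Vk @ y
```

The regularised problem is solved by stacking $\sqrt{\lambda}\,I$ below the Hessenberg block, a standard way to express Tikhonov as an augmented least-squares system.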
2.2 Rational Function Approximations and Matrix Functions
Given a rational function $r$ and a matrix $A$, one seeks the Krylov subspace element minimising the residual of $r(A)b$ in the $2$-norm (Chen et al., 2023). The Arnoldi-OR method extends Arnoldi to this setting, requiring additional Arnoldi steps (tied to the denominator degree) and solving a small least-squares problem. The process uses only small blocks of Hessenberg matrices, avoiding explicit evaluation of high-degree matrix polynomials. For $r(z) = 1/z$, i.e. linear systems, Arnoldi-OR reduces to GMRES.
Rational Arnoldi methods (RA) recast solutions of regularised systems or matrix equations as $x = f(A)\,b$, with $A$ a shifted-inverse or regularised operator and $f$ a rational or matrix function (Brezinski et al., 2010).
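The projected evaluation of a matrix-function action can be illustrated as below (a sketch assuming the small Hessenberg block is diagonalisable; `funm_small` is an illustrative helper, not a library routine):

```python
import numpy as np

def funm_small(M, f):
    """f(M) for a small diagonalisable matrix via eigendecomposition
    (illustrative helper; assumes M is diagonalisable)."""
    lam, W = np.linalg.eig(M)
    return (W @ np.diag(f(lam)) @ np.linalg.inv(W)).real

def arnoldi_funm(A, b, m, f):
    """Projected matrix-function action: f(A) b ~ beta * V_m f(H_m) e_1."""
    beta = np.linalg.norm(b)
    V = np.zeros((b.size, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = b / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if j < m - 1:                  # last basis vector is not needed here
            V[:, j + 1] = w / H[j + 1, j]
    return beta * V[:, :m] @ funm_small(H[:m, :m], f)[:, 0]
```

Only the small $m \times m$ block is ever passed to $f$; the full $f(A)$ is never formed.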
2.3 High-Dimensional Optimisation Under Noisy Data
The Stochastic Arnoldi Method (SAM) addresses optimisation of a noisy objective $f(x)$ in high-dimensional settings (Hicken et al., 2015). Arnoldi sampling, via finite-difference Hessian-vector approximations of noisy gradients, constructs a low-rank approximation of the Hessian in the subspace spanned by dominant Ritz vectors. A quadratic trust-region model is built in this subspace, using either step-averaged or directional-derivative-based linear parts, and trust-region subspace methods (e.g., Moré–Sorensen) are employed to update iterates.
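A simplified sketch of the Arnoldi-sampling idea, estimating dominant Hessian eigenpairs from finite-difference Hessian-vector products of a gradient oracle (the full SAM method embeds this inside a trust-region loop; all names here are illustrative):

```python
import numpy as np

def arnoldi_hessian_sample(grad, x, m, eps=1e-4, rng=None):
    """Estimate dominant Hessian eigenpairs of a (possibly noisy) objective
    from finite-difference Hessian-vector products
        H v ~ (grad(x + eps*v) - grad(x)) / eps.
    Returns Ritz values and Ritz vectors; simplified relative to full SAM."""
    if rng is None:
        rng = np.random.default_rng(0)
    g0 = grad(x)
    hessvec = lambda v: (grad(x + eps * v) - g0) / eps
    n = x.size
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    v0 = rng.standard_normal(n)
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = hessvec(V[:, j])
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if j < m - 1:
            V[:, j + 1] = w / H[j + 1, j]
    Hs = 0.5 * (H[:m, :m] + H[:m, :m].T)   # symmetrise the projected Hessian
    ritz, Y = np.linalg.eigh(Hs)
    return ritz, V[:, :m] @ Y
```

The returned Ritz pairs define the low-dimensional subspace in which a quadratic trust-region model would then be built.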
2.4 Polynomial and Rational Least-Squares, Structured Fitting
In polynomial, rational, and Sobolev least-squares problems, Arnoldi orthogonalisation replaces the ill-conditioned Vandermonde (or Cauchy) coefficient matrix with a Krylov subspace basis, reducing the overdetermined system $\min_c \|Vc - f\|_2$
to a small, well-conditioned upper-triangular (or block pentadiagonal) system (Faghih et al., 2024). For multivariate polynomial and Hermite least-squares, the G-Arnoldi method constructs a $G$-orthonormal basis appropriate for weighted and application-driven inner products (Zhang et al., 2024).
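The Vandermonde-with-Arnoldi idea can be sketched as follows (an illustrative implementation using a discrete inner product; not the exact algorithm of the cited papers):

```python
import numpy as np

def polyfit_arnoldi(x, f, deg):
    """Least-squares polynomial fit using a Krylov (Arnoldi) basis in place
    of the ill-conditioned Vandermonde matrix. Returns coefficients c and
    the Hessenberg recurrence H needed to evaluate the fit elsewhere."""
    M = x.size
    Q = np.zeros((M, deg + 1))
    H = np.zeros((deg + 1, deg))
    Q[:, 0] = 1.0
    for k in range(deg):
        q = x * Q[:, k]                   # multiply by x: next Krylov vector
        for j in range(k + 1):
            H[j, k] = Q[:, j] @ q / M     # discrete inner product
            q = q - H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(q) / np.sqrt(M)
        Q[:, k + 1] = q / H[k + 1, k]
    c, *_ = np.linalg.lstsq(Q, f, rcond=None)
    return c, H

def polyval_arnoldi(c, H, x):
    """Evaluate the fitted polynomial at new points by replaying the
    same short recurrence stored in H."""
    deg = H.shape[1]
    W = np.zeros((x.size, deg + 1))
    W[:, 0] = 1.0
    for k in range(deg):
        w = x * W[:, k]
        for j in range(k + 1):
            w = w - H[j, k] * W[:, j]
        W[:, k + 1] = w / H[k + 1, k]
    return W @ c
```

Because the basis columns are orthogonal in the discrete inner product, the projected system stays well conditioned even for degrees where the raw Vandermonde matrix would be numerically singular.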
2.5 Aggregation and Model Reduction for Markov Chains
Arnoldi aggregation projects the evolution of a Markov chain, defined by $\pi_{k+1} = \pi_k P$ for a transition matrix $P$, onto the Krylov subspace generated by $P^{\mathsf T}$ and the initial distribution, providing a low-dimensional system governing the projected dynamics. The approximation is of minimal dimension, provably exact once an invariant subspace is reached, and allows for sharp, residual-based error estimation (Sonnentag et al., 4 Aug 2025).
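A minimal sketch of the projection idea, propagating the small Hessenberg matrix instead of the full chain (illustrative only; the cited method adds residual-based error control on top of this):

```python
import numpy as np

def krylov_markov(P, p0, m, k):
    """Approximate the distribution after k steps of a Markov chain,
    p_k = (P^T)^k p0, via the projected dynamics p_k ~ beta * V_m H_m^k e_1."""
    A = P.T
    beta = np.linalg.norm(p0)
    V = np.zeros((p0.size, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = p0 / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:          # invariant subspace: projection exact
            Hm, Vm = H[:j + 1, :j + 1], V[:, :j + 1]
            e1 = np.zeros(j + 1)
            e1[0] = beta
            return Vm @ (np.linalg.matrix_power(Hm, k) @ e1)
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = beta
    return V[:, :m] @ (np.linalg.matrix_power(H[:m, :m], k) @ e1)
```

When the Krylov sequence hits an invariant subspace, the projected dynamics reproduce the chain exactly, matching the exactness property stated above.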
3. Key Algorithmic and Structural Elements
Arnoldi-based minimisation methods share several algorithmic features, often tailored for specific problem classes:
- Orthonormal Basis Construction: Modified Gram–Schmidt, with application-aware inner products (e.g., weighted or $G$-inner products for PDEs or data weighting).
- Small Projected Systems: The main minimisation problem in the full space is projected to a low-dimensional system (Hessenberg, block, or banded), exploiting the structure of $A$ and $\bar{H}_m$ (or pencils in rational or confluent cases).
- Implicit Function Action: Matrix functions, rational functions, or exponentials are applied in the reduced subspace via $f(A)b \approx \beta\, V_m f(H_m) e_1$, circumventing the need for full-matrix computations.
- Finite-Difference/Subsampled Approximations: For non-explicit operators (e.g., Hessian in optimisation), sampled Hessian-vector products or directional derivatives are used.
- Short Recurrences: For least-squares and fitting, Arnoldi-generated discrete orthogonal basis functions satisfy explicit recurrences, accelerating evaluation and stability (Zhang et al., 2024).
- Residual and Stopping Criteria: Residuals in the last Arnoldi direction or reductions in model error (e.g., trust-region model vs. actual function decrement) provide natural stopping rules and adaptivity (Sonnentag et al., 4 Aug 2025, Hicken et al., 2015).
4. Theoretical Properties and Numerical Behavior
Table: Principal Theoretical Features and Classes
| Class | Optimality/Exactness | Conditioning | Error Analysis |
|---|---|---|---|
| GMRES / Linear System (Gazzola et al., 2018) | 2-norm-minimal residual in subspace | Improved by preconditioning | Spectrum, normality, and projections |
| Arnoldi-OR / Rational Function (Chen et al., 2023) | 2-norm-minimal residual | Avoids explicit high-degree polynomials, uses residual norm | Crouzeix–Palencia, Ritz values |
| SAM / Stochastic Optimisation (Hicken et al., 2015) | Trust-region quadratic in dominant subspace | Low-rank, robust to noise | Bounded/truncated noise, empirical |
| Polynomial/Rational LS (Faghih et al., 2024) | LS-optimal in polynomial/rational Krylov | Dramatically stabilised vs. Vandermonde | Orthogonal polynomials/rationals |
| Arnoldi Aggregation / Markov (Sonnentag et al., 4 Aug 2025) | Provably exact on invariant subspace | Minimal dimension for accurate expansion | Residual bounds, superlinear decay |
For linear problems, minimisation in the Arnoldi Krylov subspace is classically optimal with respect to the chosen norm, with small projected systems advantageous for both accuracy and efficiency. Preconditioning and regularisation within the Arnoldi framework have proven effective at mitigating ill-posedness and semi-convergence (Gazzola et al., 2018, Brezinski et al., 2010). For rational approximations and matrix functions, sharp spectral-set bounds and the limitations of eigenvalue-based heuristics are established (Chen et al., 2023).
In stochastic or high-dimensional optimisation, the robustness of the model to noise is contingent on spectral decay of the Hessian and subspace parameter choices (Hicken et al., 2015). Empirically, Arnoldi-based minimisation demonstrates accelerated reduction of error and residual norm relative to naive approaches, particularly in state-space reduction and data-fitting settings (Sonnentag et al., 4 Aug 2025, Faghih et al., 2024, Zhang et al., 2024).
5. Parameter Selection, Limitations, and Practical Guidelines
Parameter choices are application-dependent but the following guidelines emerge across classes:
- Arnoldi subspace dimension $m$: 4–10 (optimisation), 2–3 times larger for stable eigen-estimates (spectrum estimation for the Hessian), or until diminishing returns in model error/residual.
- Sampling radius in optimisation: Chosen to match the expected trust-region scale or noise level (Hicken et al., 2015).
- Preconditioning: Application-specific; circulant or low-rank operators can be derived from initial Arnoldi samples (Gazzola et al., 2018).
- Orthogonality and Stability: Re-orthogonalisation (“twice is enough”) can be employed in polynomial bases at the cost of increased flops (Faghih et al., 2024).
- Choice of weights and poles (rational/polynomial LS): In weighted least-squares, quadrature-based weights enhance accuracy; adaptive pole selection for rational Arnoldi remains an open research direction (Faghih et al., 2024).
- Computational costs: Each Arnoldi step involves a matrix-vector product and orthogonalisation (dominating cost for large systems).
Limitations include finite-precision breakdown for very large degrees or insufficiently decaying spectra, potential loss of orthogonality, and (in rational settings) sensitivity to pole placement.
6. Impact and Applications Across Disciplines
Arnoldi-based minimisation methods are widely adopted in scientific computing, optimisation, statistical inference, inverse problems, signal processing, uncertainty quantification, and model reduction. Applications include:
- Efficient iterative solvers for very large, sparse, or ill-posed linear and regularised systems (Gazzola et al., 2018, Brezinski et al., 2010).
- Acceleration of objective minimisation in large-scale design under noise and uncertainty (Hicken et al., 2015).
- Rational or polynomial data fitting and least-squares for high-accuracy approximation in numerical analysis and computational physics (Faghih et al., 2024, Zhang et al., 2024).
- Reduced-order modelling for time-dependent Markov processes and chemical reaction networks (Sonnentag et al., 4 Aug 2025).
- Regularised solution of PDEs via basis adaptation to non-standard inner products and functional constraints (Zhang et al., 2024).
The unifying feature of Arnoldi-based minimisation is the reformulation of diverse and challenging minimisation tasks into well-conditioned, low-dimensional subspace problems, enabled by the algebraic and numerical properties of Krylov sequences and Arnoldi-type recurrences.