
Krylov Solvers

Updated 1 July 2025
  • Krylov solvers are iterative algorithms that approximate solutions to linear systems and eigenvalue problems by working within Krylov subspaces generated by matrix-vector products.
  • These methods are essential for efficiently solving large and sparse linear systems across computational science, engineering, and data analysis domains.
  • Modern Krylov solvers combine preconditioning, parallelization, and algorithmic enhancements such as deflation and flexible inner-outer iteration to improve robustness and performance.

Krylov solvers are a class of iterative algorithms for solving linear systems, eigenvalue problems, and related tasks that exploit the subspace generated by repeated applications of a matrix to a vector. Fundamentally, these methods construct approximate solutions within the so-called Krylov subspace, allowing efficient computation in high-dimensional settings, especially for large, sparse, or structured systems. Krylov subspace approaches underpin many of the most widely used numerical methods in computational science, engineering, and data analysis.

1. Mathematical Principles and Core Algorithms

Krylov subspace methods generate a sequence of approximations to the solution of a linear system $Ax = b$ (or related spectral problems) using subspaces of the form

$$\mathcal{K}_m(A, r_0) = \mathrm{span}\{ r_0, Ar_0, A^2 r_0, \ldots, A^{m-1} r_0 \},$$

where $r_0 = b - Ax_0$ is the initial residual. Each algorithm expands this basis in a manner appropriate to the matrix's properties and the problem's requirements.
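
To make the construction concrete, here is a minimal NumPy sketch (an illustration, not code from the cited papers) that builds an orthonormal basis of $\mathcal{K}_m(A, r_0)$ via the Arnoldi process:

```python
import numpy as np

def arnoldi(A, r0, m):
    """Build an orthonormal basis Q of K_m(A, r0) and the Hessenberg
    projection H with A @ Q[:, :m] = Q @ H (Arnoldi, modified Gram-Schmidt)."""
    n = r0.shape[0]
    Q = np.zeros((n, m + 1))                 # orthonormal basis vectors
    H = np.zeros((m + 1, m))                 # upper Hessenberg projection
    Q[:, 0] = r0 / np.linalg.norm(r0)
    for j in range(m):
        w = A @ Q[:, j]                      # one matvec per step
        for i in range(j + 1):               # orthogonalize against the basis
            H[i, j] = Q[:, i] @ w
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:              # happy breakdown: invariant subspace
            return Q[:, :j + 1], H[:j + 1, :j + 1]
        Q[:, j + 1] = w / H[j + 1, j]
    return Q, H
```

GMRES minimizes the residual over this basis by solving a small least-squares problem with $H$; when $A$ is symmetric, the recurrence shortens to the three-term Lanczos form that CG and MINRES exploit.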

Prominent Krylov algorithms include:

  • Conjugate Gradient (CG): For symmetric positive definite (SPD) matrices, generating iterates via short recurrences and minimizing the energy ($A$-)norm of the error.
  • GMRES (Generalized Minimal Residual): For general nonsymmetric matrices, minimizing the residual over the full Krylov space via Arnoldi iteration, but requiring storage of all basis vectors.
  • MINRES and SYMMLQ: For symmetric indefinite systems.
  • BiCGSTAB, CGS, and QMR: For nonsymmetric systems, employing biorthogonalization with short recurrences.

These methods adapt readily to preconditioning and acceleration, and admit block and flexible (varying-preconditioner) variants, broadening their applicability.
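
As a usage sketch, the snippet below runs SciPy's stock CG and GMRES on a toy shifted 1-D Laplacian; it assumes SciPy >= 1.12, where the tolerance keyword is rtol (older releases use tol):

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import cg, gmres

# Shifted 1-D Laplacian: tridiagonal, SPD, and strictly diagonally
# dominant, so both solvers converge quickly on this toy problem.
n = 1000
A = diags([-1.0, 2.5, -1.0], offsets=[-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

x_cg, info_cg = cg(A, b, rtol=1e-8)                 # SPD path, short recurrences
x_gm, info_gm = gmres(A, b, rtol=1e-8, restart=50)  # general path, restarted

print(np.linalg.norm(b - A @ x_cg), np.linalg.norm(b - A @ x_gm))
```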

2. Preconditioning and Scalability

Preconditioning is essential for Krylov methods when the system is ill-conditioned or when rapid convergence is required. The central idea is to transform the original system into one with more favorable spectral properties.

  • Conventional Preconditioners: Jacobi (diagonal), symmetric successive overrelaxation (SSOR), and incomplete LU/Cholesky factorizations are classic choices (1408.1237, 1604.07491); a sketch of ILU-preconditioned GMRES follows this list.
  • Flexible Preconditioning: Algorithms such as FGMRES allow the preconditioner to change between iterations, supporting inexact or variable preconditioning (e.g., where the preconditioning step is itself only an approximate solve at each iteration) (1408.1237).
  • Inner-Iteration Preconditioning: Short stationary iterative sweeps (e.g., SOR, SSOR) are used as inexpensive preconditioners within each Krylov iteration, which is beneficial for highly ill-conditioned systems and can be applied matrix-free (1604.07491).
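
To illustrate the first bullet, the sketch below (same SciPy assumption as earlier; the test matrix is an arbitrary diagonally dominant example, not from the cited papers) builds an incomplete LU factorization with spilu and passes its solve to GMRES as a preconditioner:

```python
import numpy as np
from scipy.sparse import random as sprand, eye
from scipy.sparse.linalg import gmres, spilu, LinearOperator

# Arbitrary nonsymmetric sparse system, shifted to be diagonally
# dominant so the incomplete factorization is stable.
n = 2000
A = (sprand(n, n, density=5.0 / n, random_state=0) + 4.0 * eye(n)).tocsc()
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4, fill_factor=10)   # incomplete LU factors of A
M = LinearOperator((n, n), matvec=ilu.solve)    # applies the action z = M^{-1} r

x, info = gmres(A, b, M=M, rtol=1e-8, restart=30)
print("converged" if info == 0 else f"stopped early, info={info}")
```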

Scalability considerations include:

  • Exploitation of fast matrix-vector products (e.g., in kernel methods) to maintain $O(N \log N)$ or $O(N^2/p)$ computational scaling (1408.1237).
  • Matrix-free implementations to avoid constructing or storing large dense preconditioners in applications such as kernel regression and large linear programming (1408.1237, 1604.07491); see the matrix-free sketch after this list.
  • Use of domain decomposition, overlapping Schwarz methods, and communication-efficient matrix formats to extend Krylov methods to large parallel or GPU-based environments (1606.00545).
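
As a matrix-free sketch, the hypothetical operator $A = I + LL^\top$ below is exposed to CG only through its matvec, so the $n \times n$ matrix is never formed:

```python
import numpy as np
from scipy.sparse.linalg import cg, LinearOperator

# Hypothetical SPD operator A = I + L @ L.T, applied without forming A:
# each product costs O(n k) instead of the O(n^2) of a dense matvec.
n, k = 100_000, 20
rng = np.random.default_rng(0)
L = rng.standard_normal((n, k))

A = LinearOperator((n, n), matvec=lambda v: v + L @ (L.T @ v), dtype=np.float64)
b = np.ones(n)

# The spectrum is 1 (multiplicity n - k) plus k large eigenvalues, so CG
# converges in roughly k + 1 iterations despite the huge dimension.
x, info = cg(A, b, rtol=1e-8)
```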

3. Algorithmic Enhancements, Deflation, and Flexible Strategies

Modern Krylov approaches feature several enhancements designed to improve robustness, memory efficiency, and parallel scalability:

  • Deflation/Restarting: Recycling or deflating Krylov subspace components associated with slow-to-converge spectral elements (e.g., harmonic Ritz vectors) restores superlinear convergence in restarted GMRES and flexible GMRES variants; this is especially useful in high-order CFD, optimization, and ill-conditioned regimes (2404.17870).
  • Inner-Outer (Nested) Krylov Solvers: The outer Krylov method is preconditioned with an inner Krylov solver, permitting strong preconditioning without constructing explicit preconditioners and allowing the use of iterative solves as preconditioners (1408.1237, 2404.17870); a nested-solver sketch follows this list.
  • Flexible and Adaptive Tolerances: Tuning inner solver tolerances and preconditioner parameters (e.g., in flexible preconditioning or truncated inner solves) provides control over computational cost and convergence behavior (1408.1237, 2404.17870).
  • Preconditioning in Energy or Special Problem Structure: For transport, multigrid-in-energy preconditioning addresses bottlenecks specific to discretization structure, allowing scaling to extreme problem sizes (1612.00907).
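
A sketch of the nested idea, with caveats: a fixed number of loose inner CG sweeps serves as the preconditioner for an outer GMRES solve. Strictly speaking, a varying inner solve makes the preconditioner non-stationary and calls for a flexible method such as FGMRES, so treat this as an illustration rather than a rigorous implementation:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres, cg, LinearOperator

n = 5000
A = diags([-1.0, 2.01, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)

def inner_solve(r):
    # Loose inner solve of A z = r: ten CG sweeps approximate A^{-1} r.
    z, _ = cg(A, r, rtol=1e-2, maxiter=10)
    return z

M = LinearOperator((n, n), matvec=inner_solve)   # inner Krylov as preconditioner
x, info = gmres(A, b, M=M, rtol=1e-8, restart=20)
```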

4. High-Performance and Parallel Implementations

Krylov methods are amenable to acceleration on modern hardware including multi-core CPUs and GPUs:

  • Matrix Format and Kernel Optimization: Efficient sparse matrix-vector multiplication (SpMV), often the most expensive operation, is optimized through hybrid formats (HEC: ELL+CSR) and coalesced memory access on GPUs (1606.00545); a format-comparison sketch follows this list.
  • Partitioning and Overlap: Partitioning matrices (by METIS, for example) and introducing overlap regions minimize inter-GPU communication. Host-managed caches for inter-device vector communication support scaling to multiple accelerators (1606.00545).
  • Parallel Preconditioner Solvers: Highly parallel triangular solvers using level scheduling improve the throughput of ILU and block-preconditioned steps (1606.00545).
  • Performance Tradeoffs: GPU and parallel acceleration yield speedups ranging from $10\times$ to $28\times$ versus CPU baselines, with a trade-off between preconditioner strength (and hence convergence) and raw speed; more parallel-friendly preconditioners may require more iterations but exploit hardware better (1606.00545).
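
GPU kernels and the HEC hybrid format are beyond a few lines of Python, but the underlying point, that SpMV throughput hinges on storage format, shows up even on a CPU; the sketch below times products in CSR against the assembly-oriented LIL format:

```python
import time
import numpy as np
from scipy.sparse import random as sprand

n = 200_000
A_csr = sprand(n, n, density=10.0 / n, format='csr', random_state=0)
A_lil = A_csr.tolil()                 # assembly-oriented format
v = np.ones(n)

# Ten products per format: CSR streams contiguous row data, while the
# LIL product takes a much slower path, dominating each Krylov iteration.
for name, A in [("csr", A_csr), ("lil", A_lil)]:
    t0 = time.perf_counter()
    for _ in range(10):
        _ = A @ v
    print(name, time.perf_counter() - t0)
```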

5. Practical Applications and Empirical Performance

Krylov solvers are central to simulation and optimization across scientific and engineering domains:

  • Kernel Regression and Kriging: Preconditioned, matrix-free flexible Krylov approaches enable scalable Gaussian process regression and spatial kriging, achieving up to $3$–$5\times$ speedups and orders-of-magnitude reductions in iteration counts for problems with $N \sim 10^5$–$10^6$ (1408.1237).
  • Linear Programming and Optimization: Inner-iteration preconditioned Krylov solvers outperform classical (e.g., SeDuMi, SDPT3) and even advanced direct solvers on large and ill-conditioned LPs, demonstrating resilience to rank-deficiency and attaining competitive or superior robustness and efficiency (1604.07491).
  • High-Performance Solvers on GPUs: BiCGSTAB, GMRES, CG (with preconditioning) and algebraic multigrid (AMG) methods implemented on GPUs deliver order-of-magnitude acceleration for physical simulation and PDE discretizations, with the ability to scale to billions of unknowns (1606.00545).
  • Preconditioning in Energy for Radiation Transport: Multigrid-in-energy preconditioning for Krylov solvers in neutron transport codes dramatically reduces iteration counts and supports scaling to hundreds of thousands of cores, facilitating new scientific calculations (1612.00907).

Table: Krylov Preconditioners for Large-Scale Problems

| Application | Preconditioning Strategy | Performance/Scalability Outcome |
| --- | --- | --- |
| Kernel Regression | Regularized kernel copy, FGMRES | $3$–$5\times$ faster; scales to $N > 10^5$; little extra memory |
| LP/Optimization | Stationary inner iteration, AB-GMRES/MRNE | More robust than direct/iterative baselines; solves ill-conditioned, rank-deficient systems |
| PDEs on GPUs | ILU($k$), RAS, parallel triangular solves | BiCGSTAB/GMRES up to $28\times$ speedup; AMG $2$–$10\times$ |
| Neutron Transport | Multigrid in energy, right preconditioning | $2$–$10\times$ fewer Krylov iterations; linear scaling in energy sets |

6. Implementation Considerations and Best Practices

Effective use of Krylov solvers in large-scale and ill-conditioned problems involves:

  • Matrix-free implementations whenever possible, especially where fast matrix-vector products are available.
  • Choosing preconditioner parameters (e.g., the regularizer $\delta$ or truncation tolerance $\epsilon$) based on an empirical balance between preconditioner cost and improved convergence (1408.1237).
  • Dynamically adjusting solver and preconditioner tolerances to match the current stage of a nonlinear or optimization iteration (1604.07491); a sketch follows this list.
  • Leveraging hardware-specific matrix formats, kernel optimizations, and load-balanced communication for high-end parallel execution (1606.00545, 1612.00907).
  • Employing deflation, flexible, or recycling strategies for long-horizon or repeatedly solved systems (e.g., time-dependent, parametric, or optimization inner loops).
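
As a sketch of the adaptive-tolerance bullet above, the inexact-Newton loop below ties the inner GMRES tolerance to the current outer residual, in the spirit of Eisenstat-Walker forcing terms; the forcing formula and demo problem are illustrative choices, not taken from the cited papers:

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import gmres

def solve_nonlinear(F, J, x, outer_tol=1e-8, max_outer=50):
    """Inexact Newton loop: the inner Krylov tolerance tracks the outer
    residual (a simplified Eisenstat-Walker-style forcing term)."""
    for _ in range(max_outer):
        r = F(x)
        rnorm = np.linalg.norm(r)
        if rnorm < outer_tol:
            break
        eta = min(0.5, np.sqrt(rnorm))        # loose inner tol far from solution
        dx, _ = gmres(J(x), -r, rtol=eta, restart=30)
        x = x + dx
    return x

# Tiny demo: F(x) = A x + x**3 - b with Jacobian A + diag(3 x**2).
n = 500
A = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format='csr')
b = np.ones(n)
x = solve_nonlinear(lambda x: A @ x + x**3 - b,
                    lambda x: A + diags(3.0 * x**2),
                    np.zeros(n))
print(np.linalg.norm(A @ x + x**3 - b))
```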

In summary, Krylov solvers constitute a versatile class of algorithms whose modern incarnations leverage flexible preconditioning, hardware acceleration, and advanced algorithmic techniques to efficiently solve large, challenging linear systems in scientific and data-intensive applications. Their continued development and effective tuning remain at the core of scalable computational mathematics.