Nyström Preconditioner Methods
- The Nyström preconditioner is a spectral technique that uses low-rank approximations to efficiently precondition large, ill-conditioned positive definite matrices.
- It accelerates Krylov-subspace methods by reducing condition numbers through scalable strategies like sketching, landmark sampling, and blockwise surrogates.
- Its practical applications span PDE solvers, kernel methods, and optimization, offering order-optimal convergence with adaptive rank selection and mixed-precision techniques.
Nyström preconditioners are a class of spectral preconditioners derived from a low-rank randomized or deterministic Nyström approximation to a positive definite (or semidefinite) matrix or operator. Such preconditioners are designed to accelerate Krylov-subspace methods (e.g., Conjugate Gradient, GMRES), especially for large and ill-conditioned linear systems arising in scientific computing, optimization, and kernel-based learning. Nyström preconditioning exploits scalable low-rank approximations built via sketching, landmark sampling, or blockwise surrogates. It has been rigorously analyzed in the settings of PDE discretizations, kernel matrices, interior-point Newton systems, and general structured or unstructured SPD matrices.
1. Mathematical Principles and Construction
The core of the Nyström preconditioner is the Nyström approximation, a rank-$\ell$ positive semidefinite surrogate for a matrix $A$ constructed from a sketch matrix (Gaussian, column sampling, or structured random projections):

$$\hat{A}_{\mathrm{nys}} \;=\; Y\,(\Omega^{\top} Y)^{\dagger}\,Y^{\top} \;=\; (A\Omega)\,(\Omega^{\top} A\,\Omega)^{\dagger}\,(A\Omega)^{\top},$$

where $Y = A\Omega$, $A \in \mathbb{R}^{n \times n}$ is symmetric positive (semi)definite, and $\Omega \in \mathbb{R}^{n \times \ell}$ is the sketching matrix, with $\ell \ll n$ (Frangella et al., 2021, Garg et al., 21 Jun 2025). For kernel matrices, $\Omega$ may select landmark columns, and $\Omega^{\top} A\,\Omega$ becomes a submatrix of $A$. For operator-access settings, $\Omega$ is a Gaussian or structured random matrix.
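A minimal NumPy sketch of this construction (a hedged illustration, not a reference implementation: it assumes a dense SPD array `A`, uses a Gaussian test matrix, and includes the small stabilizing shift `nu` commonly recommended for the randomized variant):

```python
import numpy as np

def nystrom_approximation(A, ell, rng=None):
    """Rank-ell randomized Nystrom approximation of an SPD matrix A.

    Returns (U, lam) with A_nys = U @ np.diag(lam) @ U.T, lam descending.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    Omega = rng.standard_normal((n, ell))
    Omega, _ = np.linalg.qr(Omega)              # orthonormal test matrix
    Y = A @ Omega                               # ell products with A
    nu = np.finfo(Y.dtype).eps * np.linalg.norm(Y, "fro")
    Y_nu = Y + nu * Omega                       # tiny shift for numerical stability
    C = np.linalg.cholesky(Omega.T @ Y_nu)      # small ell x ell factorization
    B = np.linalg.solve(C, Y_nu.T).T            # B = Y_nu C^{-T}, so A_nys = B B^T
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    return U, np.maximum(s**2 - nu, 0.0)        # undo the shift
```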
Given $\hat{A}_{\mathrm{nys}}$, several structured spectral preconditioners may be constructed. A canonical form, for the regularized system $(A + \mu I)x = b$, is

$$P \;=\; \frac{1}{\hat\lambda_\ell + \mu}\, U\,(\hat\Lambda + \mu I)\,U^{\top} \;+\; (I - UU^{\top}),$$

where $\hat{A}_{\mathrm{nys}} = U \hat\Lambda U^{\top}$ is the (truncated) eigen-decomposition of $\hat{A}_{\mathrm{nys}}$ and $\hat\lambda_\ell$ is a reference eigenvalue (often the $\ell$-th largest, i.e., the smallest retained) (Frangella et al., 2021, Carson et al., 2022).
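Given these factors, $P^{-1}$ can be applied in $O(n\ell)$ time without ever forming an $n \times n$ matrix. A sketch continuing the notation of the previous snippet (`U`, `lam` from `nystrom_approximation`, `mu` the regularization shift):

```python
def make_nystrom_preconditioner(U, lam, mu):
    """Return a function applying P^{-1} for
    P = U (Lam + mu I) U^T / (lam[-1] + mu) + (I - U U^T)."""
    lam_ell = lam[-1]                           # smallest retained eigenvalue
    def apply_Pinv(v):
        Utv = U.T @ v
        low_rank = U @ (((lam_ell + mu) / (lam + mu)) * Utv)
        return low_rank + (v - U @ Utv)         # identity on the complement of range(U)
    return apply_Pinv
```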
Alternatively, for kernel methods, an EigenPro-style level-$q$ form is used,

$$P \;=\; I \;-\; \sum_{i=1}^{q}\left(1 - \frac{\tilde\lambda_{q+1}}{\tilde\lambda_i}\right)\tilde u_i\,\tilde u_i^{\top},$$

where $(\tilde\lambda_i, \tilde u_i)$ are the computed top-$q$ eigenpairs of the approximation (Abedsoltan et al., 2023).
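As a brief illustration, applying this preconditioner costs $O(nq)$ per vector (a sketch assuming the EigenPro-style form written above, with `tau` standing in for the reference eigenvalue $\tilde\lambda_{q+1}$):

```python
def apply_level_q_preconditioner(U_q, lam_q, tau, v):
    """Apply P v = v - sum_i (1 - tau / lam_i) (u_i^T v) u_i using the
    top-q Nystrom eigenpairs (U_q: n x q, lam_q: length q, descending)."""
    coeffs = (1.0 - tau / lam_q) * (U_q.T @ v)
    return v - U_q @ coeffs
```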
Computation is efficient: the Nyström approximation needs only $\ell$ applications of $A$ (or $\ell$ kernel-matrix–vector products), plus $O(n\ell^2)$ arithmetic for post-processing, and storage of $\ell$ vectors of length $n$. Application of $P^{-1}$ during each Krylov iteration requires $O(n\ell)$ arithmetic via the Woodbury identity, avoiding explicit inversion of large matrices (Frangella et al., 2021, Carson et al., 2022, Chu et al., 2024).
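For instance, the resulting $P^{-1}$ can be handed to SciPy's conjugate gradient solver as a `LinearOperator` (a usage sketch assuming `A`, `b`, `mu`, and the helper functions above are in scope):

```python
from scipy.sparse.linalg import LinearOperator, cg

U, lam = nystrom_approximation(A, ell=200)
apply_Pinv = make_nystrom_preconditioner(U, lam, mu)
n = A.shape[0]
A_mu = LinearOperator((n, n), matvec=lambda v: A @ v + mu * v)
M = LinearOperator((n, n), matvec=apply_Pinv)   # acts as an approximation of (A + mu I)^{-1}
x, info = cg(A_mu, b, M=M)                      # info == 0 signals convergence
```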
2. Spectral and Algorithmic Guarantees
The spectral efficacy of the Nyström preconditioner is tied to the effective dimension $d_{\mathrm{eff}}(\mu) = \operatorname{tr}\!\big(A(A + \mu I)^{-1}\big)$. Key theorems establish that, for $\ell \gtrsim d_{\mathrm{eff}}(\mu)$,

$$\kappa\!\left(P^{-1/2}(A + \mu I)\,P^{-1/2}\right) \;\le\; c$$

for a constant $c$ independent of $n$, frequently a small absolute constant once $\ell$ is proportional to $d_{\mathrm{eff}}(\mu)$, so long as the residual spectrum beyond rank $\ell$ is $O(\mu)$ (Frangella et al., 2021, Hong et al., 2024, Chu et al., 2024). This ensures rapid, dimension-independent convergence of PCG.
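The effect of choosing $\ell$ on the order of $d_{\mathrm{eff}}(\mu)$ is easy to check numerically on a small synthetic problem (a verification sketch reusing the helpers above; the specific spectrum and sketch-size rule are illustrative choices, not the prescriptions of the cited papers):

```python
import numpy as np

# Synthetic SPD matrix with polynomially decaying spectrum.
n, mu = 1000, 1e-4
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = 1.0 / np.arange(1, n + 1) ** 2
A = (Q * eigs) @ Q.T

d_eff = np.sum(eigs / (eigs + mu))                    # effective dimension
ell = int(np.ceil(2 * d_eff)) + 1                     # sketch size proportional to d_eff

U, lam = nystrom_approximation(A, ell)
apply_Pinv = make_nystrom_preconditioner(U, lam, mu)
PinvAmu = np.column_stack([apply_Pinv(col) for col in (A + mu * np.eye(n)).T])
evals = np.linalg.eigvals(PinvAmu).real               # real: similar to an SPD matrix
print("kappa(A + mu I)         :", (eigs[0] + mu) / (eigs[-1] + mu))
print("kappa(P^{-1}(A + mu I)) :", evals.max() / evals.min())
```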
Adaptive selection of the sketch/rank parameter $\ell$ is possible: e.g., double the rank until the estimated residual error or the minimum retained eigenvalue drops below a multiple of $\mu$, or use power-iteration error estimators to bound $\|A - \hat{A}_{\mathrm{nys}}\|$ (Frangella et al., 2021). For the two-level Schur complement regime (nested dissection), Nyström–Schur preconditioners can be built so that the effective condition number is dramatically reduced relative to the unpreconditioned Schur complement (Daas et al., 2021).
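The doubling strategy can be sketched in a few lines (an illustrative heuristic built on the helpers above, using the smallest retained Nyström eigenvalue as a proxy for the size of the unresolved tail):

```python
def adaptive_nystrom(A, mu, ell0=16, ell_max=1024, rng=0):
    """Double the sketch size until the smallest retained eigenvalue
    falls below mu, i.e., until the unresolved tail is O(mu)."""
    ell = ell0
    while True:
        U, lam = nystrom_approximation(A, ell, rng=rng)
        if lam[-1] <= mu or ell >= ell_max:
            return U, lam
        ell = min(2 * ell, ell_max)
```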
3. Contexts: PDE Solvers, Kernel Methods, and Optimization
Nyström preconditioners have distinct instantiations across disciplines:
- PDE time discretizations: The block Kronecker structure of the stage equations of implicit Runge–Kutta–Nyström (IRKN) methods for second-order-in-time evolution equations leads to block preconditioners whose coupling block is taken from an LDU-factorization surrogate for the Runge–Kutta Butcher matrix. Such preconditioners are order-optimal: their effectiveness is uniform in timestep and mesh size (Clines et al., 2022).
- Large sparse SPD systems: Nested-dissection/nested Schur complement systems use a Nyström approximation for the (typically dense) Schur block, producing a two-level preconditioner with spectrum tightly clustered about unity. Block Krylov methods are leveraged for batched right-hand sides (Daas et al., 2021).
- Regularized kernel machines and machine learning: Nyström-preconditioned CG or gradient methods hasten convergence for kernel ridge regression, covariance estimation, and other machine learning problems. For level-$q$ spectral preconditioning, the Nyström approximate preconditioner can match the efficacy of exact eigen-preconditioners using only a modest number of sampled columns (scaling with the preconditioner level $q$ rather than with $n$) (Abedsoltan et al., 2023). In block variants, recursive blockwise Nyström preconditioning reduces both computational cost and memory footprint, with rigorous spectral approximation theorems (Garg et al., 21 Jun 2025).
- Interior-point and convex QP: The randomized Nyström preconditioner enables fully matrix-free Newton steps within primal-dual or proximal methods (e.g., Nys-IP-PMM), where each preconditioner is built using only matvecs with the constraint or Hessian matrices (Chu et al., 2024).
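As the interior-point example suggests, the construction never needs an explicit matrix: anything supporting matrix-vector products suffices. A hedged, generic sketch (the synthetic `J` and `delta` are hypothetical stand-ins for whatever the optimization method supplies; this is not the Nys-IP-PMM implementation):

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import LinearOperator, cg

rng = np.random.default_rng(0)
m, n, delta = 500, 2000, 1e-2
J = sparse_random(m, n, density=0.01, format="csr", random_state=0)  # stand-in Jacobian
rhs = rng.standard_normal(n)

# Normal-equations operator J^T J (+ delta I), available only through products.
JtJ = LinearOperator((n, n), matvec=lambda v: J.T @ (J @ v),
                     matmat=lambda V: J.T @ (J @ V))
H = LinearOperator((n, n), matvec=lambda v: JtJ @ v + delta * v)

U, lam = nystrom_approximation(JtJ, ell=100)    # only products with J are ever formed
M = LinearOperator((n, n), matvec=make_nystrom_preconditioner(U, lam, delta))
dx, info = cg(H, rhs, M=M)                      # matrix-free Newton-type step
```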
4. Theoretical and Empirical Performance
Rigorous bounds ensure that the condition number of the preconditioned matrix can be made $O(1)$ as soon as the sketch size targets the effective dimension of the tail spectrum,

$$d_{\mathrm{eff}}(\mu) \;=\; \sum_{i} \frac{\lambda_i(A)}{\lambda_i(A) + \mu},$$

for $\ell \gtrsim d_{\mathrm{eff}}(\mu)$. This yields $O(\log(1/\epsilon))$ CG iterations to reach relative error $\epsilon$, a dramatic reduction from the $O\!\big(\sqrt{\kappa(A + \mu I)}\,\log(1/\epsilon)\big)$ scaling of unpreconditioned CG (Frangella et al., 2021, Hong et al., 2024, Abedsoltan et al., 2023).
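The iteration-count claim follows from the standard PCG convergence bound: with $\kappa = \kappa\big(P^{-1/2}(A+\mu I)P^{-1/2}\big)$,

$$\|x_k - x_\star\|_{A+\mu I} \;\le\; 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^{k} \|x_0 - x_\star\|_{A+\mu I},$$

so holding $\kappa = O(1)$ makes the contraction factor an absolute constant, and $k = O(\log(1/\epsilon))$ iterations suffice for relative error $\epsilon$.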
Empirical results demonstrate substantial reductions in both condition number and iteration counts. For example, in PDE stage systems (e.g., Gauss–Legendre 4-stage IRKN for wave or Klein–Gordon PDEs), condition numbers and GMRES iteration counts are cut by factors of 2–6 while wall times decrease by similar proportions (Clines et al., 2022). In kernel regression and data analysis, preconditioned CG with modest sketch sizes achieves 5–10× speedups over direct or standard randomized methods (Frangella et al., 2021, Abedsoltan et al., 2023). Multi-level and block recursive variants further reduce wall time for heavy-tailed spectra and high-rank regimes (Garg et al., 21 Jun 2025).
5. Extensions: Block, Mixed-Precision, and Two-Level Preconditioners
Several advanced Nyström preconditioning strategies have been proposed:
- Block-Nyström preconditioners recursively partition the landmark set, forming several small low-rank surrogates and aggregating them. For spectral tails with heavy decay, this reduces both computation and memory while maintaining spectral guarantees (Garg et al., 21 Jun 2025).
- Mixed-precision Nyström preconditioning: Single-pass, mixed-precision algorithms compute the (dominant) matrix–sketch product at low precision and carry out subsequent steps in higher precision; a minimal illustration follows this list. The rounding-error analysis provides prescriptive criteria for selecting the sketch size and allowable precisions without degrading the condition number (Carson et al., 2022).
- Two-level (Schur complement) and factorized Nyström preconditioners: Hierarchical solvers use a Nyström approximation for the Schur complement or for factorizations within adaptive/structured preconditioners, enabling scalability beyond routine problem sizes. Adaptive variants balance the cost/accuracy tradeoff according to empirical or spectral diagnostics (Daas et al., 2021, Zhao et al., 2023).
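A minimal illustration of the mixed-precision idea (a hedged sketch assuming a dense array `A`: the dominant $n \times \ell$ sketch product is formed in float32 and the small post-processing in float64; the precision-selection criteria of Carson et al. are not reproduced here):

```python
import numpy as np

def nystrom_mixed_precision(A, ell, rng=None):
    """Single-pass Nystrom sketch with the A @ Omega product in float32
    and all O(n ell^2) post-processing in float64."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    Omega = np.linalg.qr(rng.standard_normal((n, ell)).astype(np.float32))[0]
    Y = (A.astype(np.float32) @ Omega).astype(np.float64)    # low-precision pass over A
    Omega = Omega.astype(np.float64)
    nu = np.finfo(np.float32).eps * np.linalg.norm(Y, "fro") # shift at the fp32 noise level
    Y_nu = Y + nu * Omega
    C = np.linalg.cholesky(Omega.T @ Y_nu)
    B = np.linalg.solve(C, Y_nu.T).T
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    return U, np.maximum(s**2 - nu, 0.0)
```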
6. Applications and Practical Considerations
Nyström preconditioners are especially advantageous in settings where the matrix is never formed explicitly or is too large for direct factorization. They are used extensively in:
- Model-based iterative reconstruction in imaging and tomography, where forward operators are only available as oracles and operator-only access algorithms are necessary (Hong et al., 2024).
- Regularized kernel learning, where only a small sample of columns suffices to approximate the spectrum (Abedsoltan et al., 2023).
- Large-scale quadratic programming, robust optimization, and Hessian-based solvers, especially in interior-point and iterative-proximal frameworks (Chu et al., 2024).
- All-at-once time discretizations for parabolic/hyperbolic PDEs, using block Kronecker and Toeplitz-like algebraic structure (Clines et al., 2022, Liu et al., 2021).
Tuning parameters such as the sketch size $\ell$, the regularization $\mu$, and the landmark subset selection are guided by effective-dimension estimates or adaptive error estimators. When used with GPU or multi-core CPU architectures, on-the-fly Nyström preconditioner computation can be performed in each solver iteration, supporting dynamic or per-iteration reweighting (Hong et al., 2024).
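When the spectrum is unavailable, the effective dimension itself can be estimated with a handful of matvecs and inner solves (a rough Hutchinson-type sketch; in practice the solves could reuse the current preconditioner):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def estimate_d_eff(A, mu, num_probes=10, rng=None):
    """Stochastic estimate of d_eff(mu) = tr(A (A + mu I)^{-1})."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    A_mu = LinearOperator((n, n), matvec=lambda v: A @ v + mu * v)
    estimate = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)     # Rademacher probe vector
        w, _ = cg(A_mu, A @ z)                  # w = (A + mu I)^{-1} A z
        estimate += z @ w
    return estimate / num_probes
```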
7. Limitations and Open Directions
Nyström preconditioners are fundamentally controlled by the spectral decay of the underlying operator. For matrices with slowly decaying or flat spectra (i.e., large numerical rank), the sketch size and associated computational costs can rise substantially (Garg et al., 21 Jun 2025, Zhao et al., 2023). Extensions such as adaptive factorized or block recursive variants are designed to address these scenarios.
A key limitation in some settings is that positive definiteness (or at least the absence of significant negative or indefinite spectrum) is often assumed, and certain formulations neglect advective or non-self-adjoint terms (Clines et al., 2022). Further development of robust, spectrum-adaptive Nyström preconditioners—potentially leveraging structured matrix classes, parallel block solve strategies, or non-L2 divergence-based approximations—remains an open area of research.
References:
- "Efficient Order-Optimal Preconditioners for Implicit Runge-Kutta and Runge-Kutta-Nyström Methods Applicable to a Large Class of Parabolic and Hyperbolic PDEs" (Clines et al., 2022)
- "Randomized Nyström Preconditioning" (Frangella et al., 2021)
- "Faster Low-Rank Approximation and Kernel Ridge Regression via the Block-Nyström Method" (Garg et al., 21 Jun 2025)
- "Preconditioner Design via the Bregman Divergence" (Bock et al., 2023)
- "On Adapting Randomized Nyström Preconditioners to Accelerate Variational Image Reconstruction" (Hong et al., 2024)
- "Two-level Nyström--Schur preconditioner for sparse symmetric positive definite matrices" (Daas et al., 2021)
- "Preconditioning without a preconditioner: faster ridge-regression and Gaussian sampling with randomized block Krylov subspace methods" (Chen et al., 30 Jan 2025)
- "Single-pass Nyström approximation in mixed precision" (Carson et al., 2022)
- "Parallel-in-time preconditioners for the Sinc-Nyström method" (Liu et al., 2021)
- "An Adaptive Factorized Nyström Preconditioner for Regularized Kernel Matrices" (Zhao et al., 2023)
- "On the Nystrom Approximation for Preconditioning in Kernel Machines" (Abedsoltan et al., 2023)
- "Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning" (Dereziński et al., 2024)
- "Randomized Nyström Preconditioned Interior Point-Proximal Method of Multipliers" (Chu et al., 2024)