Stochastic Lanczos Quadrature (SLQ)
- Stochastic Lanczos Quadrature (SLQ) is a randomized, matrix-free method that estimates spectral sums like log-determinants by combining random probing with Lanczos quadrature.
- It employs short Krylov subspace recurrences to generate accurate quadrature nodes and weights, reducing the need for full matrix eigendecomposition.
- Enhancements such as block-probe and variance reduction techniques extend SLQ’s practical application in scalable machine learning and spectral density estimation.
Stochastic Lanczos Quadrature (SLQ) is a randomized, matrix-free algorithm for approximating spectral sums and the spectral density of large Hermitian matrices, specifically targeting trace expressions of the form $\operatorname{tr} f(A) = \sum_{i=1}^{n} f(\lambda_i(A))$, where $A \in \mathbb{C}^{n \times n}$ is typically accessed only through matrix–vector products. In SLQ, randomized trace estimators are combined with numerical quadrature rules derived from short Krylov subspace (Lanczos) recurrences, producing accurate, high-confidence approximations for quantities such as the log-determinant, spectral measures, and spectral densities—without explicit formation or eigendecomposition of $A$.
1. Algorithmic Framework and Computational Core
The SLQ method combines stochastic trace estimation with Gaussian quadrature arising from the Lanczos process. For a given Hermitian $A$ and analytic $f$, one seeks to approximate $\operatorname{tr} f(A)$ efficiently:
- Random Probing: $n_v$ independent random probe vectors $v_1, \dots, v_{n_v}$ (unit-norm, often Rademacher or Gaussian) are generated. For any such unit vector, the quadratic form $v^* f(A)\, v = \int f(\lambda)\, d\mu_v(\lambda)$ is a Riemann–Stieltjes integral against the spectral measure $\mu_v$ of $A$ weighted by $v$, and for isotropic unit-norm probes $\mathbb{E}\big[n\, v^* f(A)\, v\big] = \operatorname{tr} f(A)$.
- Lanczos Quadrature: For each probe $v_i$, run $m$ steps of Lanczos starting from $v_i$ to build the tridiagonal $T_m^{(i)} \in \mathbb{R}^{m \times m}$. Diagonalizing $T_m^{(i)} = U \Theta U^*$ yields quadrature nodes $\theta_k$ (the Ritz values) and weights $\tau_k^2 = |U_{1k}|^2$ (with $U$ the eigenvectors of $T_m^{(i)}$).
- Spectral Trace Approximation: The $i$-th probe estimates $v_i^* f(A)\, v_i \approx \sum_{k=1}^{m} \tau_k^2 f(\theta_k)$. The overall SLQ estimate is then
$$\operatorname{tr} f(A) \;\approx\; \frac{n}{n_v} \sum_{i=1}^{n_v} \sum_{k=1}^{m} \big(\tau_k^{(i)}\big)^2 f\big(\theta_k^{(i)}\big).$$
For the log-determinant, simply set $f(\lambda) = \log \lambda$, since $\log \det A = \operatorname{tr} \log A$ (Li et al., 2023, Chen et al., 2022).
The cost is dominated by $n_v$ sets of $m$ matrix–vector multiplications, making the total MVM count $n_v \cdot m$, with $n_v m \ll n$ in most practical regimes.
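The three steps above can be sketched in a few dozen lines of NumPy. This is an illustrative sketch, not a reference implementation: the helper names (`lanczos`, `slq_trace`) and all parameter values are assumptions, and a production code would add convergence checks and more careful reorthogonalization.

```python
import numpy as np

def lanczos(matvec, v0, m):
    """m-step Lanczos from v0; returns the tridiagonal coefficients (alpha, beta)."""
    n = v0.size
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    Q[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = matvec(Q[:, j])
        alpha[j] = Q[:, j] @ w
        # Orthogonalize against all previous basis vectors (full reorthogonalization).
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return alpha, beta

def slq_trace(matvec, n, f, n_v, m, seed=0):
    """SLQ estimate of tr f(A) using n_v unit-norm Gaussian probes and m Lanczos steps."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_v):
        v = rng.standard_normal(n)
        alpha, beta = lanczos(matvec, v, m)
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        theta, U = np.linalg.eigh(T)     # quadrature nodes (Ritz values) ...
        tau2 = U[0, :] ** 2              # ... and weights
        total += n * (tau2 @ f(theta))   # n * quadrature estimate of v^T f(A) v
    return total / n_v

# Demo: log-determinant of a well-conditioned SPD matrix (f = log).
rng = np.random.default_rng(1)
n = 200
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = rng.uniform(1.0, 10.0, size=n)
A = (V * lam) @ V.T
est = slq_trace(lambda x: A @ x, n, np.log, n_v=50, m=15)
exact = np.sum(np.log(lam))              # = log det A
```

Since unit-norm isotropic probes satisfy $n\,\mathbb{E}[v^* f(A) v] = \operatorname{tr} f(A)$, the estimator is unbiased up to quadrature error; on this well-conditioned example the estimate typically lands within a few percent of the exact log-determinant.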
2. Error Analysis, Node Asymmetry, and Complexity Guarantees
Error in SLQ arises from two distinct sources: quadrature (Lanczos) error and stochastic (Monte Carlo probe) error.
- Quadrature Error for Asymmetric Nodes: The error bound for $m$-point Lanczos quadrature with potentially asymmetric nodes is
$$\Big| v^* f(A)\, v - \sum_{k=1}^{m} \tau_k^2 f(\theta_k) \Big| \;\le\; \frac{4\, M(\rho)}{(\rho - 1)\, \rho^{2m-1}},$$
where $M(\rho) = \max_{z \in E_\rho} |f(z)|$, and $E_\rho$ is the Bernstein ellipse containing the spectrum of $A$ (Li et al., 2023).
- Combined Probabilistic SLQ Bounds: For $A$ with spectrum in $[\lambda_{\min}, \lambda_{\max}]$, fixing accuracy $\epsilon$ and failure probability $\delta$, explicit formulas for the Lanczos degree $m$ and probe count $n_v$ provide
$$\Pr\Big[\big|\widehat{\operatorname{tr}}\, f(A) - \operatorname{tr} f(A)\big| > \epsilon\, \big|\operatorname{tr} f(A)\big|\Big] \;\le\; \delta,$$
with $m = O(\log(1/\epsilon))$ and $n_v = O(\epsilon^{-2}\log(1/\delta))$, when using the optimal split of the error budget between quadrature and stochastic terms (Li et al., 2023, Chen et al., 2022, Chen et al., 2021).
- Symmetric vs. Asymmetric Nodes: Classical analyses assumed symmetry of the Lanczos quadrature nodes (valid for constant-diagonal tridiagonals), yielding slightly tighter bounds. However, this property fails generically—realistic $T_m$ typically yields asymmetric Ritz values—and worst-case guarantees must use the more general asymmetric-node rate. Asymmetric bounds have a denominator $(\rho - 1)$ versus $(\rho^2 - 1)$ for the symmetric case, resulting in slightly looser but universally applicable guarantees (Li et al., 2023).
- Complexity: For analytic $f$ and fixed $(\epsilon, \delta)$, the total number of mat–vecs is typically $n_v \cdot m = O\big(\epsilon^{-2}\log(1/\delta)\log(1/\epsilon)\big)$, which makes the overall cost near-linear in $n$ since each mat–vec costs $O(\mathrm{nnz}(A))$ and both $n_v$ and $m$ are independent of $n$ for practical values of $(\epsilon, \delta)$. This achieves high-probability accuracy with respect to both absolute and relative error criteria (Li et al., 2023, Chen et al., 2021).
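The exponential-in-$m$ quadrature rate is easy to observe numerically. The sketch below (illustrative, not code from the cited papers) compares $m$-point Lanczos quadrature for $f = \exp$ against the exact quadratic form on a dense test matrix:

```python
import numpy as np

def lanczos_tridiag(A, v, m):
    """m-step Lanczos with full reorthogonalization; returns the m x m tridiagonal T."""
    n = v.size
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        if j < m - 1:
            beta[j] = np.linalg.norm(w)
            Q[:, j + 1] = w / beta[j]
    return np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

# Test matrix with spectrum spread over [0, 5].
rng = np.random.default_rng(0)
n = 300
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.linspace(0.0, 5.0, n)
A = (U * lam) @ U.T
v = rng.standard_normal(n)
v /= np.linalg.norm(v)
exact = v @ ((U * np.exp(lam)) @ (U.T @ v))   # v^T exp(A) v via eigendecomposition

errors = []
for m in (2, 4, 8, 12):
    theta, W = np.linalg.eigh(lanczos_tridiag(A, v, m))
    quad = (W[0, :] ** 2) @ np.exp(theta)     # m-point Gauss quadrature value
    errors.append(abs(quad - exact))
# errors shrink roughly geometrically as m grows
```

Doubling $m$ repeatedly drives the quadrature error toward machine precision long before $m$ approaches $n$, which is exactly the regime the complexity statements above rely on.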
3. Extensions, Deflation, and Variance Reduction
Several recent advances refine the SLQ methodology:
- Block-Probe SLQ / BOLT: Extending SLQ to use orthonormal block-probes (the BOLT algorithm), block Lanczos iteration, and block quadrature increases efficiency, especially in near-flat-spectrum regimes. For a fixed total MVM budget, block SLQ attains strictly lower estimator error than classical SLQ, matching the optimal rate for unbiased trace estimation (Yeon et al., 18 May 2025).
- Variance-Reduced SLQ: By combining PCPS-style projection subspaces and Hutchinson estimators on the residual, one can decrease stochastic variance and accelerate convergence, especially for log-determinant estimation (Han et al., 2023).
- Adaptive One-Probe and "Log-Det-ective" Strategies: Applying Nyström or similar low-rank preconditioners before SLQ can, in regimes of rapid spectral decay, enable high-accuracy log-determinant estimation with a single Gaussian probe, with variance bounded by the tail of the spectrum. Adaptive algorithms can cheaply certify when more SLQ probes are justified (Cortinovis et al., 9 Jan 2026).
- Implicit Deflation via Krylov Subspaces: Even without explicit eigenpair removal, the single-vector Krylov subspace generated in SLQ aligns rapidly with dominant eigenspaces, achieving an implicit low-rank approximation and yielding error bounds in Wasserstein distance that scale with the singular-value tail beyond the dominant eigenspace, provided the Lanczos degree is large enough to resolve the dominant eigenpairs (Bhattacharjee et al., 2024).
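One way to see the projection-plus-residual idea behind variance reduction (a Hutch++-style sketch in spirit, not the exact PCPS construction of Han et al., 2023) is the following toy code, where $f(A)$ is formed densely purely for clarity; a matrix-free version would evaluate each quadratic form by Lanczos quadrature:

```python
import numpy as np

def projected_trace_estimate(fA, k, n_v, seed=0):
    """tr f(A) = exact trace on a sketched subspace + Hutchinson on the residual.
    fA is f(A) as a dense matrix (for illustration only)."""
    n = fA.shape[0]
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(fA @ S)        # orthonormal basis capturing the large eigenvalues
    head = np.trace(Q.T @ fA @ Q)      # deterministic part: trace restricted to range(Q)
    G = rng.standard_normal((n, n_v))
    G -= Q @ (Q.T @ G)                 # deflate probes against the captured subspace
    tail = np.mean(np.einsum('ij,ij->j', G, fA @ G))   # Hutchinson on the residual
    return head + tail

# Demo: f(x) = x^2 on a matrix with geometrically decaying spectrum.
rng = np.random.default_rng(1)
n = 300
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
lam = np.exp(-np.arange(n) / 20.0)
A = (V * lam) @ V.T
fA = A @ A                             # f(A) for f(x) = x^2
exact = np.sum(lam ** 2)
est = projected_trace_estimate(fA, k=20, n_v=20)
```

The estimator stays unbiased (the deflated Hutchinson term has expectation exactly $\operatorname{tr} f(A) - \operatorname{tr}(Q^* f(A) Q)$), while its variance is governed only by the residual spectrum, which is small under fast eigenvalue decay.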
4. Spectral Measure, Spectral Density, and Applications
SLQ provides not just single spectral traces, but strong guarantees, uniform over the spectrum, for the cumulative empirical spectral measure (CESM) and for spectral densities:
- Spectral Measure Approximation: SLQ rapidly approximates the CESM, with Wasserstein error scaling as $O(1/m)$ for $m$ Lanczos steps, achieving probabilistic control over the entire spectral distribution (Chen et al., 2021).
- Spectral Density: For applications demanding the density of states (DOS), SLQ's direct sum-of-delta-masses output can be convolved to obtain a smooth spectral density, with empirical performance surpassing Chebyshev-based kernel polynomial methods (KPM) in Wasserstein distance under rapid spectral decay or presence of spikes/gaps in spectrum (Bhattacharjee et al., 2024, Chen et al., 2022).
- Log Determinant and Trace Estimation: SLQ is widely used in machine learning and statistics for scalable approximation of $\log \det A$ for kernel matrices, computation of Kullback–Leibler divergence, free energy, and Hessian spectrum analysis.
- Proxy KL and Wasserstein Estimators under Partial Access: Subblock SLQ and BOLT approaches can yield unbiased estimators for the KL divergence and Wasserstein distance between Gaussians, even with only partial access to $A$ (e.g., reading principal minors) (Yeon et al., 18 May 2025).
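The per-probe output of SLQ is the discrete measure $\sum_k \tau_k^2\, \delta_{\theta_k}$; a smooth density-of-states estimate follows by convolving these point masses with a narrow Gaussian, as mentioned above. A minimal sketch (the bandwidth `sigma`, the grid, and the synthetic nodes/weights are all illustrative choices):

```python
import numpy as np

def smoothed_dos(nodes, weights, grid, sigma):
    """Convolve the delta masses sum_k weights[k] * delta(nodes[k]) with a Gaussian kernel."""
    diff = grid[:, None] - nodes[None, :]
    kernel = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return kernel @ weights        # density evaluated on the grid

# Synthetic nodes/weights standing in for aggregated SLQ output
# (in SLQ these would be Ritz values theta_k with weights n * tau_k^2 / n_v).
nodes = np.array([-0.8, -0.2, 0.1, 0.7])
weights = np.array([0.3, 0.2, 0.4, 0.1])   # total mass 1
grid = np.linspace(-2.0, 2.0, 801)
density = smoothed_dos(nodes, weights, grid, sigma=0.1)
```

Because convolution with a normalized kernel preserves total mass, the smoothed density integrates to the same value as the discrete measure, which gives a cheap sanity check on any implementation.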
5. Comparisons with Kernel Polynomial and Related Methods
SLQ and kernel polynomial methods (KPM) are leading randomized matrix-free quadrature techniques, but exhibit distinct practical properties:
- Convergence Rate: SLQ (Gauss quadrature) and KPM (Jackson-damped Chebyshev) both achieve exponential decay in the polynomial degree $m$ for analytic $f$, but SLQ's adaptive node placement enables much faster convergence for spectra with large clusters or gaps.
- Spectral Adaptivity: SLQ's quadrature nodes adapt automatically to the spectrum, concentrating quadrature near eigenclusters and away from gaps or outliers, whereas KPM distributes nodes uniformly after rescaling and requires explicit tuning for concentrated spectra.
- Implementation Complexity: SLQ's implementation requires only Lanczos and small matrix diagonalizations per probe, whereas KPM necessitates Chebyshev moment computation and post-processing for density smoothing.
- Recommended Use: SLQ is generally advantageous when only a small number of spectral traces are needed and accuracy is prioritized, while KPM remains competitive when smooth spectral densities are required over large intervals (Chen et al., 2022).
- Empirical Performance: On a range of spectra (uniform, gapped, low-rank, real-world graphs), SLQ and its block/projection variants achieve or exceed the performance of KPM and classical randomized SVD trace estimators, with optimal scaling for spectral sum accuracy (Yeon et al., 18 May 2025, Han et al., 2023, Bhattacharjee et al., 2024).
6. Parameter Selection and Practical Guidelines
Parameter choice in SLQ centers on the number of probes $n_v$ and the Lanczos degree $m$:
- Error Tolerance: For analytic $f$, achieve $\epsilon$-accuracy with probability $1 - \delta$ using $m = O(\log(1/\epsilon))$ and $n_v = O(\epsilon^{-2}\log(1/\delta))$.
- Spectral Range Estimation: Estimate $\lambda_{\min}$ and $\lambda_{\max}$ (via power or Lanczos iteration) to compute the quadrature ellipse parameters required by the error bounds.
- Optimal Error Budget Split: Utilize non-uniform allocation of the error budget between quadrature and stochastic errors to minimize total MVMs, solving the transcendental minimization for $(m, n_v)$ when necessary (Li et al., 2023).
- Block- and Subblock-Extensions: Use block probes and subblock schemes for highly parallelizable, unbiased trace estimation under memory or access constraints, or to obtain monotonic convergence in partial-access regimes (Yeon et al., 18 May 2025, Bhattacharjee et al., 2024).
- Preconditioning and Projection: For matrices with fast singular value decay, preconditioned SLQ—leveraging low-rank sketch-based or Nyström approximations—allows drastic probe reduction and high accuracy, particularly for log-determinant problems (Cortinovis et al., 9 Jan 2026, Han et al., 2023).
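These guidelines can be collected into a small helper. The constants below are schematic stand-ins with the right shape (exponential quadrature decay in $m$, $O(\epsilon^{-2}\log(1/\delta))$ probes), not the sharp values from the cited analyses, and the function name is hypothetical:

```python
import numpy as np

def slq_parameters(eps, delta, kappa):
    """Illustrative choice of Lanczos degree m and probe count n_v for f = log
    on an SPD matrix with condition number kappa; error budget split evenly.
    Constants are schematic, not the sharp ones from the literature."""
    # Bernstein-ellipse radius for a spectrum in [lambda_min, lambda_max].
    rho = (np.sqrt(kappa) + 1.0) / (np.sqrt(kappa) - 1.0)
    # Quadrature error decays like rho^{-2m}: pick m so the bound falls below eps/2.
    m = int(np.ceil(np.log(8.0 / eps) / (2.0 * np.log(rho))))
    # Monte Carlo error: O(eps^{-2} log(1/delta)) probes.
    n_v = int(np.ceil((24.0 / eps ** 2) * np.log(2.0 / delta)))
    return m, n_v

m1, nv1 = slq_parameters(eps=0.1, delta=0.01, kappa=100.0)
m2, nv2 = slq_parameters(eps=0.01, delta=0.01, kappa=100.0)
```

Note the asymmetry the theory predicts: tightening $\epsilon$ grows $m$ only logarithmically but grows $n_v$ quadratically, which is why the stochastic term usually dominates the budget.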
7. Numerical Results and Empirical Observations
Empirical studies illustrate sharp convergence of SLQ and its variants:
- Spectral Convergence: Error in approximating DOS, CESM, and trace decays rapidly with Lanczos degree; presence of spectral gaps accelerates convergence.
- One-sample Preconditioned SLQ: For many SPD matrices with moderate spectral decay, a single Gaussian probe combined with Nyström preconditioning suffices to yield accurate log-determinant estimates with negligible bias (Cortinovis et al., 9 Jan 2026).
- Variance Reduction and Subspace Approaches: Projection-based variants offer 30–50% MVM savings over vanilla SLQ, with conservative but reliable probabilistic error estimates (Han et al., 2023).
- Block- and Subblock Variants for Partial-Access: BOLT and subblock SLQ approaches maintain unbiasedness and accuracy, outperforming Hutch++ and traditional methods in flat-spectrum or principal submatrix-only regimes (Yeon et al., 18 May 2025).
- Superiority over KPM: On standard test cases (e.g., Heisenberg spin chain, kernel matrices), SLQ avoids typical KPM artifacts (Gibbs ripples), and VR-SLQ achieves leading-order error with negligible weight-variance corrections (Bhattacharjee et al., 2024, Chen et al., 2022).
In summary, SLQ and its modern extensions form a principled, flexible, and highly efficient toolkit for spectral sum and density approximation of large Hermitian matrices in both full and partial-access computational environments, supported by rigorous probabilistic error guarantees and strong empirical validation (Li et al., 2023, Chen et al., 2022, Bhattacharjee et al., 2024, Yeon et al., 18 May 2025, Cortinovis et al., 9 Jan 2026, Han et al., 2023, Chen et al., 2021).