Papers
Topics
Authors
Recent
Search
2000 character limit reached

Block-Orthonormal Lanczos Quadrature (BOLT)

Updated 22 June 2026
  • BOLT is a Krylov-based method that generalizes scalar Lanczos quadrature to block settings, enabling efficient approximation of spectral functionals.
  • It utilizes block Gauss and Gauss-Radau quadrature with proven monotonic error bounds for robust trace and transfer function estimation.
  • The method supports stochastic trace estimation and adaptive low-rank corrections, making it ideal for applications in PDEs, kernel methods, and covariance analysis.

Block-Orthonormal Lanczos Quadrature (BOLT) is a Krylov-based methodology for approximating quantities of the form BTϕ(A)BB^T \phi(A) B and related spectral functionals, where AA is a large symmetric positive definite (s.p.d.) matrix, BB is a tall matrix of orthonormal columns, and ϕ\phi is a matrix function analytic near the spectrum of AA. BOLT generalizes the classical (scalar) orthonormal Lanczos quadrature to the block setting, enabling efficient, robust, and highly accurate computation of trace estimates, transfer functions, and related quantities in large-scale settings such as MIMO PDE discretizations, kernel methods, and covariance estimation. The method unifies block Krylov Lanczos recurrences, block Gauss- and Gauss-Radau quadrature, and Stieltjes/Hermite–Padé approximation theory, supporting error analysis, extrapolation, and extensions for partial-matrix access and stochastic estimation.

1. Block Lanczos Framework and S-Fraction Representation

BOLT leverages the construction of a block Krylov subspace

Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}

where ARn×nA \in \mathbb{R}^{n\times n} is s.p.d. and BRn×pB \in \mathbb{R}^{n\times p} has full column rank (pnp \ll n) and BTB=IpB^TB = I_p by assumption. An orthonormal block basis AA0 (AA1) is generated by a three-term block Lanczos recurrence:

AA2

with symmetric block tridiagonal matrix AA3 capturing the projected action of AA4:

AA5

This yields a block-tridiagonal pencil AA6 for shifted operators.

The block-Lanczos process provides a surrogate for quantities such as

AA7

via a matrix S-fraction (Stieltjes continued fraction) in terms of AA8:

AA9

where BB0 selects the first BB1 rows. This S-fraction encapsulates the exactness of the first BB2 block moments of the spectral measure associated with BB3 (Zimmerling et al., 2024).

2. Block Gauss and Gauss-Radau Quadrature, Monotonicity, and Error Bounds

BOLT achieves high fidelity through the use of block Gauss and block Gauss-Radau quadrature rules. The block Gauss-Lanczos estimate is

BB4

while block Gauss-Radau augments the last diagonal block to interpolate at a prescribed endpoint (typically zero), yielding a modified block

BB5

and corresponding BB6. The rule is

BB7

A fundamental property is the Loewner monotonicity:

BB8

for BB9, allowing a two-sided, fully computable a-posteriori error bound:

ϕ\phi0

A norm bound,

ϕ\phi1

is available, with additional bounds in terms of Ritz values and extremal eigenvalues (Zimmerling et al., 2024, Xu et al., 2022).

3. Hermite–Padé and Kreĭn–Nudelman Extensions for Dense Spectra

In regimes where ϕ\phi2 approximates a continuous-spectrum operator (e.g., PDEs on unbounded domains), classical block Gaussian quadrature may exhibit only linear convergence in ϕ\phi3 due to discrete approximation stagnation. BOLT introduces a Kreĭn–Nudelman-type adaptive low-rank modification to the block-Lanczos matrix, parameterized by ϕ\phi4, in the spirit of Hermite–Padé approximants for branch-cut functions:

ϕ\phi5

modifies the final denominator of the S-fraction. The resulting quadrature

ϕ\phi6

with

ϕ\phi7

restores exponential convergence under analyticity away from the branch cut; i.e.,

ϕ\phi8

with ϕ\phi9 depending on AA0 and the spectral interval (Druskin et al., 9 Apr 2025). Limits recover Gauss and Gauss–Radau as AA1 and AA2 respectively. The method thereby emulates an “absorbing” boundary, pushing spurious poles away and dramatically reducing transfer-function errors for dense spectral problems.

4. Stochastic BOLT, Trace Estimation, and Access-Constrained Variants

BOLT generalizes to unbiased stochastic estimation of traces of matrix functions via block-orthonormal random probes:

  • Draw AA3 (e.g., Rademacher or Gaussian), thin-QR to obtain AA4, then execute AA5 block Lanczos steps and form AA6.
  • The estimator

AA7

where AA8, is unbiased for AA9, with variance Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}0 over Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}1 independent probes (Yeon et al., 18 May 2025).

Block-orthonormality (“self-averaging”) yields Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}2 convergence in matvecs, robust even in flat-spectrum regimes where methods like Hutch++ degrade to Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}3. The method extends to Subblock SLQ for situations with only partial (submatrix) access to Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}4, guaranteeing unbiasedness and enabling localization for polynomial filters (Yeon et al., 18 May 2025).

Applications include:

  • Log-determinant and matrix norm estimation,
  • Distributional divergence proxies (KL, Wasserstein-2) between possibly singular or partially observed covariances,
  • Regularization in statistical learning.

5. Implementation, FLOP Counts, and Practical Considerations

Each BOLT iteration requires Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}5 time for the sparse matrix–block product and Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}6 for orthogonalization. Total cost for Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}7 Lanczos steps is Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}8 and storage cost is Km(A,B)=span{B,AB,A2B,,Am1B}\mathcal{K}_m(A, B) = \mathrm{span}\{B, AB, A^2B, \ldots, A^{m-1}B\}9 (or just ARn×nA \in \mathbb{R}^{n\times n}0 for the tridiagonal pencil under short-recurrence strategies).

On the small block-tridiagonal matrix ARn×nA \in \mathbb{R}^{n\times n}1 (or shifted variant), evaluation of matrix functions, spectral decompositions, or block system solves is ARn×nA \in \mathbb{R}^{n\times n}2 per shift, but ARn×nA \in \mathbb{R}^{n\times n}3, and structure can be exploited. For trace estimation, diagonalization of the ARn×nA \in \mathbb{R}^{n\times n}4 tridiagonal is dominant, but block size accelerates convergence and enables tradeoffs between memory and matvecs (Yeon et al., 18 May 2025).

Practical guidance includes ARn×nA \in \mathbb{R}^{n\times n}5 for block size, monitoring ARn×nA \in \mathbb{R}^{n\times n}6 for breakdown/deflation, and balancing spectral clustering with per-iteration cost (Xu et al., 2022, Zimmerling et al., 2024). Averaging of Gauss and Gauss–Radau approximants, or log-averaging in MIMO settings, further accelerates error decay at no additional matrix–vector product cost (Zimmerling et al., 2024).

6. Numerical Validation and Applications

BOLT demonstrates strong empirical performance on canonical large-scale discretizations:

  • Heat-diffusion PDEs: e.g., for ARn×nA \in \mathbb{R}^{n\times n}7 (block size ARn×nA \in \mathbb{R}^{n\times n}8) on a ARn×nA \in \mathbb{R}^{n\times n}9 grid, BOLT achieves error BRn×pB \in \mathbb{R}^{n\times p}0 at BRn×pB \in \mathbb{R}^{n\times p}1 where block-Gauss plateaus at BRn×pB \in \mathbb{R}^{n\times p}2 (Druskin et al., 9 Apr 2025, Zimmerling et al., 2024).
  • Maxwell’s equations: for BRn×pB \in \mathbb{R}^{n\times p}3, BRn×pB \in \mathbb{R}^{n\times p}4, BOLT yields BRn×pB \in \mathbb{R}^{n\times p}5 smaller error than Gauss or Gauss–Radau, with preserved monotonicity and faster error decay (up to two orders of magnitude improvement) (Druskin et al., 9 Apr 2025).
  • Graph Laplacians: accurate estimation is validated in SNAP network experiments.

In all benchmarks, the a-posteriori error bounds are sharp, extrapolation via averaging yields acceleration by factors of BRn×pB \in \mathbb{R}^{n\times p}6–BRn×pB \in \mathbb{R}^{n\times p}7, and block enrichment further reduces iterations at minor extra cost (Zimmerling et al., 2024, Druskin et al., 9 Apr 2025).

7. Theoretical and Methodological Significance

BOLT unifies and generalizes block Krylov methods, numerical quadrature, and stochastic trace estimation, with rigorous error control, provable unbiasedness, and robust performance in both clustered and continuous spectra regimes. By incorporating Hermite–Padé corrections and block-orthonormal random probing, the method avoids the convergence pathologies of traditional Krylov, Hutchinson, and sketching methods, and flexibly adapts to hardware and access constraints.

BOLT is supported by theoretical analysis of convergence rates, monotonicity, a-posteriori error, and by potential-theoretic acceleration, with applications ranging from sparse PDEs and operator discretizations to high-dimensional statistics and machine learning (Druskin et al., 9 Apr 2025, Zimmerling et al., 2024, Yeon et al., 18 May 2025, Xu et al., 2022).

Topic to Video (Beta)

No one has generated a video about this topic yet.

Whiteboard

No one has generated a whiteboard explanation for this topic yet.

Follow Topic

Get notified by email when new papers are published related to Block-Orthonormal Lanczos Quadrature (BOLT).