Optimal Stochastic Krylov based Techniques for Large- Scale Log-Determinant Estimation

Published 5 Jun 2026 in math.NA | (2606.07004v1)

Abstract: Estimating the logarithm of the determinant of large sparse positive definite symmetric matrices is an important task in numerical linear algebra, machine learning, Gaussian processes, and uncertainty quantification. In this work, we introduce two scalable and efficient methods for large-scale log-determinant termed the Optimal Stochastic Arnoldi with Incomplete Orthogonalization Procedure (OSA-IOP) and the Optimal Stochastic Lanczos Quadrature (OSLQ). The OSA-IOP approach extends the Incomplete Orthogonalization Procedure (IOP), originally developed for matrix exponential functions for exponential time stepping integrators, to compute the action of the matrix algorithm on a vector. We observe that combining IOP with a randomized Hutch++ algorithm, the OSA-IOP significantly reduces computational cost while maintaining high accuracy. The OSLQ method estimates log-determinants by coupling Lanczos quadrature with Hutch++ and controlled orthogonalization, leveraging Krylov subspaces as efficient quadrature mechanisms to approximate quadratic forms involving the matrix logarithm. We derive error bounds for both methods. Extensive numerical experiments on large-scale sparse matrices from real-world applications demonstrate the accuracy, robustness, and scalability of the proposed approaches.

Abstract PDF Upgrade to Chat

Authors (3)

Summary

The paper presents two novel algorithms, OSA-IOP and OSLQ, that integrate incomplete orthogonalization with stochastic trace estimation for efficient log‐determinant computation.
It leverages Krylov subspace projections and advanced variance reduction methods to reduce sample complexity from O(1/ε²) to O(1/ε), significantly accelerating the estimation process.
Numerical experiments demonstrate lower relative errors and reduced computational costs on large-scale sparse SPD matrices compared to traditional methods.

Optimal Stochastic Krylov-Based Techniques for Large-Scale Log-Determinant Estimation

Introduction

Efficient log-determinant estimation of large-scale sparse symmetric positive definite (SPD) matrices is central to a range of computational problems, especially in ML, uncertainty quantification (UQ), and Gaussian process (GP) modeling. The computational bottleneck stems from the inaccessibility of direct factorization approaches (such as Cholesky) in high dimensions, thus necessitating scalable, matrix-free methods focused on stochastic trace approximation and Krylov subspace projection. The paper "Optimal Stochastic Krylov based Techniques for Large- Scale Log-Determinant Estimation" (2606.07004) builds upon these motifs and introduces two new methods: Optimal Stochastic Arnoldi with Incomplete Orthogonalization Procedure (OSA-IOP) and Optimal Stochastic Lanczos Quadrature (OSLQ). These techniques extend and unify adaptive Krylov procedures, stochastic estimation, and variance reduction, yielding algorithms with improved computational and statistical efficiency in log-determinant estimation.

Background and Motivation

Log-determinant computations frequently arise as normalization constants in high-dimensional Gaussian models—dictating model evidence, hyperparameter learning, and Bayesian inference. Traditional matrix decompositions become infeasible for large, sparse SPD matrices, prompting iterative, approximation-based schemes. Foundational to this landscape are:

Krylov subspace methods that use projections to efficiently approximate matrix functions—especially $f(Q)v$ for analytic $f$ , with particular attention to $f(x) = \log(x)$ .
Stochastic trace estimators, such as Hutchinson's method, which allow for scalable trace computation at the expense of stochastic variance.
Lanczos Quadrature, Stochastic Lanczos Quadrature (SLQ), and Hutch++—the latter enabling variance reduction with low-rank deflation and efficient matrix-vector querying.

Nonetheless, existing SLQ and Chebyshev-based methods incur high computational and memory costs due to full basis orthogonalization or slow convergence in ill-conditioned settings. Recent variants, such as block and incomplete orthogonalization Krylov approaches, highlight both the opportunities and bottlenecks of orthogonalization cost and statistical efficiency.

Methodology

OSA-IOP: Arnoldi Incomplete Orthogonalization for Log-Determinant Estimation

OSA-IOP generalizes the KIOPS framework, previously established for efficient computation of matrix exponentials, to the matrix logarithm. The central algorithm is the Arnoldi-IOP-LOG, which applies a short-recurrence (length-2) Incomplete Orthogonalization Procedure (IOP) to generate a Krylov basis $V_m$ for the shifted-scaled matrix $\widetilde Q$ . The log-determinant is then approximated as:

$\log(Q)v \approx \|v\|_2 \, V_m \log(H_m)e_1 + \log(\gamma)v,$

with $H_m$ the upper Hessenberg (projected) representation via IOP, and $\gamma$ a spectrum-centering parameter chosen as the geometric mean of the extremal eigenvalues:

$\gamma^* = \sqrt{\lambda_{\min} \lambda_{\max}}$

This optimally clusters the eigenvalues of $\widetilde Q$ around 1 in the log-domain, minimizing the worst-case error of the Krylov logarithm approximation.

Critically, OSA-IOP integrates this Krylov action with Hutch++ stochastic trace estimation, forming the OSA-IOP estimator for $f$ 0 and thus $f$ 1, while supplying error control by combining polynomial approximation bounds and stochastic variance contraction (see below for error results).

Figure 1: Log-likelihood estimation using OSA-IOP on a large-scale machine learning hyperparameter optimization task.

OSLQ: Optimal Stochastic Lanczos Quadrature

OSLQ is designed to combine SLQ’s Gauss-Lanczos quadrature with explicit variance reduction à la Hutch++. The method proceeds as follows:

Random Subspace Construction: Build a low-rank subspace $f$ 2 using a range finder with Rademacher/Gaussian sketches.
Low-Rank Deflation: Compute the deterministic quadrature contribution in the column-space of $f$ 3.
Residual Quadrature: Estimate the residual trace in the orthogonal complement via additional projected Hutchinson probes, each resolved through fast Gaussian quadrature in their respective Krylov subspaces.
Error Control: Employs Bernstein ellipse polynomial approximation results and explicit stochastic error bounds.

OSLQ thus remains function-agnostic (applicable for any analytic $f$ 4), matrix-free, and avoids the heavy cost of block-Lanczos methods or full function-based subspace construction.

Figure 2: Log-likelihood estimation using OSLQ, illustrating the estimator’s efficacy and convergence across a parameter sweep.

Error Analysis

Both OSA-IOP and OSLQ supply concrete error bounds which integrate:

Best polynomial approximation for analytic functions via Krylov subspaces, drawing upon Bernstein ellipse results.
Stability factors for IOP-based Krylov processes (in OSA-IOP), shown to only marginally inflate per-iteration error constants.
Variance reduction via Hutch++: For both methods, the stochastic estimation’s required sample size is provably reduced to $f$ 5, where $f$ 6 is target error, compared to the $f$ 7 scaling of naïve Hutchinson estimation.
Total combined error for the log-determinant is established as:

$f$ 8

for OSA-IOP, and analogous results for OSLQ with $f$ 9 the subspace dimension and statistical guarantees for the overall trace estimator.

Numerical Results

Extensive benchmarking was conducted on diverse real-world SPD matrices (sizes up to $f(x) = \log(x)$ 0) from the UF Sparse Matrix Collection. Key findings include:

Accuracy: OSA-IOP consistently achieves the smallest relative error across medium and large matrices, outperforming SLQ, So-SLQ, and Chebyshev-based competitors on standard reference tasks.
Efficiency: Both OSA-IOP and OSLQ deliver substantial reductions in computational time and number of matrix-vector products required, often by factors of 2–5, while maintaining or exceeding the accuracy of previous state-of-the-art methods.
Scalability: No reliance on explicit factorization or dense matrix storage; robustly handles highly sparse, ill-conditioned, and domain-specific operators.

Applications include grid-based Gaussian random field (GRF) estimation, where the likelihood surface can be accurately traced for hyperparameter selection using both OSA-IOP and OSLQ, confirming practical viability for large-scale ML/UQ tasks.

Implications and Future Directions

These methodologies demonstrate the feasibility of truly scalable log-determinant estimation for large-scale SPD matrices—removing reliance on full orthogonalization and spectral decompositions. This impacts:

Bayesian machine learning: enabling GP kernel learning, evidence maximization, and model selection at previously intractable scales.
Uncertainty quantification: empowering fast computation of information-theoretic risk and evidence in large parameter spaces.
Numerical linear algebra: suggesting further advances in matrix-function approximation, both for other functions and for indefinite or non-symmetric matrices.

A promising direction is the extension to block-structured and multi-vector Krylov methods for further variance reduction or applications in non-SPD or non-analytic settings.

Conclusion

The OSA-IOP and OSLQ algorithms introduced in (2606.07004) deliver advanced, theoretically grounded, and empirically validated frameworks for large-scale log-determinant estimation. By combining incomplete (Arnoldi) orthogonalization, optimal stochastic trace estimation, and structure-exploiting Krylov subspaces, these methods achieve efficiency, accuracy, and scalability unattainable with previous approaches. Their adoption in Bayesian ML, signal processing, and scientific computing is expected to drive further developments, including new hybridizations and performance optimizations for next-generation large-scale inference engines.

Markdown Report Issue