Papers
Topics
Authors
Recent
Search
2000 character limit reached

Gaussian Process Quadrature (GPQ)

Updated 26 February 2026
  • Gaussian Process Quadrature (GPQ) is a probabilistic numerical integration method that models integrals via Gaussian processes, yielding both point estimates and calibrated uncertainty.
  • It employs various node selection strategies—random, deterministic, and active learning—to enhance accuracy and efficiency in estimating integrals.
  • GPQ generalizes classical quadrature rules, offering rigorous theoretical convergence rates and practical guidelines for kernel selection and scalable computational methods.

Gaussian Process Quadrature (GPQ) is a probabilistic framework for numerical integration that models the unknown integrand as a sample from a Gaussian process (GP). By leveraging the GP posterior conditional on observed function evaluations, GPQ delivers both point estimates and quantified uncertainty for integrals, which are crucial in Bayesian inference, filtering, uncertainty propagation, and other scientific computing domains. This approach subsumes classical quadrature rules under specific kernel and node choices and introduces a systematic methodology for model-based integration with rigorous theoretical and practical guarantees (Mahsereci et al., 18 Feb 2026).

1. Mathematical Model and Core Formulation

GPQ addresses the estimation of integrals of the form

I=Df(x)p(x)dx,I = \int_D f(x) p(x) \, dx,

where DRdD \subseteq \mathbb{R}^d, p(x)p(x) is a known density, and ff is unknown but evaluated at a finite set of points. GPQ imposes a Gaussian process prior on ff,

fGP(m(),k(,)),f \sim \mathcal{GP}(m(\cdot), k(\cdot,\cdot)),

with mean function m:DRm: D \to \mathbb{R} and positive-definite covariance kernel k:D×DRk: D \times D \to \mathbb{R}. Conditioned on NN noiseless evaluations yi=f(xi)y_i = f(x_i) at nodes X={x1,,xN}X = \{x_1, \ldots, x_N\}, the posterior is again a GP: μN(x)=m(x)+kxXK1(ymX), ΣN(x,x)=k(x,x)kxXK1kXx,\begin{aligned} \mu_N(x) &= m(x) + k_{xX} K^{-1}(y - m_X), \ \Sigma_N(x,x') &= k(x,x') - k_{xX} K^{-1} k_{Xx'}, \end{aligned} where KK is the kernel matrix on XX, kxX=(k(x,x1),,k(x,xN))k_{xX} = (k(x,x_1),\ldots,k(x,x_N)), and mXm_X is the vector of prior means at XX.

The posterior distribution for II is Gaussian, with explicit expressions: kP,X=(k(x1,x)p(x)dx,,k(xN,x)p(x)dx)T, mP=m(x)p(x)dx, kPP=k(x,x)p(x)p(x)dxdx,\begin{aligned} k_{P,X} &= \left(\int k(x_1,x) p(x) dx, \ldots, \int k(x_N,x) p(x) dx\right)^T, \ m_P &= \int m(x) p(x) dx, \ k_{PP} &= \iint k(x,x') p(x) p(x') dx dx', \end{aligned} so that

E[Iy]=mP+kP,XTK1(ymX), V[Iy]=kPPkP,XTK1kP,X.\begin{aligned} \mathbb{E}[I \mid y] &= m_P + k_{P,X}^T K^{-1}(y - m_X), \ \mathbb{V}[I \mid y] &= k_{PP} - k_{P,X}^T K^{-1} k_{P,X}. \end{aligned}

This provides both an estimator and a calibrated uncertainty measure for II (Mahsereci et al., 18 Feb 2026, Särkkä et al., 2015, Prüher et al., 2017).

2. Node Selection Strategies and Sampling Paradigms

Node selection in GPQ directly controls the efficiency and accuracy of the estimator. Strategies include:

  • Random (Bayesian Monte Carlo): Nodes xix_i sampled i.i.d. from p(x)p(x). Variants employ importance sampling or determinantal point processes to improve coverage or diversity.
  • Space-filling deterministic designs: Regular grids and classical Gaussian quadrature rules (e.g., tensorized Gauss–Hermite), or quasi-Monte Carlo sequences (Sobol, lattice) are used for improved uniformity and optimal convergence for smooth integrands.
  • Active learning (Myopic or Sequential Design): Nodes are selected adaptively by maximizing an acquisition function, such as negative posterior variance, mutual information, or predictive uncertainty:

xn+1=argmaxxU(x),x_{n+1} = \arg\max_x \mathcal{U}(x),

where U(x)\mathcal{U}(x) quantifies value according to the chosen utility criterion. This sequential approach often yields lower posterior variance for a fixed budget (Mahsereci et al., 18 Feb 2026).

3. Taxonomy and Extensions of GPQ

The GPQ methodology can be systematically classified along three axes (Mahsereci et al., 18 Feb 2026):

A. Modelling:

  • Conjugate GPQ: Direct GP prior on ff with Gaussian integral posterior (exact).
  • Warped or Non-conjugate GPQ: Transformations such as f=ϕ(g)f = \phi(g), gGPg \sim \mathcal{GP} (e.g. exponentiation for nonnegative integrands), which necessitate approximate inference techniques.

B. Inference:

  • Exact/Analytic: Closed-form expressions for posterior mean and variance of the integral.
  • Approximate: Required when kernel means or double integrals are intractable; approaches include Laplace approximation, variational Bayes, or kernel quadrature of kernel means.
  • Scalable: Sparse-GP, random Fourier features, or kernel interpolation methods reduce cubic scaling to increase tractability for large NN.

C. Sampling:

  • Static/Model-independent: Node sets chosen without reference to current surrogate model, as in standard grids.
  • Sequential/Active: Nodes chosen adaptively by maximizing acquisition functions, targeting maximal reduction in integration uncertainty.

Non-conjugate strategies and scalable inference are especially relevant for high-dimensional or computationally expensive integrations as in modern Bayesian inference pipelines (Li et al., 2023, Hamid et al., 2021).

4. Theoretical Guarantees and Convergence Rates

Let H(k)H(k) denote the reproducing kernel Hilbert space (RKHS) of the kernel kk. Define the worst-case error

ek(w;X)=supgH1Dg(x)p(x)dxi=1Nwig(xi).e_k(w;X) = \sup_{\|g\|_{\mathcal{H}} \leq 1} \left|\int_D g(x) p(x) dx - \sum_{i=1}^N w_i g(x_i)\right|.

The GPQ weights w=K1kP,Xw = K^{-1} k_{P,X} minimize ek(w;X)e_k(w;X), and the posterior variance Σ\sqrt{\Sigma} equals this minimal RKHS error (Mahsereci et al., 18 Feb 2026).

For fHβ(D)f \in H^\beta(D) with a Matérn kernel of smoothness ν\nu (so that H(k)HαH(k) \approx H^\alpha, α=ν+d/2\alpha = \nu + d/2), when nodes are quasi-uniform with fill distance hNN1/dh_N \approx N^{-1/d} and βα\beta \leq \alpha,

IμNNβ/dfHβ(D),|I - \mu_N| \lesssim N^{-\beta/d}\|f\|_{H^\beta(D)},

attaining the optimal Sobolev rate. For Bayesian Monte Carlo (random nodes),

E[IμN]N1/2\mathbb{E}[|I-\mu_N|] \lesssim N^{-1/2}

for all fHf \in \mathcal{H}, matching classical Monte Carlo but often with smaller constants (Mahsereci et al., 18 Feb 2026).

5. Computational Complexity and Scalability

Classical GPQ has cubic computational complexity in the number of nodes: forming and inverting the kernel matrix KK costs O(N3)O(N^3) flops and O(N2)O(N^2) memory. Kernel means and double integrals (kP,Xk_{P,X}, kPPk_{PP}) require either closed form or secondary quadrature techniques. Numerical stability is enhanced by adding a small diagonal nugget λI\lambda I to KK (Mahsereci et al., 18 Feb 2026).

Scalable variants involve:

  • Sparse-GP surrogates (inducing points) with cost O(M3)O(M^3), MNM \ll N (Li et al., 2023).
  • Random Fourier features and kernel interpolation for efficient kernel approximation (Mahsereci et al., 18 Feb 2026).
  • Quantum acceleration: Quantum Gaussian Process Quadrature combines Hilbert-space kernel approximation with quantum algorithms (phase estimation, qPCA, Hadamard/SWAP tests), yielding gate complexity polylogarithmic in problem size and a polynomial speedup over classical methods for large-scale settings (Galvis-Florez et al., 20 Feb 2025).

6. Practical Guidelines, Applications, and Limitations

Practical Guidance

  • Kernel selection: Choice of kernel (e.g., Matérn with νβd/2\nu \approx \beta-d/2) should match prior smoothness assumptions on ff.
  • Prior mean: Zero mean is generally sufficient; trends can be handled via explicit basis functions.
  • Hyperparameters: Set by marginal likelihood or cross-validation; beware of overfitting, especially in adaptive node schemes. Use an initial space-filling design to mitigate this.
  • Numerical Stability: Add minimal necessary jitter/nugget to KK.
  • Software: Implementations include EmuKit and ProbNum (Python), GAIL (MATLAB), and standard GP libraries (GPy, GPflow, GPyTorch) for custom code.

Applications

GPQ is widely applied for:

Limitations

  • Kernel mean requirement: Closed-form kernel means are available only for particular kernels and measures (notably, Gaussian densities with RBF or polynomial kernels), restricting general applicability.
  • Cubic scaling: For large NN, classical GPQ becomes impractical unless approximate methods are used.
  • Active selection caveats: Overfitting can degrade uncertainty quantification, emphasizing the need for regularization and robust hyperparameter settings.

7. Connections to Classical Quadrature and Sigma-Point Methods

GPQ generalizes and subsumes all classical sigma-point quadrature methods:

Classical Rule GP Kernel Node Set GPQ Specialization
Unscented Transform (UT) Polynomial, deg. 3 $2n+1$ sym. sigmapoints Exact for degree-3 polynomials
Gauss–Hermite (GH) Tensor-poly, deg. $2P-1$ Product of 1D GH roots Exact for degree $2P-1$ polynomials
Cubature (CKF etc.) Poly, deg. 3 Axis-aligned sym. points Relation to high-order, symmetric rules

For suitable kernels and nodes, GPQ recovers classical weights and achieves zero posterior variance when the integrand is in the span of the model. Employing smooth (e.g., squared-exponential) kernels and Bayes-optimal or quasi-random nodes, GPQ often attains lower RMSE and superior uncertainty quantification relative to traditional quadratures in moderate-dimensional, smooth-integration regimes (Särkkä et al., 2015, Prüher et al., 2017). GPQ enables principled construction of new, low-variance, and adaptive quadrature schemes for advanced estimation and filtering applications.

Topic to Video (Beta)

No one has generated a video about this topic yet.

Whiteboard

No one has generated a whiteboard explanation for this topic yet.

Follow Topic

Get notified by email when new papers are published related to Gaussian Process Quadrature (GPQ).