
Gaussian Process Convolution Models (CGPCM)

Updated 20 January 2026
  • CGPCM is a nonparametric framework that synthesizes Gaussian process theory with convolutional stochastic processes to model causal, non-smooth dynamical systems.
  • It employs causal convolution filters to flexibly represent covariance structures and spectral properties in time series, spatial, and spatio-temporal data.
  • CGPCMs extend to multivariate outputs and count data regression with efficient state-space representations and advanced Bayesian inference methods.

Gaussian Process Convolution Models (CGPCM) form a broad family of nonparametric probabilistic modeling frameworks for stationary, often non-smooth, causal dynamical processes in time and space-time. These models synthesize the classical Gaussian process (GP) machinery with convolutional stochastic process theory, introducing flexibility in the representation of covariance structure, non-separability, and causality. CGPCMs have seen successful application to count data regression, complex time series, and spatio-temporal dynamical systems.

1. Mathematical Construction and Core Principles

A Gaussian Process Convolution Model (GPCM) generates a real-valued process $f(t)$ (or $f(x, t)$ in space-time) as the output of a stochastic linear filter applied to a base driving noise, typically white noise $w(t)$:

$$f(t) = \int_{0}^{\infty} h(\tau)\, w(t - \tau)\, d\tau$$

where $h(\tau)$ is a causal, nonparametric filter, such that $h(t) = 0$ for $t < 0$, and $w(t) \sim \mathcal{GP}(0, \delta(t - t'))$ (Bruinsma et al., 2018, Bruinsma et al., 2022). This construction generalizes naturally to spatial or spatio-temporal domains via multi-dimensional convolution:

$$f(x, t) = \iint K(x - s,\, t - u)\, dW(s, u)$$

where $W$ is a Brownian sheet, possibly with spatial correlation (Zhang et al., 1 Dec 2025).
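This generative mechanism can be simulated directly. The sketch below (a minimal illustration, not the implementation of the cited papers) discretizes the causal convolution on a regular grid; the squared-exponential filter and all grid settings are hypothetical choices:

```python
import numpy as np

# Minimal sketch: approximate f(t) = \int_0^inf h(tau) w(t - tau) dtau
# by a causal moving average on a regular grid.
rng = np.random.default_rng(0)

dt = 0.01                                # grid spacing
t = np.arange(0.0, 10.0, dt)             # output times
tau = np.arange(0.0, 2.0, dt)            # truncated filter support (tau >= 0)

h = np.exp(-0.5 * (tau / 0.3) ** 2)      # hypothetical causal filter h(tau)
w = rng.standard_normal(t.size + tau.size) / np.sqrt(dt)  # discretized white noise

# f[i] ~= sum_k h[k] * w(t_i - tau_k) * dt; 'valid' mode keeps only outputs
# for which the full filter support is available.
f = dt * np.convolve(w, h, mode="valid")[: t.size]
```

Because `h` vanishes for negative arguments, each `f[i]` depends only on past noise, which is exactly the causality property emphasized below.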

In multi-output or multivariate settings, each output $f_i(x)$ can be constructed by convolving a shared and/or private noise process with a designated kernel $G_i$:

$$f_i(x) = \int G_i(x - t)\, u(t)\, dt, \quad i = 1, \ldots, m$$

This yields rich, flexible marginal and cross-covariance structures, essential for modeling dependent multivariate outputs (Sofro et al., 2017).

The corresponding covariance structure is given, after marginalizing the noise, by closed-form integrals:

$$\mathrm{Cov}\left[f_i(x), f_j(x')\right] = \iint G_i(x - t)\, G_j(x' - t')\, k_u(t, t')\, dt\, dt'$$

for general driving process covariance $k_u(t, t')$ (Sofro et al., 2017).
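For white-noise driving, $k_u(t, t') = \delta(t - t')$ collapses the double integral to $\int G_i(x - t)\, G_j(x' - t)\, dt$, which is straightforward to evaluate numerically. The following sketch assumes hypothetical Gaussian smoothing kernels $G_i$; it illustrates the covariance formula and is not code from the cited work:

```python
import numpy as np

def G(x, ell):
    """Hypothetical smoothing kernel: a Gaussian bump with length scale ell."""
    return np.exp(-0.5 * (x / ell) ** 2)

def cross_cov(x, xp, ell_i, ell_j, lo=-20.0, hi=20.0, n=4001):
    """Riemann approximation of the integral of G_i(x - t) G_j(x' - t) dt,
    i.e. Cov[f_i(x), f_j(x')] under white-noise driving."""
    t = np.linspace(lo, hi, n)
    return np.sum(G(x - t, ell_i) * G(xp - t, ell_j)) * (t[1] - t[0])

# Cross-covariance between two outputs with different smoothing scales:
print(cross_cov(0.0, 0.5, ell_i=0.3, ell_j=0.8))
```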

2. Covariance Functions, Causality, and Spectral Structure

The CGPCM induces a covariance for $f$ given by a "one-sided" autocorrelation of the nonparametric filter:

$$k_f(\Delta) = \int_{0}^{\infty} h(u)\, h(u + |\Delta|)\, du$$

where $h$ is a priori a GP itself, with kernel $k_h$ restricted to the nonnegative real line (Bruinsma et al., 2018).

This construction enforces causality, as $h(t) = 0$ for $t < 0$, thereby admitting only those GP covariances whose spectral factorization involves analytic (causal) transfer functions $H(i\omega)$ in the upper half-plane. The spectral density is:

$$S_f(\omega) = |H(i\omega)|^2$$

with $H(i\omega)$ the Laplace transform of $h$ over $[0, \infty)$ (Bruinsma et al., 2018, Bruinsma et al., 2022). This bias toward causal kernels sharpens spectral peaks and provides physically meaningful priors for time series generated by causal mechanisms.
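The sketch below illustrates both formulas for a hypothetical damped-oscillator filter: the one-sided autocorrelation gives $k_f(\Delta)$, and a discretized one-sided Fourier/Laplace transform gives $S_f(\omega) = |H(i\omega)|^2$. All numerical settings are illustrative:

```python
import numpy as np

du = 0.01
u = np.arange(0.0, 5.0, du)
h = np.exp(-2.0 * u) * np.cos(6.0 * u)   # hypothetical causal filter (h = 0 for u < 0)

def k_f(delta):
    """One-sided autocorrelation: k_f(Delta) = integral_0^inf h(u) h(u + |Delta|) du."""
    shift = int(round(abs(delta) / du))
    return np.sum(h[: h.size - shift] * h[shift:]) * du

# Spectral density S_f(omega) = |H(i omega)|^2, with H the one-sided transform of h:
omega = np.linspace(0.0, 20.0, 201)
H = np.array([np.sum(h * np.exp(-1j * om * u)) * du for om in omega])
S_f = np.abs(H) ** 2   # peaks near omega = 6, the filter's oscillation frequency
```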

Distinct choices for $h$ and its prior kernel $k_h$ lead to GPCMs of varying roughness. For instance, setting $h(0) = 0$ produces smoother sample paths, while $h(0) \neq 0$ yields sample paths that are almost surely nowhere differentiable, resembling Brownian motion (Bruinsma et al., 2022). In frequency terms, this modifies the high-frequency decay of $S_f(\omega)$ and enables modeling of both smooth and non-smooth signals, overcoming the limitations of classical spectral mixture or squared exponential kernels.
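A self-contained numerical check of this effect, using two hypothetical filters (the decay exponents in the comments follow from the standard large-$\omega$ asymptotics of one-sided transforms):

```python
import numpy as np

du = 0.01
u = np.arange(0.0, 5.0, du)
omega = np.linspace(5.0, 50.0, 100)

def spectrum(h_vals):
    """S_f(omega) = |H(i omega)|^2 for a discretized causal filter."""
    H = np.array([np.sum(h_vals * np.exp(-1j * om * u)) * du for om in omega])
    return np.abs(H) ** 2

S_smooth = spectrum(u * np.exp(-2.0 * u))   # h(0) = 0: S_f ~ omega^-4, smoother paths
S_rough = spectrum(np.exp(-2.0 * u))        # h(0) = 1: S_f ~ omega^-2, rough paths
```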

3. Extensions: Multivariate Outputs, Space-Time, and Count Models

CGPCMs encompass generalizations that address multi-output, spatio-temporal, and non-Gaussian contexts:

a) Multivariate/Dependent Count Regression.

The multivariate Convolved Gaussian Process (CGP) model for count data regression constructs each output as a sum of shared and individual CGPs:

  • Shared component: $\xi_a(x) = \int h_a(x - t)\, y_0(t)\, dt$
  • Individual component: $\eta_a(x) = \int g_a(x - t)\, y_a(t)\, dt$

with $T_a(x) = \xi_a(x) + \eta_a(x)$. The covariance and cross-covariance entries are explicitly:

$$\mathrm{Cov}[T_a(x), T_a(x')] = \mathrm{Cov}[\xi_a(x), \xi_a(x')] + \mathrm{Cov}[\eta_a(x), \eta_a(x')]$$

$$\mathrm{Cov}[T_1(x), T_2(x')] = \mathrm{Cov}[\xi_1(x), \xi_2(x')]$$

A multivariate Poisson observation model,

$$z_{a,i} \mid T_a(x_{a,i}) \sim \mathrm{Poisson}(\lambda_{a,i}), \quad \log \lambda_{a,i} = U_{a,i}^\top \beta_a + T_a(x_{a,i}),$$

permits flexible modeling of count data with dependent outputs (Sofro et al., 2017).
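A generative sketch of this observation model (the covariates, coefficients, and the stand-in latent draw below are all hypothetical; in the actual model $T_a$ is the convolved GP defined above):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 50
U = np.column_stack([np.ones(n), rng.normal(size=n)])  # covariates U_{a,i}
beta_a = np.array([0.5, -0.2])                         # regression coefficients
T_a = 0.3 * rng.normal(size=n)                         # stand-in for the latent CGP T_a(x_{a,i})

log_lam = U @ beta_a + T_a                             # log lambda_{a,i} = U^T beta_a + T_a
z = rng.poisson(np.exp(log_lam))                       # counts z_{a,i} ~ Poisson(lambda_{a,i})
```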

b) Spatio-Temporal and Nonseparable Covariances.

Space-time CGPCMs are constructed using convolution integrals over both space and time, often with kernels such as:

$$K(\xi, \tau) = \exp(-\eta \tau)\, g_s(\xi;\, \mu \tau,\, \Sigma \tau)\, I_{\tau \ge 0}$$

resulting in nonseparable, closed-form covariances, linked to solutions of stochastic partial differential equations (SPDEs) (Zhang et al., 1 Dec 2025).
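The kernel above can be transcribed directly; the sketch below uses a two-dimensional spatial domain and illustrative parameter values ($\eta$, $\mu$, $\Sigma$ are not taken from the cited paper):

```python
import numpy as np

def K(xi, tau, eta=0.5, mu=np.array([1.0, 0.0]), Sigma=np.eye(2)):
    """K(xi, tau) = exp(-eta tau) g_s(xi; mu tau, Sigma tau) 1{tau >= 0},
    with g_s a Gaussian density whose mean drifts and variance grows with tau."""
    if tau <= 0.0:                 # causality in time: I_{tau >= 0} (tau = 0 is degenerate)
        return 0.0
    cov = Sigma * tau
    d = xi - mu * tau
    norm = 1.0 / (2.0 * np.pi * np.sqrt(np.linalg.det(cov)))
    return np.exp(-eta * tau) * norm * np.exp(-0.5 * d @ np.linalg.solve(cov, d))

print(K(np.array([0.5, 0.0]), 0.5))
```

The $\exp(-\eta\tau)$ factor damps the past, while the drifting, spreading Gaussian encodes advection and diffusion, which is why the induced covariance is nonseparable in space and time.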

c) Rough GPCM (RGPCM).

By relaxing the driving noise from white noise to, e.g., an Ornstein-Uhlenbeck (Matérn-1/2) process, and/or by adopting "rough" causal filters, one obtains the RGPCM, which generalizes the fractional Ornstein-Uhlenbeck process and admits maximally non-smooth sample paths (Bruinsma et al., 2022).

4. State-Space Representations and Dynamical Interpretation

CGPCMs possess equivalent infinite-dimensional linear state-space, or stochastic PDE, representations:

$$\partial_t f(s, t) = -\mu^\top \nabla f + \tfrac{1}{2} \nabla \cdot (\Sigma \nabla f) - \eta f + \epsilon(s, t)$$

These SPDEs can be projected onto a finite-dimensional basis (e.g., Fourier modes) using Galerkin methods, resulting in finite SDE representations:

$$d\theta(t) = A_N\, \theta(t)\, dt + B_N\, d\xi(t)$$

where $A_N$ and $B_N$ are computable from the projection, and $\theta(t)$ collects the basis coefficients. Such reductions make real-time state estimation (using Kalman filtering) feasible for moderate-dimensional approximations ($N$ in the tens to hundreds) (Zhang et al., 1 Dec 2025).
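A compact sketch of this pipeline, Euler-Maruyama discretization of the projected SDE plus a textbook Kalman filter, is given below; $A_N$, $B_N$, and the observation matrix $C$ are illustrative stand-ins rather than the Galerkin projection of any particular SPDE:

```python
import numpy as np

rng = np.random.default_rng(2)
N, dt, steps = 4, 0.01, 500

A_N = -np.eye(N) + 0.1 * rng.normal(size=(N, N))  # drift matrix (illustrative, stable-ish)
B_N = 0.2 * np.eye(N)                             # noise loading
C = rng.normal(size=(1, N))                       # observe one linear functional of theta
R = np.array([[0.05]])                            # observation noise variance

F = np.eye(N) + A_N * dt                          # Euler-Maruyama transition matrix
Q = B_N @ B_N.T * dt                              # per-step process noise covariance

theta = np.zeros(N)                               # true state
m, P = np.zeros(N), np.eye(N)                     # filter mean and covariance
for _ in range(steps):
    theta = F @ theta + rng.multivariate_normal(np.zeros(N), Q)  # simulate the state
    y = C @ theta + rng.multivariate_normal(np.zeros(1), R)      # noisy observation
    m, P = F @ m, F @ P @ F.T + Q                                # Kalman predict
    S = C @ P @ C.T + R
    K = P @ C.T @ np.linalg.inv(S)                               # Kalman gain
    m = m + K @ (y - C @ m)                                      # Kalman update
    P = (np.eye(N) - K @ C) @ P
```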

Gradients with respect to hyperparameters are available in closed form, and model selection can be performed efficiently in this dynamical framework. Monitoring of process derivatives, anomaly detection, and change-point identification follow directly from this structure.

5. Inference Methodologies

Inference in CGPCMs leverages advanced approximate methods due to the intractability of exact Bayesian calculation over function-valued latent variables:

  • Structured Variational Inference (SVI): Employs inducing variables for both the filter $h$ and the (transformed) base noise, using structured mean-field (SMF) or mean-field (MF) approximations. SMF retains posterior dependencies and tightens the evidence lower bound (ELBO) (Bruinsma et al., 2018, Bruinsma et al., 2022).
  • Gibbs Sampling for Optimal Variational Solutions: The SVI approach is further improved via a direct Gibbs sampler on the blocks of inducing variables $(u, z)$, sampling alternately from $q^*(u \mid z)$ and $q^*(z \mid u)$, bypassing gradient-based optimization and preserving posterior correlations (Bruinsma et al., 2022).
  • Laplace Approximation: For count data with a Poisson likelihood, the Laplace approximation finds the mode of the unnormalized log-posterior and approximates the log-marginal likelihood via the Hessian at the mode, as sketched after this list (Sofro et al., 2017).
  • Kalman Filtering: In the state-space (SDE) formulation, Kalman filtering and smoothing enable efficient inference and prediction for discretized state trajectories and observations (Zhang et al., 1 Dec 2025).
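The Laplace step for the Poisson case admits a short self-contained sketch (Newton iteration on the latent values; the prior covariance `K` and the data are hypothetical inputs, and constants in $z!$ are dropped from the marginal):

```python
import numpy as np

def laplace_poisson(K, z, iters=20):
    """Laplace approximation for z_i ~ Poisson(exp(T_i)), T ~ N(0, K).
    Returns the posterior mode and the approximate log-marginal likelihood
    (up to additive constants in z)."""
    n = z.size
    K_inv = np.linalg.inv(K)
    T = np.zeros(n)                              # latent log-rates, start at prior mean
    for _ in range(iters):
        lam = np.exp(T)
        grad = (z - lam) - K_inv @ T             # gradient of the log-posterior
        T = T + np.linalg.solve(np.diag(lam) + K_inv, grad)  # Newton step
    lam = np.exp(T)
    # log q(z) = log p(z|T) - 0.5 T' K^{-1} T - 0.5 log|I + K W|, with W = diag(lam)
    log_marg = (np.sum(z * T - lam) - 0.5 * T @ K_inv @ T
                - 0.5 * np.linalg.slogdet(np.eye(n) + K @ np.diag(lam))[1])
    return T, log_marg
```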

Hyperparameters (such as kernel amplitudes, length scales, decay, and others) are typically learned by maximizing the relevant ELBO or Laplace-approximated marginal likelihood, with gradients accessible from the chosen approximation.

6. Predictive Distributions, Uncertainty Quantification, and Empirical Evaluation

Posterior predictive distributions at new input locations are derived through conditioning on the inducing variables and observations. For variational or Gibbs methods, the prediction for $f(t_*)$ integrates over the auxiliary inducing variables, yielding mixture-of-Gaussians posteriors. Predictive mean and variance follow by standard Gaussian conditioning:

$$\mathbb{E}[f(t_*)] = [k_{*u},\ k_{*z}] \begin{pmatrix} K_u & 0 \\ 0 & K_z \end{pmatrix}^{-1} \begin{pmatrix} u \\ z \end{pmatrix}$$

where expectations are averaged over variational/Gibbs samples (Bruinsma et al., 2022, Bruinsma et al., 2018).
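Because the inducing blocks enter through a block-diagonal covariance, the conditioning splits into two independent solves, as the minimal sketch below shows (all arguments are illustrative stand-ins for $k_{*u}$, $k_{*z}$, $K_u$, $K_z$ and samples of $u$, $z$):

```python
import numpy as np

def predictive_mean(k_su, k_sz, K_u, K_z, u, z):
    """E[f(t_*)] = [k_{*u}, k_{*z}] blockdiag(K_u, K_z)^{-1} [u; z].
    The block-diagonal inverse factorizes into two independent solves."""
    return k_su @ np.linalg.solve(K_u, u) + k_sz @ np.linalg.solve(K_z, z)
```

In the variational or Gibbs schemes, this mean is averaged over posterior samples of $(u, z)$ to obtain the mixture-of-Gaussians predictive distribution described above.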

Experiments robustly demonstrate the empirical benefits of the CGPCM framework:

  • In synthetic AR(2) and real hydrology data, the CGPCM achieves up to 20% lower MSE and about +0.4 nats per datapoint higher predictive log-likelihood than noncausal GPCM and standard GP kernels (Bruinsma et al., 2018).
  • The causal assumption in CGPCM reduces mean log loss (MLL) by approximately 0.5 nats relative to the acausal GPCM, and RGPCM achieves additional gains on complex financial time series (Bruinsma et al., 2022).
  • The Gibbs sampler maintains accurate posterior uncertainty, avoiding the under-calibration seen in mean-field approximations (Bruinsma et al., 2022).
  • Spatio-temporal CGPCMs are effective for modeling, monitoring, and anomaly detection, e.g., in wildfire aerosol remote-sensing, with explicit derivative tracking via the state-space formulation (Zhang et al., 1 Dec 2025).
  • The multivariate CGP regression for counts provides accurate estimation and prediction across multiple outputs, flexibly modeling shared and individual structures (Sofro et al., 2017).

7. Theoretical Guarantees and Broader Connections

All finite blockwise combinations of CGPCM-induced covariances are positive definite, ensuring a valid GP structure (Sofro et al., 2017). The spectral bias toward causal transfer functions restricts the class of admissible kernels, enhancing model interpretability and aligning with physical principles in dynamical systems (Bruinsma et al., 2018).

CGPCM encompasses and generalizes the broader family of GP convolution models developed for multi-output and spatio-temporal tasks, subsuming classical works (e.g., Álvarez, Boyle & Frean), and extending them with causal, non-smooth, and non-Gaussian capabilities (Bruinsma et al., 2018, Sofro et al., 2017). RGPCM further connects with Bayesian nonparametric generalizations of fractional Ornstein-Uhlenbeck processes, enabling nonparametric spectral modulation (Bruinsma et al., 2022).

The state-space interpretation links CGPCM to stochastic PDEs, with the convolution GP law matching the law of the SPDE solution, yielding a coherent framework for both computational efficiency and theoretical analysis (Zhang et al., 1 Dec 2025).

