Predictive Spatial Covariances
- Predictive spatial covariances are joint uncertainty measures for spatial predictions at multiple sites, conditional on observed data and a statistical model.
- They underpin methods in kriging, Bayesian inference, and machine learning by enabling optimal interpolation and accurate uncertainty mapping.
- Recent advances integrate non-Euclidean domains and deep neural networks, ensuring scalability, model validity, and robust practical applications.
Predictive spatial covariances quantify the joint uncertainty of spatial predictions at multiple sites, conditional on observed data and a fitted model. In spatial statistics, machine learning, and related fields, predictive spatial covariances underlie principled uncertainty quantification, enable optimal spatial interpolation (kriging), drive spatial simulation, and inform experimental design. The structure, estimation, and use of predictive spatial covariances depend on the stochastic model class (e.g., Gaussian processes, Markov random fields, deep neural surrogates), the nature of spatial dependence (stationary, nonstationary, separable or not, Euclidean or manifold domains), and the inferential or algorithmic framework (frequentist, Bayesian, or post‐hoc). This article covers foundational results, algorithmic approaches, emerging models (including deep learning and non-Euclidean domains), and explicit methods for predictive covariance computation, drawing on modern research literature.
1. Foundations: Definition, Role, and General Principles
Predictive spatial covariance matrices express the joint uncertainty of spatial predictions at new locations, conditional on observed data. For a spatial field $Z(\cdot)$ observed at sites $s_1,\dots,s_n$ and to be predicted at $s_1^*,\dots,s_m^*$, the fundamental object is the conditional covariance
$$\Sigma^* = \operatorname{Cov}\big(Z(s_1^*),\dots,Z(s_m^*)\mid \mathbf{Z},\theta\big),$$
where $\mathbf{Z} = (Z(s_1),\dots,Z(s_n))^\top$ and $\theta$ collects model parameters.
In a stationary Gaussian process with known mean $\mu(\cdot)$ and covariance function $C(\cdot,\cdot)$, the joint law of observed and unobserved values is multivariate normal, and the predictive covariance is
$$\Sigma^* = \Sigma_{**} - \Sigma_{*o}\,\Sigma_{oo}^{-1}\,\Sigma_{o*},$$
where $\Sigma_{**}$, $\Sigma_{*o}$, $\Sigma_{oo}$, and $\Sigma_{o*}$ are the relevant covariance blocks (Cressie et al., 2015).
This formula underlies kriging variance, cokriging, and all linear optimal interpolation—key to spatial prediction (Cressie et al., 2015).
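As a concrete sketch of the block formula, the following evaluates the conditional covariance with dense linear algebra; the one-dimensional exponential covariance and all parameter values are illustrative assumptions, not tied to any particular reference implementation:

```python
import numpy as np

def exp_cov(x, y, sigma2=1.0, ell=0.5):
    """Stationary exponential covariance C(s, s') = sigma2 * exp(-|s - s'| / ell)."""
    d = np.abs(x[:, None] - y[None, :])
    return sigma2 * np.exp(-d / ell)

rng = np.random.default_rng(0)
s_obs = rng.uniform(0.0, 1.0, size=8)   # observed sites s_1, ..., s_n
s_new = np.array([0.25, 0.75])          # prediction sites s^*

# Covariance blocks of the joint (observed, new) Gaussian vector.
S_oo = exp_cov(s_obs, s_obs) + 1e-8 * np.eye(8)  # small jitter for stability
S_so = exp_cov(s_new, s_obs)                     # Sigma_{*o}
S_ss = exp_cov(s_new, s_new)                     # Sigma_{**}

# Predictive covariance: Sigma_{**} - Sigma_{*o} Sigma_{oo}^{-1} Sigma_{o*}.
pred_cov = S_ss - S_so @ np.linalg.solve(S_oo, S_so.T)
```

The diagonal of `pred_cov` gives the kriging variances; the off-diagonal entry is the joint uncertainty of the two predictions, which no pointwise variance map can recover.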
2. Predictive Covariances in Gaussian and Multivariate Fields
For multivariate spatial fields with joint covariance blocks $C_{ij}(s, s') = \operatorname{Cov}\big(Z_i(s), Z_j(s')\big)$, the predictive spatial covariance generalizes to a block matrix, with cross-covariances between field components and among prediction sites.
Multivariate Co-kriging Covariances
Given data $\mathbf{Z}$ at $n$ sites, the (co-)kriging predictor at $s^*$ is
$$\widehat{Z}(s^*) = \mathbf{c}^\top \Sigma_{oo}^{-1} \mathbf{Z},$$
with mean squared prediction error
$$\operatorname{MSPE}(s^*) = C(s^*, s^*) - \mathbf{c}^\top \Sigma_{oo}^{-1} \mathbf{c},$$
where $\mathbf{c} = \big(C(s^*, s_1), \dots, C(s^*, s_n)\big)^\top$; in the multivariate case $C$ and the entries of $\mathbf{c}$ are matrix-valued blocks (Bachoc et al., 2020, Cressie et al., 2015).
For arbitrary multivariate fields, covariance models must ensure nonnegative-definiteness of each block for all lags. Linear models of coregionalization (LMC) and full/parsimonious multivariate Matérn families are widely used for this purpose. The cross-covariance structure is crucial: invalid or misspecified models can yield negative predictive variances or inadmissible joint predictions (Cressie et al., 2015).
3. Bayesian and Nonstationary Predictive Covariances
Bayesian Predictive Covariances
In Bayesian spatial models, predictive spatial covariances must integrate over parameter uncertainty:
$$\Sigma^*_{\text{post}} = \int \operatorname{Cov}\big(\mathbf{Z}^* \mid \mathbf{Z}, \theta\big)\, \pi(\theta \mid \mathbf{Z})\, d\theta + \operatorname{Cov}_{\theta \mid \mathbf{Z}}\big(\mathbb{E}[\mathbf{Z}^* \mid \mathbf{Z}, \theta]\big),$$
where $\pi(\theta \mid \mathbf{Z})$ is the posterior for the covariance parameters. This "posterior predictive covariance" captures both process and parameter uncertainty (Erbisti et al., 2017).
Bayesian MCMC or variational inference is used to sample or average over uncertainty in covariance function parameters, regression coefficients, and hyperparameters. Nonseparable cross-covariance models and latent dimension/convex mixture constructions have been developed for nonstationary or nonseparable multivariate covariance structure (Erbisti et al., 2017).
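This averaging can be sketched with the law of total covariance; here, independent draws of a single hypothetical range parameter stand in for a real MCMC chain, and the data and sites are synthetic:

```python
import numpy as np

def exp_cov(x, y, ell):
    """Exponential covariance with range parameter ell."""
    d = np.abs(x[:, None] - y[None, :])
    return np.exp(-d / ell)

rng = np.random.default_rng(1)
s_obs = rng.uniform(0.0, 1.0, 6)
s_new = np.array([0.4, 0.6])
z = rng.normal(size=6)                      # observed data (illustrative)
ell_draws = rng.uniform(0.2, 0.8, 200)      # stand-in for posterior samples of the range

means, covs = [], []
for ell in ell_draws:
    S_oo = exp_cov(s_obs, s_obs, ell) + 1e-8 * np.eye(6)
    S_so = exp_cov(s_new, s_obs, ell)
    S_ss = exp_cov(s_new, s_new, ell)
    K = np.linalg.solve(S_oo, S_so.T).T      # Sigma_{*o} Sigma_{oo}^{-1}
    means.append(K @ z)                      # conditional mean given theta
    covs.append(S_ss - K @ S_so.T)           # conditional covariance given theta

# Law of total covariance: E_theta[Cov] + Cov_theta[E].
m = np.array(means)
post_pred_cov = np.mean(covs, axis=0) + np.cov(m.T, bias=True)
```

The second term, the spread of conditional means across parameter draws, is exactly the component a plug-in point estimate discards.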
Nonstationary and Nonparametric Covariances
Nonstationary predictive covariances arise in SPDE-based models and nonparametric Cholesky approaches:
- In SPDE frameworks (Fuglstad et al., 2013), the local operator coefficients $\kappa(s)$ and $H(s)$ encode spatially varying range and anisotropy, yielding a sparse GMRF precision matrix whose inverse gives predictive covariances for arbitrary configurations.
- In nonparametric Bayesian approaches, the sparse Cholesky factor of the precision is directly modeled, with prior shrinkage motivated by Matérn-type decay, and the posterior predictive covariance at new sites is obtained via the standard GP conditional formula but using the estimated or sampled Cholesky factors (Kidd et al., 2020).
These methods permit scalable computation (linear to nearly linear in $n$), handle large-scale and highly nonstationary spatial phenomena, and support explicit computation of, or sampling from, the full predictive spatial covariance structure (Kidd et al., 2020, Fuglstad et al., 2013).
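The precision-based route can be illustrated with a generic sparse precision matrix: selected predictive covariances are columns of $Q^{-1}$, each obtained by one sparse solve against a reused factorization. The chain-graph $Q$ below is a toy stand-in, not an actual SPDE discretization:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Illustrative sparse GMRF precision on a 1-D chain (an SPDE discretization
# would yield a similarly sparse, higher-dimensional Q).
n = 200
main = np.full(n, 2.2)
main[0] = main[-1] = 1.2
Q = sp.diags([-np.ones(n - 1), main, -np.ones(n - 1)], [-1, 0, 1], format="csc")

lu = spla.splu(Q)                 # sparse factorization, reused for many solves

def cov_column(j):
    """Column j of Q^{-1}, i.e. Cov(u, u_j), via a single sparse solve."""
    e = np.zeros(n)
    e[j] = 1.0
    return lu.solve(e)

c = cov_column(100)               # covariances of all sites with site 100
```

Because the factorization is computed once, extracting covariances for many site pairs costs only one triangular solve each.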
4. Predictive Covariances in Machine Learning and Deep Neural Surrogates
Standard machine and deep learning models often ignore spatial correlation, leading to miscalibrated prediction variance. Recent advances "adjust" spatial predictions and their covariances via two strategies:
- Spatial Decorrelation Transforms (e.g., Vecchia adjustment): preprocess the spatial data via a linear transform $T$ (an approximate Cholesky factor of a reference covariance), fit any ML/DL method to the decorrelated data, then reintroduce spatial dependence by inverting the transform; the predictive covariance at new locations is
$$\operatorname{Var}(\widehat{\mathbf{Y}}) = T^{-1}\,\operatorname{Var}(\widehat{\mathbf{U}})\,(T^{-1})^{\top},$$
where $T$ is the decorrelation operator (Heaton et al., 5 Oct 2024).
- Spatial Deep CNNs: spatial basis functions or learned embeddings (radial basis functions, multiresolution grids) are supplied at the input level, followed by a deep CNN with dropout. Uncertainty quantification is performed by MC-dropout; the predictive spatial covariance at multiple sites is estimated empirically from $T$ stochastic forward passes,
$$\widehat{\Sigma}^* = \frac{1}{T}\sum_{t=1}^{T}\big(\widehat{\mathbf{Y}}_t - \bar{\mathbf{Y}}\big)\big(\widehat{\mathbf{Y}}_t - \bar{\mathbf{Y}}\big)^{\top}, \qquad \bar{\mathbf{Y}} = \frac{1}{T}\sum_{t=1}^{T}\widehat{\mathbf{Y}}_t$$
(Wang et al., 11 Sep 2024). This Monte Carlo estimator recovers the full covariance among predicted values, allowing spatially varying credible intervals and uncertainty maps.
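The MC-dropout covariance estimator can be sketched with a stand-in stochastic predictor; the dropout rate and the toy "forward pass" below are placeholders, not the cited architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

def stochastic_forward_pass(x):
    """Stand-in for a network forward pass with dropout left active at test time."""
    mask = rng.random(x.shape) > 0.2           # Bernoulli dropout mask (rate 0.2)
    return (x * mask) / 0.8 + 0.1 * rng.normal(size=x.shape)

x = np.linspace(0.0, 1.0, 5)                   # 5 prediction sites
T = 500
draws = np.stack([stochastic_forward_pass(x) for _ in range(T)])  # shape (T, 5)

# Empirical predictive covariance over the T stochastic passes.
y_bar = draws.mean(axis=0)
Sigma_hat = (draws - y_bar).T @ (draws - y_bar) / T
```

The resulting matrix estimates the joint covariance among all predicted sites, from which uncertainty maps and simultaneous credible regions can be derived.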
These methods scale to tens or hundreds of thousands of spatial points and permit seamless integration into modern nonparametric learning pipelines.
5. Covariance Estimation, Sphere and Network Geometries, and Computational Issues
Spherical and Non-Euclidean Domains
When spatial fields are defined over the sphere (Earth or celestial domain), predictive covariances must be built from kernels that are positive-definite with respect to great-circle distance. Euclidean-based models can induce severe distortion at large scales; sphere-native covariance families (e.g., spherical Matérn, sine-power, Wendland) combined with kriging or co-kriging formulas yield valid predictive spatial covariances across the globe (Jeong et al., 2015, Alegría et al., 2017). For multivariate fields, "asymmetric" covariances can be constructed by spatial rotations that yield improved predictive performance (Alegría et al., 2017).
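A minimal sketch of a sphere-valid kernel: the exponential covariance (the Matérn family at smoothness 1/2) remains positive definite when evaluated in great-circle distance, whereas Matérn kernels with higher smoothness do not. Parameter values here are illustrative:

```python
import numpy as np

def great_circle(lat1, lon1, lat2, lon2):
    """Great-circle distance on the unit sphere (haversine formula), in radians."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = phi2 - phi1
    dlmb = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlmb / 2) ** 2
    return 2 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

def sphere_exp_cov(d, sigma2=1.0, ell=0.5):
    """Exponential covariance in great-circle distance d (radians)."""
    return sigma2 * np.exp(-d / ell)

d = great_circle(0.0, 0.0, 0.0, 90.0)   # a quarter great circle = pi/2
c = sphere_exp_cov(d)
```

Feeding chordal (straight-line) distance into the same kernel would understate distances between far-apart sites, which is the large-scale distortion noted above.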
Network Domains and Nonseparable Spatiotemporal Models
On spatial networks (graphs, trees, stream networks), predictive space-time covariances require careful construction. Generalized Gneiting classes, 1-symmetric metric models, and scale mixture approaches provide parametric families for kriging and variance prediction. The corresponding covariance, built from valid network metrics and temporal functions, is inserted into the standard kriging equations, and the familiar linear algebra applies unchanged (Tang et al., 2020, Hanks, 2015).
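One standard member of the Gneiting class of nonseparable space-time covariances can be written down directly; the parameter defaults below are illustrative (validity requires $\alpha, \gamma \in (0,1]$ and $\beta \in [0,1]$), and on a network the spatial argument would be a valid network metric, which this toy function does not check:

```python
import numpy as np

def gneiting_cov(h, u, sigma2=1.0, a=1.0, c=1.0, alpha=1.0, gamma=0.5, beta=1.0):
    """Gneiting-type nonseparable space-time covariance:
    C(h, u) = sigma2 / psi(u) * exp(-c * h^(2*gamma) / psi(u)^(beta*gamma)),
    with psi(u) = a * |u|^(2*alpha) + 1.  h = spatial lag, u = temporal lag."""
    psi = a * np.abs(u) ** (2 * alpha) + 1.0
    return sigma2 / psi * np.exp(-c * h ** (2 * gamma) / psi ** (beta * gamma))

# beta controls space-time interaction: beta = 0 gives a separable model.
C = gneiting_cov(h=1.0, u=0.5)
```

The value `C` would be one entry of the space-time covariance matrix plugged into the kriging equations.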
Computational Approaches
For large non-gridded datasets, hierarchical, multilevel, and sparse Cholesky methods are employed to obtain REML estimates and perform fast kriging prediction. Key features:
- Multi-level contrasts remove fixed effects and induce fast off-diagonal decay in the transformed covariance, making only local blocks necessary and enabling fast Cholesky or PCG (Castrillon-Candas et al., 2015).
- Sparse factorization and preconditioned Krylov subspace methods yield kriging variances/covariances and log-likelihood gradients with $O(n^{3/2})$ cost in 2D and $O(n^2)$ in 3D (Castrillon-Candas et al., 2015, Kidd et al., 2020).
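A minimal stand-in for these sparse/iterative schemes: a covariance taper with a nugget makes the kriging system sparse and well conditioned, so conjugate gradients can replace dense factorization. The taper, nugget, and site layout are illustrative assumptions; the hierarchical multilevel machinery of the cited work is not reproduced here:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

rng = np.random.default_rng(4)
n = 500
s = np.sort(rng.uniform(0.0, 10.0, n))
d = np.abs(s[:, None] - s[None, :])

# Compactly supported (Askey) taper induces sparsity; nugget aids conditioning.
taper = np.clip(1.0 - d, 0.0, None) ** 2
Sigma = sp.csr_matrix(np.exp(-d) * taper + 0.05 * np.eye(n))

# Kriging weights for a prediction site at s* = 5 via conjugate gradients.
c_vec = np.exp(-np.abs(s - 5.0)) * np.clip(1.0 - np.abs(s - 5.0), 0.0, None) ** 2
w, info = spla.cg(Sigma, c_vec)
var_reduction = c_vec @ w        # the c^T Sigma^{-1} c term of the MSPE
```

Each CG iteration costs one sparse matrix-vector product, so the solve scales with the number of nonzeros rather than $n^3$.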
6. Application Domains and Extensions
Spatial Feature Detectors (Computer Vision)
Spatial covariances for local features in images/data are modeled with detector-specific or structure-tensor-based estimates (full, anisotropic, or isotropic), which are linked to the Fisher information and Cramér-Rao bound of the detection process. These predictive covariances propagate into geometric vision downstream tasks (PnP, bundle adjustment), yielding theoretical and empirical gains in accuracy when used as uncertainty weights (Tirado-Garín et al., 2023).
Spatiotemporal and Volatility Models
Spatiotemporal GARCH models for volatility forecasting require predictive spatial covariances both for the contemporaneous innovations (typically modeled as spatially correlated Gaussian fields) and for the propagation of uncertainty through GARCH recursion, via kriging of innovations and recursive computation of multi-step conditional variances and covariances (Aouri et al., 11 Aug 2025).
Non-Gaussian and PDF Covariances (Cosmological applications)
Predictive covariances for binned PDF statistics in cosmology are constructed using two-point PDF models (e.g., shifted lognormal), yielding the full spatial covariance of histogram/count estimates including shot-noise and super-sample (global) variance effects (Uhlemann et al., 2022). These models permit likelihood-based inference on one-point density statistics in surveys, Fisher forecasting, and proper error propagation into downstream astrophysical/cosmological analyses.
7. Open Challenges and Research Directions
- Parameter uncertainty and plug-in bias: Plugging in point estimates of model hyperparameters systematically underestimates predictive variance, especially for small-sample or weakly identified settings (Rivera et al., 2014). Covariance-penalty and parametric bootstrap methods have been proposed to correct or bound this error.
- Model validity and flexibility: Ensuring nonnegative-definiteness of covariance/cross-covariance families remains crucial; many machine learning methods either ignore or approximate the spatial structure, requiring explicit modeling or adjustment layers for valid uncertainty quantification (Cressie et al., 2015).
- Scalability and adaptivity: Multiresolution, hierarchical, and sparse–approximation techniques are central to maintaining tractability in high dimensions and complex domains (Wang et al., 11 Sep 2024, Kidd et al., 2020).
- Manifold and network generalization: Effective models for the sphere, product domains, and general networks are an area of active development, especially to ensure physically realistic prediction and uncertainty in geosciences, climatology, and spatially indexed finance (Jeong et al., 2015, Tang et al., 2020).
Summary Table: Predictive Covariance Methods Across Domains
| Domain/Model | Predictive Covariance Formula | Scalability |
|---|---|---|
| Gaussian processes (stationary) | $\Sigma_{**} - \Sigma_{*o}\Sigma_{oo}^{-1}\Sigma_{o*}$ | $O(n^3)$ (dense) |
| Multivariate co-kriging | Block formula, plug-in or Bayesian estimation | $O((pn)^3)$ (dense, $p$ components) |
| Nonstationary/SPDE (GMRF approx.) | $\operatorname{Cov}(u(s^*), u(s)) = e_*^\top Q_C^{-1} e$ | $O(n^{3/2})$ in 2D |
| Bayesian Cholesky/Vecchia | Conditional formula (with sampled Cholesky) | Near-linear in $n$ |
| Deep neural surrogate (SDCNN) | Empirical MC covariance (MC-dropout ensemble) | Linear in $n$; $T$ = #MC passes |
| Machine learning (spatial transform) | $\operatorname{Var}(\widehat{Y}) = T^{-1}\operatorname{Var}(\widehat{U})(T^{-1})^\top$ | Cost of transform plus base learner |
| Spherical/network domains | Sphere- or network-valid covariance in kriging | $O(n^3)$; sparsity possible |
Predictive spatial covariances are fundamental to spatial statistics, spatial machine learning, and uncertainty quantification in scientific modeling. Their computation and interpretation depend on model validity, statistical framework, and algorithmic scalability. Current research advances address non-Euclidean geometry, high-dimensionality, model misspecification, and integration with modern machine learning architectures.