Nonstationary Gaussian Fields
- Nonstationary Gaussian fields are defined by covariance functions that vary with absolute locations, capturing spatial and temporal inhomogeneity.
- They are constructed using process convolutions, SPDE methods, and neural network parameterizations to enable scalable and flexible modeling.
- These models are crucial in applications such as climate science, neuroimaging, and machine learning, providing improved uncertainty quantification and predictive accuracy.
A nonstationary Gaussian field is a family of real-valued random processes or spatial fields whose finite-dimensional marginals are multivariate normal but whose covariance structure depends on absolute locations rather than only on increments, and may vary spatially, temporally, or over more general structured domains. The defining feature is that the covariance kernel $C(s, s')$ depends on the pair of locations in a non-translation-invariant way, allowing location-dependent variance, correlation length, anisotropy, or spectral content. Such fields are central to contemporary spatial statistics, machine learning, signal processing, and the study of physical or biological continua where local structure, regime shifts, or heteroskedasticity are present.
1. Mathematical Definitions and Motivations
Stationarity for a random field $Z$ requires that the covariance function depends only on the lag $h = s - s'$, i.e. $C(s, s') = C_0(s - s')$. Nonstationarity is defined by the explicit dependence of $C$ on both $s$ and $s'$:
$$C(s, s') = \operatorname{Cov}\big(Z(s), Z(s')\big) \neq C_0(s - s').$$
This structure accommodates physically or statistically meaningful inhomogeneity: spatially varying variance, directional anisotropies, abrupt regime transitions, or locally changing smoothness and correlation range. Examples include environmental variables (temperature, precipitation), geophysical time series (magnetoencephalography, seismic signals), and numerous engineering or real-world sensor contexts (Booth et al., 2023, Lindgren et al., 2021).
The need for nonstationary models arises because stationary assumptions can induce severe mis-specification, leading to miscalibrated uncertainty quantification, poor predictive coverage, and structural artifacts in kriging or surrogate modeling.
2. Principled Covariance Constructions
2.1 Process Convolutions and Nonstationary Matérn Classes
A foundational construction is process convolution. Given a location-varying smoothing kernel $K_s(\cdot)$ and white noise $W$, the process
$$Z(s) = \int K_s(u)\, W(du)$$
has covariance
$$C(s, s') = \int K_s(u)\, K_{s'}(u)\, du,$$
which is manifestly nonstationary and positive-definite (Booth et al., 2023, Nychka et al., 2017). The parametric Paciorek–Schervish covariance generalizes the Matérn class:
$$C(s, s') = \sigma(s)\,\sigma(s')\, \frac{|\Sigma(s)|^{1/4}\, |\Sigma(s')|^{1/4}}{\left|\tfrac{\Sigma(s) + \Sigma(s')}{2}\right|^{1/2}}\, \mathcal{M}_\nu\!\left(\sqrt{Q(s, s')}\right), \qquad Q(s, s') = (s - s')^\top \left(\tfrac{\Sigma(s) + \Sigma(s')}{2}\right)^{-1} (s - s'),$$
where $\Sigma(s)$ is a local anisotropy matrix and $\mathcal{M}_\nu$ is the Matérn correlation with smoothness $\nu$ (Beckman et al., 2022).
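As a concrete illustration, here is a minimal sketch evaluating the Paciorek–Schervish Matérn ($\nu = 3/2$) covariance on a 1D grid; the length-scale field $\ell(s)$ (so that $\Sigma(s) = \ell(s)^2$) and variance field $\sigma(s)$ below are hypothetical choices, not taken from the cited papers.

```python
import numpy as np

def ell(s):
    # Hypothetical length-scale field: longer correlation on the right half.
    return 0.2 + 0.3 * (1.0 + np.tanh(3.0 * (s - 0.5)))

def sigma(s):
    # Hypothetical marginal standard-deviation field.
    return 1.0 + 0.5 * s

def ps_matern32_matrix(s):
    """Paciorek-Schervish nonstationary Matern (nu = 3/2) covariance in 1D.

    With Sigma(s) = ell(s)^2, the normalizer
    |Sigma(s)|^{1/4} |Sigma(t)|^{1/4} / |(Sigma(s)+Sigma(t))/2|^{1/2}
    reduces to sqrt(ell(s) ell(t)) / sqrt((ell(s)^2 + ell(t)^2)/2).
    """
    S, T = np.meshgrid(s, s, indexing="ij")
    ls, lt = ell(S), ell(T)
    avg = 0.5 * (ls**2 + lt**2)
    q = (S - T) ** 2 / avg                # quadratic form Q(s, t)
    d = np.sqrt(3.0 * q)
    matern = (1.0 + d) * np.exp(-d)       # Matern correlation, nu = 3/2
    prefac = np.sqrt(ls * lt) / np.sqrt(avg)
    return sigma(S) * sigma(T) * prefac * matern

s = np.linspace(0.0, 1.0, 200)
K = ps_matern32_matrix(s)
print(np.linalg.eigvalsh(K).min())        # positive up to numerical jitter
```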
2.2 Stochastic PDEs and GMRF Approximations
A powerful alternative represents nonstationary Matérn-type fields as solutions to SPDEs with spatially varying operators:
$$\big(\kappa(s)^2 - \nabla \cdot H(s)\, \nabla\big)^{\alpha/2} \big(\tau(s)\, u(s)\big) = \mathcal{W}(s),$$
with $\kappa(s)$ controlling local range, $H(s)$ a local positive-definite matrix encoding anisotropy, and $\tau(s)$ the local marginal scale (Lindgren et al., 2021, Lindgren et al., 2020, Fuglstad et al., 2013). Discretization via finite elements yields sparse Gaussian Markov random fields (GMRFs), admitting scalable inference and simulation on large domains, curved manifolds, and in spatio-temporal extensions.
Finite-element and Galerkin constructions further generalize to surfaces and more general second-order elliptic operators with spatially inhomogeneous diffusion tensors and potentials (Jansson et al., 2024).
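A minimal 1D sketch of the SPDE-to-GMRF pipeline, using a finite-difference rather than finite-element discretization, with $\alpha = 2$, $H \equiv 1$, $\tau \equiv 1$, and an illustrative $\kappa(s)$ (boundary conditions and scaling constants are glossed over): discretizing $L = \kappa(s)^2 - \mathrm{d}^2/\mathrm{d}s^2$ gives a sparse matrix $A$, and the grid solution of $L u = \mathcal{W}$ has sparse precision $Q \propto A^\top A$.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 400
h = 1.0 / n
s = np.linspace(0.0, 1.0, n)

# Hypothetical spatially varying range parameter: large kappa means short
# correlation range (rough field) near s = 0, long range near s = 1.
kappa = 30.0 - 25.0 * s

# Finite-difference discretization of L = kappa(s)^2 - d^2/ds^2.
lap = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2
A = sp.diags(kappa**2) + lap

# Sparse GMRF precision for alpha = 2 (up to grid-scaling constants).
Q = (A.T @ A).tocsc()
print("precision sparsity:", Q.nnz / n**2)

# Sampling: u = A^{-1} z with z ~ N(0, I) has precision A^T A.
rng = np.random.default_rng(0)
u = spla.spsolve(A.tocsc(), rng.standard_normal(n))
print(u[:5])
```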
2.3 Partition, Mixture, and Deep Models
Axis-aligned partition schemes (Treed GPs), local kriging (laGP), and warped/deep Gaussian process models all induce nonstationarity by allowing regression trees, local neighborhoods, or latent input mappings to alter the local covariance structure (Booth et al., 2023). These approaches are particularly useful where discontinuities or regime shifts are anticipated, or where hierarchical multiscale modeling is appropriate.
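The simplest version of the latent-input-mapping idea can be sketched directly: warping inputs through a nonlinear map $w$ and applying a stationary kernel in the warped coordinates induces a nonstationary kernel $k(x, x') = k_0(w(x), w(x'))$ in the original coordinates. The fixed warp below is a hypothetical stand-in for a mapping that would be learned in a deep or warped GP.

```python
import numpy as np

def warp(x):
    # Hypothetical monotone warp: compresses inputs left of 0.5 and
    # stretches them to the right, so the apparent correlation length
    # is long on the left and short on the right.
    return x + 0.4 * np.tanh(8.0 * (x - 0.5))

def k_stationary(a, b, length=0.2):
    # Squared-exponential base kernel on the warped coordinates.
    return np.exp(-0.5 * (a - b) ** 2 / length**2)

def k_warped(x, y):
    # Induced nonstationary kernel in the original coordinates.
    return k_stationary(warp(x), warp(y))

x = np.linspace(0.0, 1.0, 5)
print(k_warped(x[:, None], x[None, :]))
```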
2.4 Neural Network Parametrizations
Recent advances model kernel hyperparameters (e.g., local variance, range, or noise level) as outputs of neural networks, enabling highly flexible, data-adaptive nonstationary covariance functions while retaining the analytical GP framework (James et al., 16 Jul 2025, Negarandeh et al., 18 Mar 2025). Hyperparameter learning is performed jointly with the GP under the marginal likelihood, supporting both exact and sparse/inducing-point inference on large datasets and yielding state-of-the-art predictive uncertainty and accuracy across a broad range of machine-learning and spatial benchmarking tasks.
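A self-contained sketch of this idea (not the specific architectures of the cited papers): a tiny random-weight MLP maps each location to a positive local length-scale $\ell(x)$, which is plugged into the Gibbs kernel, the 1D squared-exponential special case of the Paciorek–Schervish form. In a real model the network weights would be optimized jointly with the GP under the marginal likelihood.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(16, 1)), rng.normal(size=16)
W2, b2 = rng.normal(size=(1, 16)), rng.normal(size=1)

def lengthscale_net(x):
    """Tiny MLP x -> ell(x) > 0; softplus output keeps it positive.

    Weights are random here purely for illustration; in practice they
    are trained through the GP marginal likelihood.
    """
    h = np.tanh(x[:, None] * W1.T + b1)             # (n, 16) hidden layer
    raw = h @ W2.T + b2                             # (n, 1)
    return np.log1p(np.exp(raw)).ravel() * 0.2 + 0.05

def gibbs_kernel(x):
    """Gibbs (1997) nonstationary kernel with NN-parameterized ell(x)."""
    l = lengthscale_net(x)
    li, lj = l[:, None], l[None, :]
    denom = li**2 + lj**2
    norm = np.sqrt(2.0 * li * lj / denom)
    return norm * np.exp(-((x[:, None] - x[None, :]) ** 2) / denom)

x = np.linspace(0.0, 1.0, 100)
K = gibbs_kernel(x)
print(np.linalg.eigvalsh(K).min())                  # PSD up to jitter
```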
3. Statistical and Geometric Characterization
3.1 Peak Height and Local Maximum Statistics
For a smooth 1D nonstationary field with covariance $C(t, t')$, the peak height distribution at location $t$ depends only on two local parameters obtained from derivatives of $C$ on the diagonal,
$$\lambda_2(t) = \frac{\partial^2 C}{\partial t\,\partial t'}\bigg|_{t'=t}, \qquad \lambda_4(t) = \frac{\partial^4 C}{\partial t^2\,\partial t'^2}\bigg|_{t'=t},$$
which, for a standardized unit-variance field, enter through the ratio $\kappa(t) = \lambda_2(t)/\sqrt{\lambda_4(t)} \in (0, 1]$. The closed-form density of peak heights (from which the distribution of local maxima above any level $u$ follows by truncation) is
$$h_t(x) = \sqrt{2\pi}\,\kappa(t)\,x\,\phi(x)\,\Phi\!\left(\frac{\kappa(t)\,x}{\sqrt{1-\kappa(t)^2}}\right) + \sqrt{1-\kappa(t)^2}\;\phi\!\left(\frac{x}{\sqrt{1-\kappa(t)^2}}\right),$$
where $\phi$ and $\Phi$ are the standard normal pdf and cdf (Zhao et al., 18 Feb 2025, Cheng, 2023).
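As a numerical sanity check of the density above (with fixed, arbitrary values of the local parameter $\kappa(t)$), the following sketch verifies that it integrates to one; as $\kappa \to 1$ it approaches the Rayleigh density.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import trapezoid

def peak_height_density(x, kappa):
    """Density of the height of a local maximum of a smooth unit-variance
    Gaussian process, with local spectral-moment ratio kappa in (0, 1)."""
    r = np.sqrt(1.0 - kappa**2)
    term1 = np.sqrt(2.0 * np.pi) * kappa * x * norm.pdf(x) * norm.cdf(kappa * x / r)
    term2 = r * norm.pdf(x / r)
    return term1 + term2

x = np.linspace(-6.0, 8.0, 20001)
for kappa in (0.3, 0.7, 0.99):
    print(kappa, trapezoid(peak_height_density(x, kappa), x))  # each ~ 1.0
```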
In multidimensional domains and scale-space fields, similar Kac–Rice formulae imply that all relevant peak statistics can be captured via local covariances and their derivatives, greatly simplifying the interpretation and estimation for change-point detection and FWER control in imaging, cosmology, and related applications (Zhao et al., 18 Feb 2025).
3.2 Excursion Set Geometry: Euler Characteristics
The Gaussian Kinematic Formula for a nonstationary smooth unit-variance field $f$ on a manifold $M$ expresses the expected Euler characteristic of the excursion set $A_u = \{s \in M : f(s) \ge u\}$ as
$$\mathbb{E}\big[\chi(A_u)\big] = \sum_{d=0}^{\dim M} \mathcal{L}_d(M)\,\rho_d(u),$$
where the $\mathcal{L}_d(M)$ are Lipschitz–Killing curvatures determined by the local Riemannian metric induced by the covariance, and the Euler characteristic densities $\rho_d(u) = (2\pi)^{-(d+1)/2} H_{d-1}(u)\, e^{-u^2/2}$ for $d \ge 1$ (with $\rho_0(u) = 1 - \Phi(u)$) are built from the Hermite polynomials $H_{d-1}$ (Telschow et al., 2019). Empirical and bootstrap estimators enable consistent inference in arbitrary nonstationary fields, facilitating FWER/cluster inference for thresholded excursion sets.
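A small sketch of the right-hand side of the formula: the Gaussian EC densities $\rho_d(u)$ evaluated via probabilists' Hermite polynomials and combined with user-supplied Lipschitz–Killing curvatures. The LKC values below are arbitrary placeholders; in practice they are estimated from residual fields.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from scipy.stats import norm

def ec_density(u, d):
    """Gaussian EC density rho_d(u), probabilists' Hermite polynomials."""
    if d == 0:
        return norm.sf(u)
    basis = np.zeros(d)                    # coefficient vector selecting He_{d-1}
    basis[d - 1] = 1.0
    return (2.0 * np.pi) ** (-(d + 1) / 2.0) * hermeval(u, basis) * np.exp(-u**2 / 2.0)

def expected_euler_char(u, lkc):
    """E[chi(excursion set at level u)] = sum_d L_d(M) * rho_d(u)."""
    return sum(L * ec_density(u, d) for d, L in enumerate(lkc))

# Placeholder LKCs for a 2D domain: [L_0 = chi(M), L_1, L_2].
lkc = [1.0, 12.5, 40.0]
for u in (2.0, 3.0, 4.0):
    print(u, expected_euler_char(u, lkc))
```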
4. Computational Strategies and Scalability
The cubic $O(n^3)$ scaling of standard GP inference in the number of observations $n$ is a critical limitation for nonstationary models, particularly since the requisite number of hyperparameters is typically much larger. Modern strategies include:
- Sparse/inducing-point approximations: Variational inference with inducing points enables scaling to tens or hundreds of thousands of observations for nonstationary models, as in recent neural network-parametrized GPs (James et al., 16 Jul 2025).
- Block-diagonal + Low-Rank (BDLR) approximations: For spatial models with basis expansions of the spatially varying parameter fields, block partitioning plus Nyström low-rank corrections yield algebraic covariance approximations suitable for stochastic likelihood/Hessian estimation and Newton-like optimization. Empirical results demonstrate practical scaling to large point counts with hundreds of parameters (Beckman et al., 2022); a generic sketch follows this list.
- Multiresolution representations: The LatticeKrig approach fits local stationary covariances on sliding windows and then encodes these into a global multi-level basis (Wendland or spherical wavelets) with sparse GMRF priors at each scale, efficient for simulation and inference on large grids (Nychka et al., 2017).
- Finite-element SPDE solvers: Nonstationary SPDE models discretized with surface or domain-adapted FEMs, possibly with Chebyshev acceleration or eigenfunction expansions, enable time-stepping and strong error bounding even on complex manifolds (Jansson et al., 2024, Lindgren et al., 2020).
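As referenced in the BDLR bullet above, here is a generic sketch of the block-diagonal-plus-low-rank algebra (a Nyström variant under illustrative choices, not the exact scheme of Beckman et al., 2022): a low-rank term from $m$ landmark points captures long-range dependence, and exact diagonal blocks restore local accuracy. Positive-definiteness is preserved because the residual $K - K_{nm} K_{mm}^{-1} K_{mn}$ is itself a (conditional) covariance.

```python
import numpy as np

def kernel(a, b, length=0.3):
    # Stationary base kernel, used here only to generate a test matrix.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0.0, 1.0, 600))
m = 40                                     # number of landmark points
z = np.linspace(0.0, 1.0, m)

Knm = kernel(x, z)
Kmm = kernel(z, z) + 1e-8 * np.eye(m)      # jitter for numerical stability
nystrom = Knm @ np.linalg.solve(Kmm, Knm.T)

# Block-diagonal correction: keep the exact covariance within each block.
K = kernel(x, x)
approx = nystrom.copy()
block = 100
for i in range(0, len(x), block):
    sl = slice(i, i + block)
    approx[sl, sl] = K[sl, sl]

print(np.linalg.norm(K - nystrom) / np.linalg.norm(K),   # low-rank alone
      np.linalg.norm(K - approx) / np.linalg.norm(K))    # BDLR: much tighter
```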
5. Model Selection, Inference, and Applications
Rigorous model selection mandates head-to-head comparison of stationary and nonstationary surrogates via predictive scoring (RMSE, CRPS, log-score), cross-validation, and assessment of uncertainty calibration (Booth et al., 2023). Penalized complexity priors on local range and variance prevent overfitting amid high-dimensional parameterizations (Lindgren et al., 2021, Lindgren et al., 2020).
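For concreteness, all three scores have simple closed forms for Gaussian predictive distributions; the sketch below uses the standard Gaussian CRPS expression (sign and normalization conventions vary across the literature).

```python
import numpy as np
from scipy.stats import norm

def scores(y, mu, sd):
    """RMSE, mean CRPS, and mean negative log score for Gaussian predictions."""
    z = (y - mu) / sd
    rmse = np.sqrt(np.mean((y - mu) ** 2))
    crps = np.mean(sd * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z)
                         - 1.0 / np.sqrt(np.pi)))
    nlog = np.mean(0.5 * np.log(2 * np.pi * sd**2) + 0.5 * z**2)
    return rmse, crps, nlog

# Toy check: held-out targets scored against a predictive mean/sd.
rng = np.random.default_rng(3)
y = rng.normal(0.0, 1.0, 1000)
print(scores(y, mu=np.zeros(1000), sd=np.ones(1000)))
```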
High-impact applications include:
- Climate and environmental modeling: Nonstationary Matérn or SPDE-based fields for global temperature analysis, earth system model emulation, and spatial interpolation in heterogeneous domains with physical barriers (Lindgren et al., 2020, Lindgren et al., 2021, Nychka et al., 2017).
- Neuroimaging and cosmology: Peak statistics and Euler characteristic inference for cluster correction in fMRI and analysis of cosmological fields (Telschow et al., 2019, Zhao et al., 18 Feb 2025).
- Spatio-temporal processes: Diffusion-based or nonseparable spatio-temporal SPDEs, supporting modeling of propagating phenomena and multiscale temporal–spatial correlations (Lindgren et al., 2020).
- Machine learning: Surrogate modeling under heteroskedasticity, local regime identification, and learning complex, structured uncertainties in physical or engineering computer experiments (Booth et al., 2023, Negarandeh et al., 18 Mar 2025, James et al., 16 Jul 2025).
6. Advanced Kernels and Multi-Output Extensions
The harmonizable spectral mixture (HSM) kernel constructs nonstationary processes by parameterizing the spectral bimeasure as a finite Gaussian mixture on $\mathbb{R}^d \times \mathbb{R}^d$, permitting coherent modeling of nonstationarity, locally varying spectral content, and automatic adaptation to stationary/nonstationary structure during kernel learning. Multi-output extensions (MOHSM) generalize this to vector-valued GPs with output-pair-specific cross-spectral densities, enabling learning of cross-channel delays, phase structure, and adaptive local coupling. Empirical results show MOHSM recovers ground-truth covariances for synthetic, financial, and EEG datasets and consistently outperforms stationary MOGP baselines (Altamirano et al., 2022).
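For orientation, a harmonizable covariance admits the Loève spectral representation (a standard form, stated here for reference; the HSM construction takes the bimeasure $S$ to be a finite Gaussian mixture, and stationarity corresponds to $S$ concentrating on the diagonal $\xi = \xi'$):
$$K(s, s') = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} e^{2\pi i\,(\xi^\top s \,-\, \xi'^\top s')}\; S(\mathrm{d}\xi,\, \mathrm{d}\xi').$$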
7. Limitations and Research Directions
- Computational Scaling: While block-diagonal, sparse, or variational approximations address scaling, extremely high-dimensional hyperparameter spaces (e.g., complex neural kernels or dense spatially varying parameterizations) can stress current optimization and memory handling, warranting structured/low-memory surrogate methods (Beckman et al., 2022, James et al., 16 Jul 2025).
- Identifiability and Prior Design: Decoupling of marginal variance, correlation length, and anisotropy remains delicate in spatially varying parameterizations. Structural priors (e.g., on Riemannian metrics or smoothness fields) and alternative model parameterizations are active research areas (Fuglstad et al., 2013, Lindgren et al., 2021).
- Model Averaging and Complexity Penalization: Automated or trans-dimensional schemes remain relatively undeveloped compared to traditional model selection. Explicit regularization and Bayesian model-averaging are underrepresented despite their practical efficacy (Booth et al., 2023).
- Non-Gaussian Generalizations: SPDE-based models for non-Gaussian fields (e.g., Lévy-driven, Student-$t$) and for handling heavy-tailed or rare-event regimes are advancing but remain less tractable than the Gaussian case (Lindgren et al., 2021).
- Scalable Inference for Deep and Compositional Structures: Improvements in MCMC, random feature expansions, and structure-exploiting variational inference are central to further scalability and application breadth (Booth et al., 2023).