
Gaussian Processes in Approximate Bayesian Computation

Updated 29 November 2025
  • Gaussian Processes are nonparametric priors that, when combined with ABC, model discrepancies and synthetic likelihoods to efficiently approximate intractable likelihoods.
  • The integration of GP surrogates supports uncertainty quantification and adaptive design, significantly reducing the number of expensive simulator evaluations in applications like dynamical systems and genetics.
  • Practical guidelines emphasize careful kernel selection, hyperparameter tuning, and monitoring surrogate performance to maintain reliable posterior approximations in ABC frameworks.

Gaussian Processes (GPs) combined with Approximate Bayesian Computation (ABC) constitute an advanced suite of methodologies for likelihood-free Bayesian inference, targeting simulation-based models where explicit evaluation of the likelihood $p(y \mid \theta)$ is either intractable or computationally prohibitive. By constructing surrogate models for key quantities such as the discrepancy, likelihood, or summary statistics, GPs enable dramatic reductions in simulator calls, provide uncertainty quantification, and facilitate automatic acquisition and decision strategies in ABC pipelines.

1. Fundamentals of Gaussian Processes in ABC

A Gaussian Process is a nonparametric prior over functions, characterized by a mean function $m(\theta)$ and a covariance kernel $k(\theta, \theta')$. Within ABC, GPs are most commonly deployed in three ways:

  • Discrepancy Surrogates: Modeling the distance between simulator outputs and observed data, $\Delta(\theta) = d(S(\text{sim}(\theta)), S_{\text{obs}})$, as a GP allows for probabilistic predictions of ABC acceptance and rejection (Järvenpää et al., 2016, Järvenpää et al., 2019, Cao et al., 13 Apr 2024).
  • Synthetic Likelihood Surrogates: Modeling the unknown log-likelihood $\ell(\theta) = \log \Pr(d(s(\text{sim}(\theta)), s(y)) \le \varepsilon \mid \theta)$ via a GP for accelerated ABC-MCMC (Wilkinson, 2014, Meeds et al., 2014, Järvenpää et al., 2021).
  • Trajectory and Derivative Modeling: In ODE/DDE parameter inference, a GP fitted to noisy trajectories permits direct matching of empirical derivatives to instantaneous model vector fields, bypassing costly numerical integration (Ghosh et al., 2015).

The ABC posterior, with kernel threshold $\varepsilon$, is typically expressed as $\pi_{ABC}(\theta) \propto \pi(\theta) \Pr[\Delta(\theta) \le \varepsilon]$, where the probability term can be efficiently approximated or analytically integrated via the GP surrogate.
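
For orientation, here is a minimal rejection-ABC loop of the kind these surrogates accelerate; simulate, summary, and prior_sample are hypothetical stand-ins for a user-supplied model:

import numpy as np

def rejection_abc(simulate, summary, s_obs, prior_sample, eps, n_draws, seed=0):
    """Accept theta whenever the summary-statistic discrepancy is below eps."""
    rng = np.random.default_rng(seed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)                   # draw from the prior pi(theta)
        delta = np.linalg.norm(summary(simulate(theta, rng)) - s_obs)
        if delta <= eps:                            # Delta(theta) <= epsilon
            accepted.append(theta)
    return np.asarray(accepted)                     # draws from pi_ABC

Every iteration costs one simulator call; the GP-based methods below replace most of these calls with cheap surrogate predictions.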

2. GP Surrogates for Discrepancy and Synthetic Likelihood

Discrepancy Modeling

Given observations $\{(\theta_i, \delta_i)\}$, the GP provides for any new input $\theta^*$:

$\delta(\theta^*) \sim \mathcal{N}(\mu^*(\theta^*), \sigma^{2,*}(\theta^*))$

where predictive mean and variance use standard GP regression formulas.

The acceptance probability under the GP is (Järvenpää et al., 2016):

$U(\theta^*) = \Phi\left(\frac{\varepsilon - \mu^*(\theta^*)}{\sqrt{\sigma^{2,*}(\theta^*) + \sigma_n^2}}\right)$

This enables either weighted prior sampling or analytical calculation of the model-based ABC posterior $\pi_{ABC}(\theta) \propto \pi(\theta)\, U(\theta)$ (Järvenpää et al., 2019, Järvenpää et al., 2017).
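
A minimal scikit-learn sketch of this acceptance probability, assuming synthetic placeholder training pairs $(\theta_i, \delta_i)$ and an illustrative noise level $\sigma_n$; none of the specifics come from the cited papers:

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
sigma_n = 0.1                                       # assumed discrepancy noise s.d.
thetas = rng.uniform(-3, 3, size=(50, 1))           # evaluated parameters theta_i
deltas = thetas[:, 0] ** 2 + sigma_n * rng.normal(size=50)  # placeholder delta_i

gp = GaussianProcessRegressor(kernel=RBF(1.0), alpha=sigma_n**2).fit(thetas, deltas)

def acceptance_prob(theta_grid, eps):
    """U(theta) = Phi((eps - mu*) / sqrt(sigma^{2,*} + sigma_n^2)) under the GP."""
    mu, sd = gp.predict(theta_grid, return_std=True)   # predictive mean and s.d.
    return norm.cdf((eps - mu) / np.sqrt(sd**2 + sigma_n**2))

grid = np.linspace(-3, 3, 200)[:, None]
weights = acceptance_prob(grid, eps=1.0)            # pi_ABC up to the prior factor

Multiplying these weights by the prior density on the same grid gives the unnormalized model-based ABC posterior without further simulator calls.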

Synthetic Log-Likelihood GP

For the likelihood-free case, surrogates for summary statistics or the log-likelihood allow direct emulation of the Metropolis-Hastings (MH) accept step (Wilkinson, 2014, Järvenpää et al., 2021, Meeds et al., 2014), as sketched after this list:

  • Place a GP prior on the simulated summary statistics or directly on the log-likelihood.
  • Use GP posterior variance to construct acceptance probabilities or error-aware thresholds.
  • Implement sequential or batch strategies to update the GP, targeting acquisition of informative simulation evaluations.
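A minimal sketch of the surrogate MH step, assuming gp_loglik is a regressor with a scikit-learn-style predict interface, already fit to pairs of $\theta$ and estimated log-likelihood; all names are illustrative rather than taken from the cited papers:

import numpy as np

def surrogate_mh_step(theta, gp_loglik, log_prior, proposal_sd, rng):
    """One MH step where the GP posterior mean stands in for the log-likelihood."""
    theta_star = theta + proposal_sd * rng.normal(size=theta.shape)  # q(. | theta)
    ll_star, sd_star = gp_loglik.predict(theta_star[None, :], return_std=True)
    ll_cur, sd_cur = gp_loglik.predict(theta[None, :], return_std=True)
    log_alpha = (ll_star[0] + log_prior(theta_star)) - (ll_cur[0] + log_prior(theta))
    # sd_star / sd_cur quantify surrogate error; error-aware variants acquire
    # more simulations, rather than accept or reject, when these are too large.
    return theta_star if np.log(rng.uniform()) < log_alpha else theta
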

Derivative Matching in DE Models

By fitting a GP to observed time-series data, the empirical derivatives $v^d(t)$ are inferred; model vector fields $f(\hat{x}(t), \theta)$ are matched in ABC by a “derivative-space distance”:

$\Delta(v^d, v^s) = \sum_{i=1}^{L} \left\| v^d(t_i) - f(\hat{x}(t_i), \theta) \right\|^2$

This obviates trajectory simulation for each candidate $\theta$, yielding orders-of-magnitude computational gains (Ghosh et al., 2015).
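
The sketch below illustrates the idea on an assumed toy logistic-growth model $\dot{x} = \theta\, x(1-x)$: fit a GP to a noisy trajectory, numerically differentiate its posterior mean to obtain $v^d$, and score candidate $\theta$ by the derivative-space distance:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

t = np.linspace(0.0, 10.0, 40)
x_true = 1.0 / (1.0 + np.exp(-(t - 5.0)))           # logistic curve, theta = 1
x_noisy = x_true + 0.02 * np.random.default_rng(2).normal(size=t.size)

gp = GaussianProcessRegressor(kernel=RBF(2.0), alpha=0.02**2).fit(t[:, None], x_noisy)
x_hat = gp.predict(t[:, None])                      # smoothed trajectory x_hat(t)
v_d = np.gradient(x_hat, t)                         # empirical derivatives v^d(t_i)

def derivative_distance(theta):
    """Sum of squared gaps between v^d(t_i) and f(x_hat(t_i), theta)."""
    return np.sum((v_d - theta * x_hat * (1.0 - x_hat)) ** 2)

# derivative_distance(1.0) is near zero; no ODE solver is ever invoked.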

3. Acquisition, Batch Design, and Early Rejection Strategies

Sequential and Batch Acquisition Rules

Efficient experimental design, exploiting GP surrogate uncertainty, is critical:

  • Expected Integrated Variance (EIV): Select new $\theta$ to minimize the posterior variance of the ABC estimator (Järvenpää et al., 2017, Järvenpää et al., 2019).
  • Batch Bayesian Experimental Design: Parallelize expensive simulator evaluations via greedy construction or loss minimization for batches $\{\theta^*_1, \ldots, \theta^*_b\}$, enabling linear wall-time speedup (Järvenpää et al., 2019).
  • Early Rejection ejMCMC: GP-predicted quantile bounds $h_a(\theta)$ on the discrepancy enable up-front rejection of hopeless proposals, yielding 20–80% fewer simulator calls versus standard ABC-MCMC (Cao et al., 13 Apr 2024); see the sketch below.

# ejMCMC sketch: a GP lower-quantile bound on the discrepancy screens proposals
# before any simulator call; helper names (propose, gp_lower_quantile, simulate,
# discrepancy, mh_accept) are illustrative stand-ins, not a library API.
theta = theta_init
for _ in range(N):
    theta_star = propose(theta)                     # draw from q(. | theta)
    if gp_lower_quantile(theta_star) > epsilon:     # stage 1: GP bound h_a(theta*)
        continue                                    # early reject, no simulation
    x_star = simulate(theta_star)                   # stage 2: expensive simulator call
    if discrepancy(x_star) <= epsilon and mh_accept(theta_star, theta):
        theta = theta_star                          # standard ABC-MCMC accept/reject

Adaptive Confidence and Error Control in ABC-MCMC

  • GPS-ABC uses Monte Carlo samples from the GP posterior to construct acceptance probabilities, proceeding only when the estimated error satisfies $E \le \xi$ (Meeds et al., 2014).
  • Experimental design is adaptive, with GP hyperparameters tuned upon addition of new simulation data.
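
A simplified acquisition sketch in this spirit, scoring candidates by surrogate uncertainty near the acceptance boundary; this heuristic is a crude stand-in for the exact EIV criterion, intended only to illustrate the adaptive loop:

import numpy as np
from scipy.stats import norm

def next_design_point(gp, candidates, eps, sigma_n):
    """Pick the candidate theta where the GP is most uncertain about acceptance."""
    mu, sd = gp.predict(candidates, return_std=True)
    u = norm.cdf((eps - mu) / np.sqrt(sd**2 + sigma_n**2))  # acceptance probability
    score = u * (1.0 - u) * sd       # high near U ~ 0.5 and high GP variance
    return candidates[np.argmax(score)]

After each new simulation at the chosen point, the GP is refit (hyperparameters included) before the next acquisition, mirroring the adaptive design described above.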

4. Kernel Selection and Model Comparison

GP kernel choice critically impacts reconstruction fidelity. Zhang et al. (2023) systematically compare RBF, Cauchy, and Matérn ($\nu = 5/2, 7/2, 9/2$) kernels in cosmological GP reconstruction under two inference schemes:

  • ABC Rejection gave moderate to strong Bayes-factor preference for Matérn 5/2 over RBF across datasets (CC, SNIa, GRB), but nested sampling of the exact log-marginal likelihood showed inconclusive evidence and sometimes reversed the ranking.
  • Interpretation: ABC-based selection is sensitive to the chosen discrepancy metric, potentially exaggerating differences; fully marginalized model evidences may be less discriminative.

Kernel selection strategies extend to model-based GP surrogate ABC: cross-validation and expected utility maximization are used to automate kernel choice (Järvenpää et al., 2016).
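
As an illustration of automated kernel choice, the sketch below ranks RBF against Matérn 5/2 by GP log marginal likelihood on placeholder data; the data, length scales, and noise levels are assumptions for demonstration:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel

rng = np.random.default_rng(3)
X = rng.uniform(0.0, 1.0, size=(40, 1))
y = np.sin(6.0 * X[:, 0]) + 0.1 * rng.normal(size=40)

candidates = {
    "RBF": RBF(0.2) + WhiteKernel(0.01),
    "Matern 5/2": Matern(0.2, nu=2.5) + WhiteKernel(0.01),
}
for name, kernel in candidates.items():
    gp = GaussianProcessRegressor(kernel=kernel, n_restarts_optimizer=3).fit(X, y)
    print(name, gp.log_marginal_likelihood(gp.kernel_.theta))  # higher is better

Cross-validated predictive scores can replace the marginal likelihood here, matching the expected-utility strategies cited above.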

5. Applications: Differential Equations, Genetics, Cosmology, Densities

Parameter Inference in Dynamical Models

  • GP-ABC-SMC achieves reliable ODE/DDE parameter posteriors in <30 s, versus 10²–10⁶ s for standard ABC, with only minor loss of credible interval precision (Ghosh et al., 2015).
  • Biochemical network, delay blowfly, and gene-transfer models demonstrate high simulation savings, accurate marginal recovery, and robust posterior-predictive checks (Ghosh et al., 2015, Cao et al., 13 Apr 2024, Järvenpää et al., 2016).

Hierarchical Density Modeling via GP Priors

Functional regression ABC with hierarchical GP priors allows nonparametric estimation of grouped densities, learning shrinkage relations and correcting KDE bias robustly (Rodrigues et al., 2014).

Cosmological and Population-Genetic Inference

ABC-GP surrogates reconstruct cosmic histories from chronometer, SNIa, and GRB datasets, allowing rigorous comparison of kernel choices for GP reconstructions (Zhang et al., 2023). In genetics, GP surrogates accelerate ABC on species divergence and horizontal gene transfer scenarios, yielding 10–100× simulation savings (Wilkinson, 2014, Järvenpää et al., 2016).

6. Computational Performance, Scalability, and Limitations

GP surrogate ABC methods scale cubically in the number of training points ($O(t^3)$), but this overhead is negligible relative to expensive simulator calls when $t \lesssim 1000$. Batch acquisition and parallel evaluation further reduce wall-clock times (Järvenpää et al., 2019). Limitations include:

  • Breakdown for highly multimodal or discontinuous response surfaces (Wilkinson, 2014, Järvenpää et al., 2016).
  • Need for careful monitoring of surrogate fit and acquisition function efficacy.
  • Failure to recover multimodal ABC posteriors when relying entirely on GP surrogates (see bimodal examples in Cao et al., 13 Apr 2024).
  • Hyperparameter uncertainty is typically ignored, so uncertainty bands may under-cover (Järvenpää et al., 2019).

7. Theoretical Guarantees and Practical Guidelines

Detailed balance and posterior consistency are maintained in GP-augmented Metropolis-Hastings schemes under regularity conditions (Järvenpää et al., 2021, Cao et al., 13 Apr 2024, Meeds et al., 2014). Total variation error in stationary distributions can be controlled via per-step error thresholds (Meeds et al., 2014).

Practical guidelines include:

  • Choose kernels deliberately and compare candidates (e.g., via cross-validation or marginal likelihood), since kernel choice critically affects surrogate fidelity.
  • Re-tune GP hyperparameters whenever new simulation data are acquired, and note that ignoring hyperparameter uncertainty can produce under-covering uncertainty bands.
  • Monitor surrogate fit and acquisition-function efficacy throughout the run, falling back to additional simulations where the surrogate is unreliable.
  • Validate the resulting posterior with posterior-predictive checks, especially when multimodality is suspected, since pure GP surrogates can miss modes.

References: Ghosh et al. (2015); Wilkinson (2014); Meeds et al. (2014); Järvenpää et al. (2016); Rodrigues et al. (2014); Järvenpää et al. (2017); Järvenpää et al. (2021); Järvenpää et al. (2019); Cao et al. (13 Apr 2024); Zhang et al. (2023).
