Conditional Emulation in Physics

Updated 29 March 2026

Conditional emulation uses surrogate models to approximate complex physical systems efficiently while quantifying the uncertainty of each prediction.
It integrates data-driven and physics-informed machine learning methods, such as Gaussian processes, mixture density networks, and diffusion models.
Applications span climate modeling, turbulent dynamics, and quantum physics, enabling accelerated forward predictions and effective inverse problem solutions.

Conditional emulation in physics encompasses a range of methodologies by which surrogate models, typically grounded in either data-driven or physics-informed machine learning, approximate complex physical systems while allowing efficient, accurate, and uncertainty-quantified queries at arbitrary settings of system parameters or observed quantities. These emulators enable both forward predictions (e.g., simulation of system observables given parameters) and inverse problem solutions (e.g., inference of system states/parameters from data), and can be flexibly conditioned on context, external forcings, subfields, or observed data.

1. Foundational Principles and Formulations

Conditional emulation aims to replace or augment expensive, nonlinear, or otherwise intractable physics-based models with surrogate mappings that can be rapidly evaluated and are efficiently conditioned on input variables or observables. The classical paradigm posits a mapping $F : \mathbb{R}^m \to \mathbb{R}^n$ representing the true physics, and seeks a surrogate $\hat{F}(x)$ or more generally a conditional density $p(y|x)$ that is both accurate and supports meaningful uncertainty quantification.

Bayesian Inverse Problems and Posterior Conditioning

The archetypal setup is Bayesian inference, where one wishes to sample from the posterior $p_{X|Y}(x|y)$, leveraging samples from the joint $p_{X,Y}$ without explicit evaluation of likelihoods or priors. Surrogate models are trained to amortize inference, so new queries can be conditioned on arbitrary $y$ with minimal additional computation (Dasgupta et al., 14 Mar 2026).

Physics-Informed Design

Many approaches incorporate known physics directly into the emulator, e.g., by embedding Taylor expansions (Jin et al., 2023), enforcing PDE-constrained loss functions (Han et al., 11 Feb 2026), or leveraging symmetries (e.g., spherical harmonics for global models (Cachay et al., 2024)).

2. Methodologies for Conditional Emulation

Conditional emulation frameworks in physics can be categorized according to the surrogate model class and the type of conditioning handled:

Gaussian Process (GP) and Co-Kriging Emulators

GP-based emulators, prevalent in Bayesian calibration (e.g., heavy-ion physics (Paquet, 2023); turbulent flow (Mak et al., 2016)), model $f(x)$ as a GP with mean $m(x)$ and kernel $k(x,x')$, yielding a closed-form conditional mean and variance at a new input $x_*$: $\mu_*(x_*) = m(x_*) + K(X,x_*)^T [K(X,X) + \sigma_n^2 I]^{-1} (y - m(X))$ and $\sigma_*^2(x_*) = k(x_*,x_*) - K(X,x_*)^T [K(X,X) + \sigma_n^2 I]^{-1} K(X,x_*)$.

GPs naturally support conditioning on new data, analytic uncertainty quantification, and flexible kernel design, but scale poorly with data size ($O(N^3)$ cost in the number of training samples $N$) and face challenges in high-dimensional or multimodal regimes (Paquet, 2023, Mak et al., 2016).
Co-kriging extends these ideas to functional outputs by projecting high-dimensional simulation data onto reduced bases and modeling the basis coefficients as jointly Gaussian conditioned on geometric or physical parameters (Mak et al., 2016).
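The closed-form GP conditioning described above can be illustrated with a short NumPy sketch. It assumes a squared-exponential kernel, a constant prior mean, and a toy `sin` function standing in for an expensive simulator; the function names `rbf_kernel` and `gp_predict` are illustrative and do not come from the cited works.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel k(x, x') for 1-D input arrays (assumed kernel choice)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_predict(X, y, X_star, noise_var=1e-4, prior_mean=0.0):
    """Posterior mean and variance of the latent function at the query points X_star."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))   # K(X, X) + sigma_n^2 I
    K_star = rbf_kernel(X, X_star)                      # K(X, x_*), shape (N, M)
    L = np.linalg.cholesky(K)                           # the O(N^3) step
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - prior_mean))
    mu = prior_mean + K_star.T @ alpha
    v = np.linalg.solve(L, K_star)
    var = np.diag(rbf_kernel(X_star, X_star)) - np.sum(v ** 2, axis=0)
    return mu, var

# Toy stand-in for a simulator: eight training runs on [0, 5].
X_train = np.linspace(0.0, 5.0, 8)
y_train = np.sin(X_train)
X_query = np.array([2.5, 10.0])   # one interpolation point, one far extrapolation
mu, var = gp_predict(X_train, y_train, X_query)
print(mu, var)
```

In this toy setting the predictive variance stays small between training points and reverts to the prior variance far from the data, which is the behaviour that makes GP emulators attractive for calibration.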

Physics-Informed Mixture Density Networks (MDN)

MDNs model $p(y|x)$ directly as an explicit Gaussian mixture, $p(y|x) = \sum_{k=1}^M \pi_k(x) \mathcal{N}(y \mid \mu_k(x), \Sigma_k(x))$, with mixture weights, means, and covariances parameterized as functions of $x$. Physics enters through regularization of the governing-equation residuals, with distribution-level physics priors of the form

$\mathcal{L}_{\text{phys}} = \mathbb{E}_{x}\sum_{k} \pi_k(x) \left\| \mathcal{R}^{(k)}(x, \mu_k(x)) \right\|^2$

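Here $\mathcal{R}^{(k)}$ denotes the governing-equation residual associated with the $k$-th component, evaluated at its mean $\mu_k(x)$. This formulation enables explicit modeling of multimodal, regime-switching behavior and interpretable enforcement of physics constraints (Han et al., 11 Feb 2026).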
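As a rough illustration of the mixture density and the physics regularizer, the sketch below evaluates both for a scalar output. The governing equation $y^2 = x$, the fixed mixture weights, and the helper names are all hypothetical choices made for this example; a real MDN would produce the weights, means, and variances from a neural network, and the cited work may use component-specific residuals.

```python
import numpy as np

def mixture_pdf(y, weights, means, sigmas):
    """Evaluate p(y|x) = sum_k pi_k N(y | mu_k, sigma_k^2) for scalar y."""
    norm = sigmas * np.sqrt(2.0 * np.pi)
    comps = np.exp(-0.5 * ((y - means) / sigmas) ** 2) / norm
    return float(np.sum(weights * comps))

def physics_loss(xs, weight_fn, mean_fn, residual):
    """Monte Carlo estimate of E_x sum_k pi_k(x) * R(x, mu_k(x))^2."""
    terms = [np.sum(weight_fn(x) * residual(x, mean_fn(x)) ** 2) for x in xs]
    return float(np.mean(terms))

# Hypothetical governing equation y^2 = x with two solution branches.
residual = lambda x, mu: mu ** 2 - x
weights = lambda x: np.array([0.5, 0.5])
exact_means = lambda x: np.array([np.sqrt(x), -np.sqrt(x)])
biased_means = lambda x: np.array([np.sqrt(x) + 0.1, -np.sqrt(x)])

xs = np.linspace(0.5, 2.0, 20)
loss_exact = physics_loss(xs, weights, exact_means, residual)    # essentially zero
loss_biased = physics_loss(xs, weights, biased_means, residual)  # penalized
density_at_branch = mixture_pdf(1.0, weights(1.0), exact_means(1.0),
                                np.array([0.1, 0.1]))
print(loss_exact, loss_biased, density_at_branch)
```

Component means that satisfy the governing equation incur no penalty, while a biased branch is penalized in proportion to its mixture weight, so the regularizer acts on the predictive distribution rather than on a single point estimate.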

Conditional Flow Matching (CFM) and Probability-Flow ODEs

CFM learns the velocity field of a probability-flow ODE that transports a source distribution $q_0$ (e.g., $\mathcal{N}(0,I)$ or data-informed) to the posterior $p_{X|Y}(x|y)$. The key regression loss is

$L(\theta) = \frac{1}{2} \mathbb{E}_{t} \mathbb{E}_{(x,y),z} \left\| v_\theta(t, I_t(z,x), y) - \dot{I}_t(z, x) \right\|^2$

with $I_t(z,x) = (1-t) z + t x$ and $\dot{I}_t(z,x) = x - z$, where $z \sim q_0$ and $t \in [0,1]$. CFM is robust to generic nonlinearities and noise models, requires only samples, and can recover multimodal posteriors (Dasgupta et al., 14 Mar 2026).
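The regression loss can be estimated by Monte Carlo over a batch of joint samples. The sketch below assumes a standard-normal source, uniformly sampled $t$, and a toy linear-Gaussian observation model; the two candidate velocity fields are hand-written placeholders for a trained network $v_\theta$, not learned models.

```python
import numpy as np

def interpolant(t, z, x):
    """I_t(z, x) = (1 - t) z + t x."""
    return (1.0 - t) * z + t * x

def cfm_loss(velocity, x, y, rng):
    """Monte Carlo estimate of the CFM regression loss over a batch of (x, y)."""
    z = rng.standard_normal(x.shape)          # z ~ q_0 = N(0, I) (assumed source)
    t = rng.uniform(size=(x.shape[0], 1))     # t ~ U(0, 1), one draw per sample
    target = x - z                            # time derivative of the interpolant
    pred = velocity(t, interpolant(t, z, x), y)
    return 0.5 * float(np.mean(np.sum((pred - target) ** 2, axis=1)))

rng = np.random.default_rng(0)
x = rng.standard_normal((4000, 2))            # toy prior samples
y = x + 0.1 * rng.standard_normal(x.shape)    # toy noisy observations

# Placeholder velocity fields with the signature v(t, state, y).
zero_velocity = lambda t, u, y: np.zeros_like(u)
toward_observation = lambda t, u, y: y - u

loss_zero = cfm_loss(zero_velocity, x, y, rng)
loss_guess = cfm_loss(toward_observation, x, y, rng)
print(loss_zero, loss_guess)
```

Even the crude guess that points the state toward the observation scores a lower loss than a zero field in this toy problem, which shows how the observation $y$ enters purely as an extra network input.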

Denoising Diffusion Probabilistic Models (DDPM) and Functional Diffusion

Diffusion-based emulators define a forward noising process (an SDE or a discrete Markov chain) and train a score or denoising network that reverses it. Conditioning may be injected by concatenating context (e.g., monthly means (Bassetti et al., 2024), external forcings (Cachay et al., 2024), partial observables (Long et al., 2024)) at every step or block.

Arbitrarily-conditioned models (e.g., ACM-FD) employ random-masking during training to enable query-time selection of any subset of functions/fields for conditioning (Long et al., 2024).
DiffESM and related models extend DDPMs to global spatiotemporal fields, conditioning daily sequences on monthly-mean or other aggregate statistics (Bassetti et al., 2024, Bassetti et al., 2023).
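A generic way to picture random-mask conditioning in a DDPM-style model is to noise only the entries that must be generated and leave the conditioned entries clean. The sketch below is an assumption-laden illustration of that idea with a linear noise schedule; it is not the ACM-FD or DiffESM training procedure, and all function names are hypothetical.

```python
import numpy as np

def make_alpha_bar(num_steps=100, beta_min=1e-4, beta_max=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed schedule)."""
    betas = np.linspace(beta_min, beta_max, num_steps)
    return np.cumprod(1.0 - betas)

def masked_forward_noising(x0, t, alpha_bar, cond_mask, rng):
    """Noise the free entries at step t; conditioned entries keep their clean values."""
    eps = rng.standard_normal(x0.shape)
    noisy = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    x_t = np.where(cond_mask, x0, noisy)
    return x_t, eps

def masked_denoising_loss(eps_pred, eps, cond_mask):
    """Mean squared noise-prediction error over the free (unconditioned) entries."""
    free = ~cond_mask
    return float(np.sum(free * (eps_pred - eps) ** 2) / max(int(free.sum()), 1))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 16))              # 3 fields on a 16-point grid
field_is_observed = rng.random(3) < 0.5        # a fresh random mask per training draw
cond_mask = np.broadcast_to(field_is_observed[:, None], x0.shape)
alpha_bar = make_alpha_bar()
x_t, eps = masked_forward_noising(x0, 50, alpha_bar, cond_mask, rng)
```

Because the mask is resampled for every training example, a single network sees many combinations of observed and unobserved fields, which is what allows the conditioning set to be chosen at query time.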

Attention and Set-Conditional Models

For permutation-invariant data (e.g., unordered sets in high-energy physics), set-conditional emulation utilizes permutation-equivariant networks, graph neural nets for feature extraction, and slot-attention decoders for generating output sets, conditioned on input sets of arbitrary size and composition (Bello et al., 2022).
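The permutation-invariance requirement can be shown with a much simpler stand-in than a graph network or slot-attention decoder: a sum-pooled encoder in the style of Deep Sets. The weights below are random and untrained, and `encode_set` is a hypothetical name used only for this sketch.

```python
import numpy as np

def encode_set(points, W_phi, W_rho):
    """Permutation-invariant encoding rho(sum_i phi(x_i)) of a set of feature vectors."""
    h = np.tanh(points @ W_phi)        # per-element features phi(x_i)
    pooled = h.sum(axis=0)             # symmetric pooling removes ordering
    return np.tanh(pooled @ W_rho)

rng = np.random.default_rng(0)
W_phi = rng.standard_normal((4, 8))
W_rho = rng.standard_normal((8, 3))

particles = rng.standard_normal((5, 4))          # a set of 5 particles, 4 features each
shuffled = particles[rng.permutation(5)]
same = np.allclose(encode_set(particles, W_phi, W_rho),
                   encode_set(shuffled, W_phi, W_rho))
larger_set_code = encode_set(rng.standard_normal((9, 4)), W_phi, W_rho)
print(same, larger_set_code.shape)
```

The same encoder accepts sets of any size, so the resulting context vector can condition a downstream generator regardless of how many input elements an event contains.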

Reduced-Order Models and Eigenvector Continuation

Physics emulators for quantum and wave systems exploit variational functionals and reduced-basis expansions (e.g., eigenvector continuation), conditioning on parameters $\theta$ . Training uses solutions at a sparse set of parameter-space points and projects interpolatively or via rational approximation to new settings, with strong theoretical error control (Zhang et al., 2021, Zhang, 2024).
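A minimal sketch of eigenvector continuation for a bound-state problem follows, assuming an affine Hermitian Hamiltonian $H(\theta) = H_0 + \theta H_1$ built from a random toy matrix. The continuum and non-Hermitian variants discussed in the cited works are more involved; the names `ground_state` and `build_ec_emulator` are illustrative only.

```python
import numpy as np

def ground_state(H):
    """Lowest eigenvalue and eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(H)
    return vals[0], vecs[:, 0]

def build_ec_emulator(H0, H1, train_thetas):
    """Reduced-basis emulator built from ground states at the training parameters."""
    snapshots = np.column_stack(
        [ground_state(H0 + th * H1)[1] for th in train_thetas])
    Q, _ = np.linalg.qr(snapshots)               # orthonormalize the snapshot basis
    h0, h1 = Q.T @ H0 @ Q, Q.T @ H1 @ Q          # project once, offline
    def predict(theta):
        # Online stage: diagonalize a tiny matrix instead of the full Hamiltonian.
        return float(np.linalg.eigvalsh(h0 + theta * h1)[0])
    return predict

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
H0 = np.diag(np.arange(n, dtype=float))          # toy unperturbed spectrum
H1 = 0.1 * (A + A.T) / 2.0                       # toy symmetric perturbation
emulate = build_ec_emulator(H0, H1, train_thetas=[0.0, 0.5, 1.0])

theta = 0.25
exact = ground_state(H0 + theta * H1)[0]
approx = emulate(theta)
print(exact, approx)
```

Since the emulator is a Rayleigh-Ritz projection, its ground-state energy in this Hermitian setting is an upper bound on the exact value and reproduces the training points to numerical precision.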

3. Conditioning Mechanisms and Training Strategies

Conditional Parameterization

Conditioning is realized through learnable attention mechanisms (e.g., softmax-weighted linearizations in PETAL (Jin et al., 2023)), concatenation or FiLM layers for context variables (e.g., source parameters, observables, time indices), or auxiliary classifier-free guidance in diffusion models (Nathaniel et al., 19 Apr 2025, Bassetti et al., 2023).
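Two of the mechanisms named above, concatenation and FiLM, reduce to a few lines each. The sketch below uses linear maps for the FiLM scale and shift and random placeholder weights; the layer sizes and the near-identity initialization of the scale are assumptions for illustration.

```python
import numpy as np

def film(h, context, W_gamma, W_beta):
    """Feature-wise linear modulation: gamma(c) * h + beta(c)."""
    gamma = 1.0 + context @ W_gamma     # scale, equal to 1 when W_gamma is zero
    beta = context @ W_beta             # shift
    return gamma * h + beta

def concat_condition(h, context):
    """Simplest alternative: append the context to the feature vector."""
    return np.concatenate([h, context], axis=-1)

rng = np.random.default_rng(0)
h = rng.standard_normal((4, 6))          # batch of 4 samples, 6 hidden features
context = rng.standard_normal((4, 2))    # e.g. two conditioning parameters per sample
W_gamma = 0.1 * rng.standard_normal((2, 6))
W_beta = 0.1 * rng.standard_normal((2, 6))

modulated = film(h, context, W_gamma, W_beta)
stacked = concat_condition(h, context)
print(modulated.shape, stacked.shape)
```

Concatenation changes the width of the layer input, whereas FiLM keeps the feature shape fixed and lets the context rescale and shift each feature, which is convenient when the same conditioning has to be injected at many depths.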

Table: Representative Conditioning Approaches

Model Type	Conditioning Mechanism	Example Paper
GP/Co-Kriging	Input features in kernel	(Mak et al., 2016)
MDN	Input $x$ in all component maps	(Han et al., 11 Feb 2026)
DDPM/Score Model	Context input/concat, FiLM, mask	(Long et al., 2024)
CFM	Observation $y$ as network input	(Dasgupta et al., 14 Mar 2026)
GAN	Latent vector norm, input concat	(Perraudin et al., 2020)
Set-Based GNN	Pooling and slot-attention	(Bello et al., 2022)

Training Losses and Regularization

Task-specific losses vary:

Negative log-likelihood or mean squared error for deterministic/probabilistic surrogates (Licata et al., 2022).
Physics-based regularization for enforcing PDE or conservation law consistency (Han et al., 11 Feb 2026).
Distribution-level score matching for diffusion/score-based models (Long et al., 2024).

Physics knowledge may be imposed as direct regularization, post-hoc correction, or via explicit architecture design (e.g., Spherical Fourier operators for global climate (Cachay et al., 2024)).
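The relationship between the mean-squared-error and Gaussian negative-log-likelihood objectives, and the way a physics penalty is typically added, can be sketched as follows. The weighting factor in `total_loss` is a hypothetical hyperparameter, not a value taken from the cited works.

```python
import numpy as np

def mse(y, mu):
    """Mean squared error for a deterministic surrogate."""
    return float(np.mean((y - mu) ** 2))

def gaussian_nll(y, mu, log_var):
    """Average negative log-likelihood under N(mu, exp(log_var)) per output."""
    return float(0.5 * np.mean(
        log_var + (y - mu) ** 2 / np.exp(log_var) + np.log(2.0 * np.pi)))

def total_loss(data_term, physics_term, weight=0.1):
    """Data misfit plus a weighted physics regularizer (weight is an assumed value)."""
    return data_term + weight * physics_term

rng = np.random.default_rng(0)
y = rng.normal(0.0, 2.0, size=5000)      # toy targets with standard deviation 2
mu = np.zeros_like(y)

nll_unit = gaussian_nll(y, mu, np.zeros_like(y))                # variance fixed to 1
nll_matched = gaussian_nll(y, mu, np.full_like(y, np.log(4.0)))  # variance set to 4
print(nll_unit, nll_matched)
```

With unit variance the NLL equals half the MSE plus a constant, so the two objectives share a minimizer in the mean; letting the variance match the spread of the data lowers the NLL further, which is how a probabilistic surrogate is rewarded for honest uncertainty.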

4. Uncertainty Quantification and Validation

Uncertainty quantification is central. GP/co-kriging models provide analytic posterior variance, with co-kriging capturing inter-field covariances (Mak et al., 2016). MDNs yield conditional variance and explicit multimodality in the predictive distribution (Han et al., 11 Feb 2026). Diffusion and flow-based models furnish sampled conditional distributions; ensembles are used to calibrate predictive mean and variance (Licata et al., 2022, Nathaniel et al., 19 Apr 2025).
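For sample-based emulators, predictive moments and calibration are usually estimated from an ensemble of draws. The sketch below assumes synthetic Gaussian ensembles and checks the empirical coverage of a central 90% interval; the function names are illustrative.

```python
import numpy as np

def ensemble_summary(samples):
    """Mean and unbiased variance over members; samples has shape (members, points)."""
    return samples.mean(axis=0), samples.var(axis=0, ddof=1)

def interval_coverage(samples, truth, level=0.9):
    """Fraction of true values falling inside the central predictive interval."""
    lo = np.quantile(samples, (1.0 - level) / 2.0, axis=0)
    hi = np.quantile(samples, 1.0 - (1.0 - level) / 2.0, axis=0)
    return float(np.mean((truth >= lo) & (truth <= hi)))

rng = np.random.default_rng(0)
truth = rng.standard_normal(2000)                       # synthetic "observations"
calibrated = rng.standard_normal((200, 2000))           # spread matches the truth
overconfident = 0.5 * rng.standard_normal((200, 2000))  # spread is too narrow

mean, var = ensemble_summary(calibrated)
cov_cal = interval_coverage(calibrated, truth)
cov_over = interval_coverage(overconfident, truth)
print(cov_cal, cov_over)
```

A well-calibrated ensemble covers the truth at close to the nominal rate, while an overconfident one falls clearly short, so coverage gives a quick diagnostic that complements the predictive mean and variance.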

Validation relies on physics-aware metrics. For example:

Moment and extreme-event statistics (e.g., 90th percentile extremes (Bassetti et al., 2023, Bassetti et al., 2024)).
Kolmogorov–Smirnov tests, autocorrelation, and spectral error (Nathaniel et al., 19 Apr 2025).
Event-level similarity (Wasserstein distances, MS-SSIM, MMD between sets) (Perraudin et al., 2020, Bello et al., 2022).
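Several of the distributional metrics listed above are straightforward to compute for one-dimensional samples. The sketch below gives simple NumPy versions of a two-sample Kolmogorov–Smirnov statistic, a 1-D Wasserstein-1 distance for equal-size samples, and a 90th-percentile error, applied to synthetic data; it is meant as an illustration rather than a replacement for library implementations.

```python
import numpy as np

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    grid = np.sort(np.concatenate([a, b]))
    cdf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
    cdf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def wasserstein_1d(a, b):
    """W1 distance between equal-size 1-D samples via sorted matching."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

def p90_error(emulated, reference):
    """Difference in the 90th percentile, a simple extreme-event statistic."""
    return float(np.quantile(emulated, 0.9) - np.quantile(reference, 0.9))

rng = np.random.default_rng(0)
reference = rng.standard_normal(2000)        # stand-in for simulator output
good = rng.standard_normal(2000)             # emulator with the right distribution
shifted = reference + 0.5                    # emulator with a systematic bias

ks_good, ks_shifted = ks_statistic(good, reference), ks_statistic(shifted, reference)
w1_shifted = wasserstein_1d(shifted, reference)
tail_bias = p90_error(shifted, reference)
print(ks_good, ks_shifted, w1_shifted, tail_bias)
```

In practice these statistics would be evaluated per grid cell, per variable, or per event class, and compared against the sampling variability of the reference simulation itself.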

5. Applications Across Physics Domains

Conditional emulators have been demonstrated in a wide spectrum of physics domains:

Inverse Problems: PETAL achieves accurate recovery of ocean sound speed profiles from acoustic travel times by blending physics-grounded local linearizations (Jin et al., 2023); conditional flow matching solves high-dimensional Bayesian inverse problems for nonlinear and non-differentiable forward models (Dasgupta et al., 14 Mar 2026).
Climate and Weather: Diffusion models (DiffESM, DYffusion) emulate daily climate fields or 6-h global climate states conditioned on low-cost model output or initial/forcing states, yielding extreme-event probabilities and centennial time-scale ensemble skill (Cachay et al., 2024, Bassetti et al., 2024, Bassetti et al., 2023, Long et al., 2024).
Turbulent and Chaotic Dynamics: Cohesion integrates deterministic reduced-order modeling (deep Koopman operators) as a prior over coarse structure, guiding conditional diffusion for stable and physically consistent long-horizon forecasts in flows and geophysical turbulence (Nathaniel et al., 19 Apr 2025).
Quantum and Continuum Physics: Variational reduced-basis emulators and non-Hermitian projections enable ultra-fast, accurate emulation of scattering and continuum observables over parameter or energy grids (Zhang et al., 2021, Zhang, 2024).
High-Energy Physics: Set-conditional emulation and implicit quantile models dramatically accelerate event simulation, e.g., in LHC jet and particle reconstruction (Kronheim et al., 2023, Bello et al., 2022).
Cosmology: Conditional GANs allow instantaneous synthesis of weak-lensing mass maps at arbitrary cosmological parameters, capturing both mean and higher-order statistical structure (Perraudin et al., 2020).
Complex Multiphysics Systems: ACM-FD handles forward, inverse, and arbitrary conditioning queries across coupled PDE systems—Darcy flow, reaction-convection, torus vorticity—by random-masked diffusion in GP function space (Long et al., 2024).

6. Limitations and Future Directions

Current methods exhibit several limitations:

GPs and co-kriging emulators face scalability challenges and may underrepresent non-Gaussian or multimodal features (Mak et al., 2016, Paquet, 2023).
Conditional flow matching may suffer from variance collapse or selective memorization with finite data; early stopping is essential (Dasgupta et al., 14 Mar 2026).
Physics-informed MDNs require judicious choice of component regularization and may only approximate skewness or heavy tails with Gaussian components (Han et al., 11 Feb 2026).
Diffusion models for high-resolution or long-horizon fields require careful architecture scaling (e.g., efficient Kronecker product structures, progressive distillation (Long et al., 2024)).
Real-world deployment, particularly in the presence of model discrepancy, sensor noise, and data scarcity, remains an area of active investigation (Jin et al., 2023, Bassetti et al., 2024).

Emerging topics include joint emulation across variable groups, integration of multifidelity simulation, adaptive reference/conditioning selection (Jin et al., 2023), efficient uncertainty quantification, and theoretical development of optimal reduced-basis selection in continuum models (Zhang, 2024).

7. Summary and Outlook

Conditional emulation in physics has progressed from GP-based surrogates and reduced-basis approaches to a diverse repertoire of deep generative and operator-based models. These frameworks exploit conditioning—on system parameters, measurements, or partial observables—either through architecture design (attention, masking, pooling), loss-based regularization, or composition with fast physics-inspired priors. Common to all is the shift from deterministic or point-estimate prediction to flexible, uncertainty-aware, and often stochastic conditional generation that respects the underlying physics. The resulting tools deliver dramatic acceleration and scalability for simulation, calibration, and inference across physics, with ongoing innovation toward more expressivity, generality, and physical interpretability (Jin et al., 2023, Long et al., 2024, Dasgupta et al., 14 Mar 2026, Bassetti et al., 2024, Cachay et al., 2024, Han et al., 11 Feb 2026, Nathaniel et al., 19 Apr 2025, 1602.10451).