Bayesian Uncertainty Quantification via Stochastic Sampling
- Bayesian uncertainty quantification is a framework that rigorously distinguishes between epistemic (parametric) and aleatory (intrinsic) uncertainty in stochastic models.
- It integrates dimensionality reduction via Karhunen–Loève, non-intrusive polynomial chaos expansion, and Bayesian Gaussian process regression to build efficient surrogates.
- Applications, such as stochastic epidemic models, demonstrate its ability to accurately capture risk profiles at substantially lower computational cost than traditional Monte Carlo methods.
Bayesian uncertainty quantification from stochastic sampling is a paradigm for rigorously separating, propagating, and analyzing uncertainty in computational models whose outputs depend on both unknown inputs (parametric/epistemic uncertainty) and intrinsic randomness (aleatory uncertainty). This approach is especially salient in the context of stochastic simulation, where random sampling from uncertain parameters and process noise jointly determine the variability of quantities of interest. The modern Bayesian framework leverages posterior probability distributions, multidimensional decompositions, and surrogate modeling via Gaussian processes and related methods to efficiently characterize uncertainty while distinguishing its sources.
1. Separation of Intrinsic and Parametric Uncertainty
Quantitative prediction of outputs from stochastic dynamical models necessitates an explicit distinction between uncertainty introduced by latent parameters $\lambda$ and that resulting from stochastic evolution of the system (process noise). The Karhunen–Loève (KL) decomposition is deployed as an initial stage to perform a principled reduction of output dimensionality, yielding a compact orthogonal basis $\{\phi_k(t)\}$ for the simulated random field $f(t;\lambda,\omega)$:
$$f(t;\lambda,\omega) \approx \bar f(t) + \sum_{k=1}^{K} f_k(\lambda,\omega)\,\phi_k(t),$$
where the projected coefficients $f_k(\lambda,\omega)$ subsequently encode all variability.
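As an illustration, the sketch below performs such a discrete KL reduction on an ensemble of simulated trajectories, assuming each row of the output matrix is one realization on a common time grid; the names `kl_decompose`, `Y`, and `n_modes` are illustrative, not taken from a reference implementation.

```python
# Minimal sketch of a discrete Karhunen-Loeve (KL) decomposition of an
# ensemble of stochastic simulation outputs; each row of Y is assumed to be
# one realization of the output trajectory on a common time grid.
import numpy as np

def kl_decompose(Y, n_modes):
    """Return the ensemble mean, leading KL modes, and projected coefficients."""
    mean = Y.mean(axis=0)                    # ensemble mean trajectory
    Yc = Y - mean                            # centered fluctuations
    # SVD of the centered ensemble: rows of Vt are orthonormal temporal modes
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    modes = Vt[:n_modes]                     # truncated orthogonal basis
    coeffs = Yc @ modes.T                    # KL coefficients per realization
    return mean, modes, coeffs

# Toy example: 500 realizations of a random-walk trajectory on 200 time points
rng = np.random.default_rng(0)
Y = rng.standard_normal((500, 200)).cumsum(axis=1)
mean, modes, coeffs = kl_decompose(Y, n_modes=5)
print(coeffs.shape)   # (500, 5): each realization reduced to 5 KL coefficients
```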
To model the distributional characteristics of the KL coefficients $f_k$ (which may be non-Gaussian), a non-intrusive polynomial chaos (PC) expansion is employed:
$$f_k(\lambda,\omega) \approx \sum_{s=0}^{S} c_{ks}(\lambda)\,\Psi_s(\xi),$$
where $\Psi_s$ are orthogonal basis polynomials in a standard normal latent variable $\xi$, and $c_{ks}(\lambda)$ are PC coefficients encoding the parametric dependency. This structure explicitly separates the sources of uncertainty: randomness sampled through $\xi$ (intrinsic stochasticity) and parametric uncertainty carried by $c_{ks}(\lambda)$.
The expansion coefficients $c_{ks}(\lambda)$ are obtained by evaluating simulation outputs at sampled parameters and realizations of the intrinsic randomness, then projecting onto the PC basis via Monte Carlo integration (with variable transformations such as the Rosenblatt transform to reconcile probability measures).
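A minimal sketch of this non-intrusive projection for a scalar output of a single standard normal latent variable is shown below, using probabilists' Hermite polynomials as the orthogonal basis; the helper `pc_coefficients` and the toy lognormal output are assumptions for illustration, not the setup of the reference.

```python
# Minimal sketch of non-intrusive polynomial chaos (PC) projection by Monte
# Carlo: a scalar output f(xi) of a standard normal latent variable xi is
# projected onto probabilists' Hermite polynomials He_s.
import numpy as np
from numpy.polynomial.hermite_e import hermeval
from math import factorial

def pc_coefficients(f, order, n_samples=100_000, rng=None):
    """Estimate PC coefficients c_s = E[f(xi) He_s(xi)] / s! by Monte Carlo."""
    rng = rng or np.random.default_rng(0)
    xi = rng.standard_normal(n_samples)
    fx = f(xi)
    coeffs = []
    for s in range(order + 1):
        basis = hermeval(xi, [0.0] * s + [1.0])            # He_s at the samples
        coeffs.append(np.mean(fx * basis) / factorial(s))  # <He_s, He_s> = s!
    return np.array(coeffs)

# Toy non-Gaussian output: exp of a standard normal (a lognormal variable)
c = pc_coefficients(np.exp, order=6)
# Reconstruct the output at fresh samples and compare the mean
xi_new = np.random.default_rng(1).standard_normal(100_000)
f_pc = hermeval(xi_new, c)
print(np.exp(0.5), f_pc.mean())   # exact lognormal mean vs PC reconstruction
```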
2. Bayesian Emulation with Gaussian Processes
The dependence of the PC coefficients $c_{ks}$ on the uncertain input parameters $\lambda$ is learned using a Bayesian Gaussian process (GP) regression surrogate. Each coefficient is viewed as a function over parameter space, emulated by:
- Expressing the standardized coefficients, collected into a vector $\mathbf{c}(\lambda)$, as a truncated principal component decomposition
- Modeling each principal component using an independent GP:

$$\mathbf{c}(\lambda) \approx \sum_{j=1}^{J} \mathbf{u}_j\, w_j(\lambda) + \epsilon(\lambda),$$

where $\mathbf{u}_j$ are singular vectors, $w_j(\lambda)$ are zero-mean GP random processes with covariance hyperparameters $\theta_j$, and $\epsilon(\lambda)$ is a white-noise GP.
Hierarchical Bayesian priors (Gamma and Beta for process lengthscales and variances) are placed on the hyperparameters, which are then sampled using Metropolis–Hastings MCMC. This enables rigorous propagation of parametric uncertainty through the surrogate: uncertainty is high in regions of $\lambda$-space with sparse sampling, as reflected in the GP predictive variance.
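The sketch below illustrates Metropolis–Hastings sampling of GP hyperparameters for a single emulated coefficient, assuming a squared-exponential kernel, a fixed small noise term, and weakly informative Gamma priors; these specific choices and the proposal scale are illustrative stand-ins rather than the hierarchical priors of the reference.

```python
# Minimal sketch of Metropolis-Hastings sampling of GP covariance
# hyperparameters (lengthscale, signal variance) for one surrogate output.
import numpy as np

def sq_exp_kernel(X1, X2, lengthscale, variance):
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def log_marginal_likelihood(X, y, lengthscale, variance, noise=1e-4):
    K = sq_exp_kernel(X, X, lengthscale, variance) + noise * np.eye(len(y))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(y) * np.log(2 * np.pi)

def log_posterior(theta, X, y):
    lengthscale, variance = np.exp(theta)          # sample on the log scale
    # Assumed weakly informative Gamma(2, 1) priors on both hyperparameters,
    # plus the log-Jacobian of the exp reparameterization.
    log_prior = (np.log(lengthscale) - lengthscale) + (np.log(variance) - variance) + theta.sum()
    return log_marginal_likelihood(X, y, lengthscale, variance) + log_prior

def metropolis_hastings(X, y, n_steps=2000, step=0.1, rng=None):
    rng = rng or np.random.default_rng(0)
    theta = np.zeros(2)                            # log-hyperparameters
    logp = log_posterior(theta, X, y)
    samples = []
    for _ in range(n_steps):
        prop = theta + step * rng.standard_normal(2)    # random-walk proposal
        logp_prop = log_posterior(prop, X, y)
        if np.log(rng.uniform()) < logp_prop - logp:    # accept/reject
            theta, logp = prop, logp_prop
        samples.append(np.exp(theta))
    return np.array(samples)

# Toy training data: one coefficient observed at 30 parameter points
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(30, 1))
y = np.sin(6 * X[:, 0]) + 0.05 * rng.standard_normal(30)
samples = metropolis_hastings(X, y)
print(samples[-500:].mean(axis=0))   # posterior mean lengthscale and variance
```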
The full surrogate recapitulates the stochastic model output as
$$\tilde f(t;\lambda,\xi) = \bar f(t) + \sum_{k=1}^{K}\left[\sum_{s=0}^{S} c_{ks}(\lambda)\,\Psi_s(\xi)\right]\phi_k(t),$$
with the PC coefficients $c_{ks}(\lambda)$ interpolated Bayesianly across the input domain.
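A minimal sketch of evaluating such an assembled surrogate at one parameter value is given below; the placeholder KL modes and PC coefficient arrays stand in for quantities that would come from the fitted KL reduction and GP emulator.

```python
# Minimal sketch of sampling the assembled surrogate: PC expansions (one per
# KL mode) are evaluated at fresh standard normal draws and combined with the
# truncated KL modes to produce synthetic output realizations.
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def sample_surrogate(mean, modes, pc_coeffs, n_realizations, rng=None):
    """pc_coeffs has shape (n_kl_modes, pc_order + 1): one PC series per KL mode."""
    rng = rng or np.random.default_rng(0)
    xi = rng.standard_normal(n_realizations)
    # Evaluate each KL coefficient's PC expansion at the sampled xi
    kl_coeffs = np.stack([hermeval(xi, c) for c in pc_coeffs], axis=1)
    return mean + kl_coeffs @ modes               # (n_realizations, n_timepoints)

# Placeholder shapes: 5 KL modes on a 200-point grid, PC order 6
mean = np.zeros(200)
modes = np.linalg.qr(np.random.default_rng(2).standard_normal((200, 5)))[0].T
pc_coeffs = np.random.default_rng(3).standard_normal((5, 7))
paths = sample_surrogate(mean, modes, pc_coeffs, n_realizations=1000)
print(paths.shape)   # (1000, 200) synthetic realizations
```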
3. Stochastic Sampling Methodology
Stochastic sampling supports both exploration of the parameter space (to characterize epistemic uncertainty) and replication over process noise (to quantify aleatory uncertainty). In practical workflows:
- Parameters $\lambda$ are sampled from prescribed distributions (e.g., lognormals reflecting domain-specific uncertainty).
- For each fixed $\lambda$, multiple stochastic realizations are generated to obtain an empirical distribution over outputs.
- Kernel density estimation produces empirical PDFs for quantities of interest at fixed parameters, facilitating statistical comparison and validation.
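A minimal sketch of this replication-plus-KDE step at a fixed parameter value is shown below; the `simulate` function is a hypothetical stand-in for one stochastic forward run, and SciPy's `gaussian_kde` stands in for whatever density estimator the workflow actually uses.

```python
# Minimal sketch of building an empirical PDF for a quantity of interest at a
# fixed parameter value via repeated stochastic runs and kernel density
# estimation.
import numpy as np
from scipy.stats import gaussian_kde

def empirical_pdf(simulate, params, n_realizations=500, rng=None):
    rng = rng or np.random.default_rng(0)
    qoi = np.array([simulate(params, rng) for _ in range(n_realizations)])
    return gaussian_kde(qoi), qoi

# Hypothetical stochastic model: lognormal "peak size" driven by the parameters
def simulate(params, rng):
    return np.exp(params["mu"] + params["sigma"] * rng.standard_normal())

kde, samples = empirical_pdf(simulate, {"mu": 1.0, "sigma": 0.3})
grid = np.linspace(samples.min(), samples.max(), 200)
print(kde(grid).max())    # peak of the estimated density
```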
The technical challenge of combining KL and PC expansions across disparate stochastic and parametric spaces is addressed by mapping all variables into a common probabilistic frame via transformations (such as the Rosenblatt transform noted above), and then using Monte Carlo integration to project simulation data onto the PC basis.
4. Application to Stochastic Epidemic Models
The surrogate construction is illustrated by application to a stochastic SIR (Susceptible–Infected–Recovered) epidemic model with uncertain infection ($\beta$) and recovery ($\gamma$) rates. Outputs of interest (e.g., peak infection, timing, duration above threshold, cumulative infections) are extracted from each realization.
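For concreteness, a minimal sketch of generating one realization of such a stochastic SIR model with the Gillespie algorithm is given below; the population sizes, rate values, and function names are illustrative assumptions, not the configuration used in the reference.

```python
# Minimal sketch of one stochastic SIR realization via the Gillespie
# algorithm, assuming a well-mixed population; beta (infection) and gamma
# (recovery) are the uncertain rates drawn from their priors.
import numpy as np

def gillespie_sir(beta, gamma, S0=990, I0=10, R0=0, t_max=100.0, rng=None):
    rng = rng or np.random.default_rng(0)
    S, I, R, t = S0, I0, R0, 0.0
    N = S0 + I0 + R0
    times, infected = [t], [I]
    while t < t_max and I > 0:
        rate_inf = beta * S * I / N          # infection event: S + I -> 2I
        rate_rec = gamma * I                 # recovery event:  I -> R
        total = rate_inf + rate_rec
        t += rng.exponential(1.0 / total)    # exponential waiting time
        if rng.uniform() < rate_inf / total:
            S, I = S - 1, I + 1
        else:
            I, R = I - 1, R + 1
        times.append(t)
        infected.append(I)
    return np.array(times), np.array(infected)

# Two realizations at the same parameters differ only through intrinsic noise
t1, i1 = gillespie_sir(beta=0.3, gamma=0.1, rng=np.random.default_rng(1))
t2, i2 = gillespie_sir(beta=0.3, gamma=0.1, rng=np.random.default_rng(2))
print(i1.max(), i2.max())   # peak infections vary across realizations
```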
The surrogate accurately reproduces the joint distributions, non-Gaussian features, and correlational structures found in direct simulation ensembles, as evidenced by side-by-side quantitative and visual agreement in kernel density plots. This demonstrates the methodology's ability to disentangle and jointly model intrinsic variability and parametric uncertainty, providing high-resolution risk profiles for epidemic forecasts.
5. Computational Efficiency and Comparative Perspective
Compared with traditional Monte Carlo, which naively requires a prohibitive number of forward simulation runs to adequately resolve full uncertainty, the KL–PC–GP surrogate provides marked computational savings:
- Each type of uncertainty is handled efficiently: GP regression interpolates parametric uncertainty, while PC basis captures non-Gaussian output distributions.
- Once constructed, the surrogate is rapidly sampled to generate output predictions under arbitrary parameter scenarios.
- Calibration (KL and PC truncation levels, kernel bandwidths, GP hyperparameters) demands careful supervision and can become resource-intensive in high dimensions or with complex models.
The primary limitations are the surrogate's dependence on sample quality, its sensitivity to tuning parameters, and residual approximation error in higher-order moments. Rigorous theoretical guarantees for convergence and error rates remain only partially developed.
6. Limitations and Directions for Future Work
Open challenges include:
- Determining reliable heuristics and theoretical results for convergence (the number of samples needed for accurate estimation of the mean, variance, and higher moments of outputs).
- Automating the selection of truncation levels in KL/PC expansions and optimal bandwidth in kernel density estimates.
- Extending the methodology to nonparametric or hierarchical modeling settings for cases with even more complex or structured sources of uncertainty.
The development of surrogates that combine flexible statistical learning, dimensionality reduction, and principled Bayesian inference establishes a template for efficient and interpretable uncertainty quantification in stochastic computational models with mixed uncertainties.
Summary Table: Methodological Components
| Component | Uncertainty Addressed | Key Technique |
|---|---|---|
| Karhunen–Loève | Intrinsic + Parametric | Orthogonal decomposition, truncation |
| Polynomial Chaos | Intrinsic (aleatory) | Non-intrusive expansion, PC projection |
| Bayesian GP | Parametric (epistemic) | Regression with MCMC-sampled hyperparameters |
This multidimensional Bayesian uncertainty quantification framework synthesizes dimensionality reduction, non-intrusive stochastic expansion, and surrogate learning to yield computationally tractable and robust separation of uncertainty sources in complex stochastic systems (Hickmann et al., 2015).