Sequential Bayesian Design for Locally Accurate Surrogates
- The paper introduces an adaptive sequential strategy that focuses surrogate model accuracy in high-probability posterior regions to reduce computational costs.
- It employs coarse initialization and iterative retraining using informative samples from updated posterior distributions.
- Empirical results on PDE-constrained inverse problems demonstrate significant reductions in expensive model evaluations and improved estimation accuracy.
Sequential Bayesian Design for Locally Accurate Surrogate (SBD-LAS) refers to a class of methodologies for constructing surrogate models that accurately approximate the response of complex, expensive-to-evaluate simulators, specifically in regions of the input space where the true likelihood or posterior is concentrated. SBD-LAS aims to dramatically reduce the computational cost of Bayesian inference or optimization by focusing surrogate model accuracy where it is needed most—typically around the high-probability region for the parameter posterior in an inverse problem—rather than investing significant modeling effort to achieve global accuracy. This strategy leverages adaptive, sequential experimental design, updating the surrogate and the posterior iteratively as new, informative samples are selected. The approach is particularly suited for high-dimensional, computationally intensive problems such as PDE-constrained inverse problems, where globally accurate surrogates are generally infeasible with limited data and computational resources (Wang et al., 23 Jul 2025).
1. Problem Setting and Motivation
In many scientific and engineering contexts, inverse problems governed by partial differential equations (PDEs), complex physical simulators, or stochastic models require solving for unknown parameters given observed data. Bayesian methods provide a principled framework for such inference, yielding a posterior distribution over the parameters $\theta$ given data $y$ via Bayes' rule:

$$p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta),$$

where $p(y \mid \theta)$ is the likelihood induced by the forward model $G(\theta)$ and $p(\theta)$ is the prior.
The computation of the likelihood typically involves running an expensive forward model (e.g., a high-fidelity PDE solver). The computational bottleneck arises because both global surrogate modeling (accurate across the entire input space) and direct posterior sampling (e.g., via MCMC, often requiring millions of model runs) are intractable in high dimensions or with limited computational budgets.
The motivation behind SBD-LAS is to construct a surrogate that is only required to be accurate in regions where the posterior is non-negligible—i.e., for values of $\theta$ with high posterior density $p(\theta \mid y)$. This local focus drastically reduces the amount of required data and model complexity, thus enabling efficient, accurate inference and decision-making (Wang et al., 23 Jul 2025).
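To make the cost structure concrete, the sketch below evaluates a Gaussian log-likelihood through a forward model; each likelihood evaluation triggers one forward solve, which is exactly what becomes prohibitive when the solver is a high-fidelity PDE code. The `forward` map here is a hypothetical toy stand-in, not a simulator from the paper:

```python
import numpy as np

def log_likelihood(theta, y_obs, forward_model, noise_cov):
    """Gaussian log-likelihood: each call runs the forward model once."""
    residual = y_obs - forward_model(theta)
    chol = np.linalg.cholesky(noise_cov)
    z = np.linalg.solve(chol, residual)  # whiten the residual
    return -0.5 * float(z @ z)

# Toy stand-in for an expensive simulator (illustrative only).
def forward(theta):
    return np.array([theta[0] ** 2, theta[0] + theta[1]])

theta_true = np.array([1.0, 2.0])
y = forward(theta_true)
ll = log_likelihood(theta_true, y, forward, np.eye(2))  # maximal (zero) at the truth
```

An MCMC chain would call `log_likelihood` millions of times, which is the bottleneck SBD-LAS targets by replacing `forward_model` with a locally accurate surrogate.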
2. Locally Accurate Surrogate Modeling
The surrogate in SBD-LAS is a function $s(\theta) \approx G(\theta)$, trained only on samples drawn from the high-probability posterior region. The surrogate likelihood is then modeled as

$$\hat{L}(\theta) \propto \exp\!\Big(-\tfrac{1}{2}\,\big(y - s(\theta)\big)^{\top} \Sigma^{-1} \big(y - s(\theta)\big)\Big),$$

where $\Sigma$ is the (possibly noise-inflated) data covariance. The initially unknown high-probability region is discovered adaptively by leveraging the posterior samples from previous design stages.
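A minimal sketch of this surrogate likelihood, using a scalar inflation term added to the data covariance to absorb surrogate error (the scalar form is an illustrative simplification, not the paper's exact construction):

```python
import numpy as np

def surrogate_log_likelihood(theta, y_obs, surrogate, data_cov, inflation=0.0):
    """Log-likelihood with the forward model replaced by a cheap surrogate.

    `inflation` crudely widens the covariance to account for surrogate
    error; a scalar multiple of the identity is an assumption made here
    for illustration.
    """
    cov = data_cov + inflation * np.eye(len(y_obs))
    r = y_obs - surrogate(theta)
    return -0.5 * float(r @ np.linalg.solve(cov, r))
```

Inflating the covariance flattens the likelihood, which guards against over-trusting the surrogate in regions where it has seen little training data.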
Surrogate construction typically proceeds as follows:
- Start with a coarse (cheap) solver for to cover the prior broadly.
- Augment or correct the coarse predictions using a data-driven model (neural network, operator net, etc.) trained on data from the posterior region.
- Retrain the surrogate at each design stage using informative points from the updated posterior, ensuring surrogate accuracy in regions where it impacts the inference.
This process allows lower model complexity and smaller training data sets than global accuracy would require, leading to computational efficiency and model adaptivity.
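The coarse-plus-correction construction can be sketched with a linear corrector fit by least squares; the paper allows neural networks or operator nets for this role, so everything below is an illustrative stand-in:

```python
import numpy as np

def build_surrogate(thetas, fine_outputs, coarse_solver):
    """Correct a cheap coarse solver with a linear map fit to residuals.

    A least-squares linear corrector stands in for the data-driven
    model (neural network, operator net) used in practice; `thetas`
    are training points from the current posterior region.
    """
    coarse_outputs = np.array([coarse_solver(t) for t in thetas])
    residuals = fine_outputs - coarse_outputs
    # Fit residual(theta) ~ theta @ W over the training samples.
    W, *_ = np.linalg.lstsq(thetas, residuals, rcond=None)
    return lambda theta: coarse_solver(theta) + theta @ W
```

Because the corrector only needs to be accurate where the training points lie, a far simpler model suffices than would be needed for a globally accurate surrogate.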
3. Sequential Bayesian Design Strategy
Since the high-probability region of the likelihood is not known at the outset, SBD-LAS employs a sequential experimental design (adaptive sampling) strategy:
- Initialization: Use the prior or a coarse approximation to propose initial training points and construct the first surrogate.
- Posterior Update: Given the surrogate likelihood, compute the approximate posterior
$$p_k(\theta \mid y) \propto \hat{L}_k(\theta)\, \pi_k(\theta),$$
where $k$ denotes the iteration and $\pi_k$ is the stage-$k$ prior.
- Prior Transfer and Predictive Acceleration: For the next iteration, set the prior to be a Gaussian approximation of the current posterior, possibly using a “one-step ahead” linear prediction:
$$\pi_{k+1}(\theta) = \mathcal{N}\big(\theta;\ \mu_k + \alpha\,(\mu_k - \mu_{k-1}),\ \Sigma_k\big),$$
where $\mu_k$ and $\Sigma_k$ are the mean and covariance of the stage-$k$ posterior samples, and $\alpha \ge 0$ is a step-size parameter ($\alpha = 0$ recovers plain prior transfer).
- Resampling and Retraining: Draw new training points from the updated prior and retrain the surrogate.
- Termination: Repeat this procedure until convergence (e.g., posterior mean/covariance stabilize, or a desired surrogate accuracy is met).
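The prior-prediction step can be sketched as follows; the exact functional form is an assumption here, extrapolating the posterior mean along its recent direction of motion while carrying the covariance over:

```python
import numpy as np

def predict_next_prior(mu_k, mu_prev, cov_k, alpha=0.5):
    """One-step-ahead Gaussian prior for the next design stage.

    alpha = 0 recovers plain prior transfer (reuse the current
    posterior moments); alpha > 0 steps ahead of the current
    posterior mean. This form is a sketch of the update described
    in the text, not the authors' exact formula.
    """
    mu_next = mu_k + alpha * (mu_k - mu_prev)
    return mu_next, cov_k
```

When the posterior contracts steadily toward the truth, a nonzero `alpha` positions the next prior ahead of the current posterior mean, which is the acceleration effect reported in the experiments.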
This strategy ensures that surrogate refinement and posterior exploration concentrate computational effort on regions where the surrogate’s accuracy has the greatest impact on the Bayesian inference.
4. Algorithmic Framework
The SBD-LAS algorithm can be summarized in the following steps (Wang et al., 23 Jul 2025):
- Coarse Initialization: Initialize with samples from a coarse solver, forming the first prior and surrogate.
- Iterative Loop:
- Use the current surrogate to compute the posterior.
- Approximate the posterior with a Gaussian and predict the one-step ahead prior.
- Resample points from the new prior, retrain the surrogate using the new data.
- Continue until stopping criteria are met (e.g., small improvement between iterations).
- Final Inference: Use MCMC or another sampling method with the final locally accurate surrogate to estimate the posterior.
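The steps above can be assembled into a schematic loop. This is an illustrative sketch under strong simplifying assumptions (Gaussian priors, a constant-shift correction model, weighted samples in place of MCMC), not the authors' implementation:

```python
import numpy as np

def sbd_las_loop(y_obs, coarse, fine, dim, n_design=20, n_cheap=2000,
                 n_iters=6, alpha=0.5, seed=0):
    """Schematic SBD-LAS loop: design, correct, reweight, predict."""
    rng = np.random.default_rng(seed)
    mu, cov, mu_prev = np.zeros(dim), np.eye(dim), np.zeros(dim)
    for _ in range(n_iters):
        # A few expensive fine-solver runs at current-prior design points.
        design = rng.multivariate_normal(mu, cov, size=n_design)
        shift = np.mean([fine(t) - coarse(t) for t in design], axis=0)
        surrogate = lambda t, s=shift: coarse(t) + s  # locally corrected surrogate
        # Cheap surrogate-posterior moments via self-normalized weights.
        samples = rng.multivariate_normal(mu, cov, size=n_cheap)
        resid = np.array([y_obs - surrogate(t) for t in samples])
        logw = -0.5 * np.sum(resid ** 2, axis=1)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        mu_post = w @ samples
        cov_post = np.cov(samples.T, aweights=w)
        # One-step-ahead prediction of the next-stage prior.
        mu, mu_prev = mu_post + alpha * (mu_post - mu_prev), mu_post
        cov = cov_post + 1e-6 * np.eye(dim)
    return mu_prev  # posterior mean estimate from the final stage
```

Note that the expensive `fine` solver is called only `n_design` times per stage, while the cheap surrogate absorbs the `n_cheap` posterior evaluations, which is the cost asymmetry the method exploits.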
This iterative update, utilizing predictive prior acceleration when suitable ($\alpha > 0$), was found to speed up convergence in experiments, especially as the posterior contracts.
5. Empirical Performance and Demonstrations
SBD-LAS was demonstrated on inverse problems involving the Darcy flow equation, a prototypical PDE-constrained problem in fluid mechanics:
- Complicated coefficient field: For a permeability field parameterized over a grid, SBD-LAS achieved lower inversion error (mean squared error) compared to coarse-solver or fine-solver-only approaches, with two orders of magnitude fewer calls to the fine solver.
- Multi-peak fields and interface problems: The method accurately recovers high-frequency features and sharp interfaces, with the predictive acceleration strategy (e.g., $\alpha = 0.5$) leading to faster convergence than simpler updates.
- High dimensionality and noise robustness: With dimensions up to 400 and realistic noise, SBD-LAS still achieved competitive inversion results at a fraction of the computational cost.
These results demonstrate that SBD-LAS is able to judiciously allocate computational resources, focusing high-fidelity simulations where they are most beneficial, and thus enabling accurate solution of inverse problems previously deemed computationally intractable.
6. Applications, Extensions, and Implications
SBD-LAS is applicable to a wide range of Bayesian inversion problems governed by expensive forward models:
- Scientific computing: Groundwater modeling, reservoir engineering, geoscience, and contaminant transport where Darcy-type PDEs are fundamental.
- Medical imaging: Applications where data acquisition is expensive and model evaluations are slow (e.g., MRI, tomography).
- Other PDE-constrained inverse problems: Anywhere Bayesian inference is used with computationally intensive simulation.
The method integrates naturally with MCMC and variational inference algorithms, and can leverage diverse types of surrogates (e.g., neural operators, DeepONets). Its use of iterative, local surrogate refinement and sequential Bayesian design provides a template for efficient, scalable Bayesian inference in high-dimensional and tightly constrained domains.
A key implication of SBD-LAS is the practical feasibility of Bayesian inversion with modest computational budgets via intelligent allocation of simulation and model complexity. Because updates use information-targeted sampling in the input space, the method is robust to the curse of dimensionality, and the need for global model expressivity is largely bypassed. This has significant potential for extending fully Bayesian approaches to settings that were previously only tractable under restrictive, simplified models or with limited uncertainty quantification.
Summary Table: SBD-LAS Algorithmic Steps
| Step | Description |
|---|---|
| Initialization | Train coarse surrogate; set prior from coarse solver |
| Posterior Update | Compute posterior with current surrogate |
| Prior Update | Transfer posterior or apply one-step-ahead Gaussian prediction |
| Sampling | Draw new training points from updated prior |
| Surrogate Update | Retrain surrogate model locally |
| Iteration | Repeat until convergence |
| Final Inference | Use surrogate for full Bayesian inversion (e.g., MCMC) |