
Local Surrogate Modeling

Updated 29 December 2025
  • Local surrogate modeling is the construction of region-specific approximations to expensive models by partitioning the input space and adapting to local features.
  • Key methodologies include piecewise polynomial surrogates, local Gaussian processes, and domain decomposition, with refinements driven by p-, h-, and level moves.
  • Applications in Bayesian inversion, simulation optimization, and uncertainty quantification demonstrate significant reductions in computational cost and improved model fidelity.

Local surrogate modeling refers to the construction and deployment of computationally efficient, regionally adapted approximations to expensive or complex response surfaces. Unlike global surrogates, which attempt to fit the entire input domain with a single model, local surrogates generate or select models that are specifically tailored to restricted neighborhoods of the parameter or input space. This approach is especially effective when the system response is highly nonlinear, multi-regime, non-stationary, or when model evaluations are costly or noisy. Local surrogate modeling underpins advances in adaptive sampling, simulation optimization, stochastic inversion, multi-fidelity emulation, sensitivity analysis, and explainable machine learning.

1. Mathematical Formulation and Rationale

The general paradigm involves dividing the input (parameter) space $\Theta \subset \mathbb{R}^d$ into a collection of (possibly overlapping) subdomains $\{\mathcal{R}_i\}$. Within each subdomain, a local surrogate $S_i(\theta)$ is constructed to emulate the target model $Q(\theta)$ or a quantity-of-interest map $f(\theta)$ in that region. Formally, the surrogate predictor at an input $\theta$ is given by

$$\widehat{Q}(\theta) = \begin{cases} S_i(\theta) & \text{if } \theta \in \mathcal{R}_i, \\ \operatorname{blend}(S_{i_1}, \dots, S_{i_K}) & \text{if } \theta \in \bigcap_{k=1}^{K} \mathcal{R}_{i_k}. \end{cases}$$

The advantages of locality include enhanced adaptivity to non-stationarity, reduced computational overhead via smaller (and thus cheaper) surrogates, inherent parallelism, and the ability to refine surrogates only where needed (e.g., in regions where uncertainty concentrates). In high-dimensional Bayesian inversion, physics-based UQ, and simulation optimization, local surrogates directly reduce the number of expensive forward/adjoint model solves required (Mattis et al., 2018, Hong et al., 2021).
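To make the dichotomy above concrete, the following minimal Python sketch evaluates such a predictor with softmax-style partition-of-unity weights over Voronoi-type regions. The function name `predict_local`, the temperature `tau`, and the exponential weighting scheme are illustrative assumptions, not a construction from the cited works.

```python
import numpy as np

def predict_local(theta, centers, surrogates, tau=1.0):
    """Evaluate a piecewise surrogate Q_hat(theta).

    centers    : (m, d) array of subdomain centers (Voronoi-style regions)
    surrogates : list of m callables; surrogates[i](theta) emulates Q near centers[i]
    tau        : blending temperature; tau -> 0 recovers hard nearest-region selection
    """
    d2 = np.sum((centers - theta) ** 2, axis=1)   # squared distance to each center
    w = np.exp(-(d2 - d2.min()) / tau)            # softmax-style partition-of-unity weights
    w /= w.sum()
    # In the interior of a single region one weight dominates; near shared
    # boundaries several surrogates are blended, as in the overlap case above.
    return sum(w[i] * surrogates[i](theta) for i in range(len(surrogates)))
```

As `tau` shrinks, the weights concentrate on the nearest region and the predictor reduces to the hard-selection branch of the formula above.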

2. Model Classes, Construction, and Refinement Algorithms

Key model classes for local surrogates are:

  • Piecewise Polynomial Surrogates: Local Taylor expansions, such as piecewise-constant or piecewise-linear approximations over Voronoi (or Delaunay) tessellations. These are frequently used in Bayesian inverse problems to create $S_0(\theta)$ and its locally error-corrected version $S_1(\theta)$, with local adjoint solves used for value and gradient correction (Mattis et al., 2018).
  • Local Gaussian Processes: GP regression built on neighborhood data, with common selection strategies including nearest neighbors (LAGP), active-learning Cohn, or adaptive sampling. Inducing-point-based local GPs (LIGP) reduce the effective cost, enable input-dependent heteroskedastic noise modeling, and are particularly advantageous when the stochastic variance structure is input-dependent (Cole et al., 2021); a minimal nearest-neighbor sketch follows this list.
  • Local Basis Models: Linear or nonlinear basis-expansion models (e.g., quadratic, spline, radial basis function surrogates) built by weighting local data around expansion centers (Hong et al., 2021, Olucha et al., 31 Mar 2025).
  • Domain Decomposition and Reduced-Order Models (ROMs): For PDEs, strategies such as overlapping domain decomposition combined with Proper Generalized Decomposition (PGD) permit physically meaningful reduction. Local surrogates are constructed per subdomain with precomputed PGD modes tailored to local interface configurations (Discacciati et al., 12 Sep 2024, Discacciati et al., 2 Aug 2025).
  • Tree-Based Surrogates and Mixture Models: Adaptive partitioning of the input space (e.g., regression trees, GMM-based clusters) governs where local polynomial, PCE, or basis surrogates are fit, accommodating discontinuities and strong local nonlinearity (Said et al., 16 Sep 2025, Dupuis et al., 2019).
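As an illustration of the local-GP entry above, the sketch below fits a small homoskedastic GP on the $k$ nearest design points and predicts at a single location. Unlike LIGP it uses a constant nugget rather than input-dependent noise; the function name, defaults, and nearest-neighbor selection rule are assumptions (active-learning criteria from the LAGP literature would replace the `argsort` line).

```python
import numpy as np

def local_gp_predict(x, X, y, k=50, lengthscale=0.5, nugget=1e-6):
    """Fit a small GP on the k nearest design points to x and predict there.

    X : (n, d) design matrix, y : (n,) responses, k : local neighborhood size.
    """
    idx = np.argsort(np.sum((X - x) ** 2, axis=1))[:k]   # k nearest neighbors of x
    Xl, yl = X[idx], y[idx]

    def kern(A, B):  # squared-exponential (Gaussian) kernel, unit prior variance
        d2 = (np.sum(A**2, axis=1)[:, None]
              + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T)
        return np.exp(-d2 / (2.0 * lengthscale**2))

    K = kern(Xl, Xl) + nugget * np.eye(k)    # local covariance plus noise term
    kx = kern(Xl, x[None, :])                # cross-covariances, shape (k, 1)
    mean = (kx.T @ np.linalg.solve(K, yl)).item()
    var = (1.0 + nugget - kx.T @ np.linalg.solve(K, kx)).item()
    return mean, var
```

Because only a $k \times k$ system is solved per prediction, the cost is independent of the full design size $n$, which is the essential computational appeal of local GPs.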

Error indicators, adaptive refinement criteria, and active-learning cycles are pivotal in such frameworks. Three fundamental refinement moves emerge (Mattis et al., 2018), as dispatched in the sketch after this list:

  • $p$-refinement: Enriching the surrogate's basis locally.
  • Level-refinement: Increasing local model fidelity (e.g., mesh refinement, adjoint resolution).
  • $h$-refinement: Inserting new sample locations optimally, usually via error- or variance-driven criteria.
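A purely illustrative dispatch among these three moves, assuming each marked cell carries a few scalar diagnostics, might look like the following; the field names and threshold ordering are hypothetical, and the actual criteria in (Mattis et al., 2018) are adjoint/error-indicator based.

```python
from dataclasses import dataclass

@dataclass
class CellDiagnostics:
    value_error: float     # indicator for |Q - S_i| on the cell
    gradient_error: float  # indicator for the gradient mismatch
    solver_error: float    # estimated discretization error of the high-fidelity solve

def choose_move(c: CellDiagnostics) -> str:
    """Heuristic dispatch among the three refinement moves for a marked cell."""
    if c.solver_error > max(c.value_error, c.gradient_error):
        return "level"   # bottleneck is model fidelity: refine mesh/adjoint resolution
    if c.gradient_error > c.value_error:
        return "p"       # enrich the local basis to better capture gradients
    return "h"           # otherwise insert a new sample location in the cell
```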

The table below summarizes typical surrogate construction strategies:

Modeling Class      | Local Domain Definition          | Surrogate Type
--------------------|----------------------------------|------------------------------
Polynomial/Taylor   | Voronoi cells, balls, clusters   | Piecewise expansion
Local GP / LIGP     | Nearest neighbors, trust regions | Weighted GP, sparse or full
PGD-ROMs            | Subdomains, interfaces           | PGD parameter-separated modes
Tree / PCE          | Axis-aligned subcubes            | Local polynomial chaos

3. Algorithmic Workflows and Computational Aspects

Local surrogate modeling is inherently algorithmic, driven by cycles of construction, error estimation, and refinement. The canonical workflow in high-fidelity inversion or uncertainty quantification involves the following steps (a code skeleton follows the list):

  1. Initialization: Select an initial design/sample set, partition input space, build initial surrogates.
  2. Adjoint and Error Estimation: Solve adjoint problems to estimate local errors and gradients (for PDE models or statistics).
  3. Local Error Assessment: Compute local error indicators $\eta_i$ per surrogate domain; these may combine function and gradient mismatches with local probability/posterior weights (Mattis et al., 2018).
  4. Adaptive Step: Use marking strategies (e.g., refine where $\eta_i > \alpha \max_j \eta_j$) to select where and how to refine, via $p$-, level-, or $h$-moves.
  5. Reconstruction and Iteration: Update local surrogates or add new sample points; solve forward/adjoint again only as required; reiterate until global error or accuracy tolerance is met.
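A skeleton of this loop, with steps 2-3 abstracted behind an `error_indicator` callable and the marking rule of step 4 made explicit, is sketched below; all identifiers and the stopping rule are illustrative assumptions.

```python
import numpy as np

def adaptive_refinement(cells, error_indicator, refine, alpha=0.5, tol=1e-3,
                        max_iter=50):
    """Skeleton of the workflow above with the bulk-marking rule of step 4.

    cells           : mutable collection of local surrogate records (step 1 output)
    error_indicator : callable returning eta_i for a cell (steps 2-3, abstracted)
    refine          : callable applying a p-, level-, or h-move to a cell (step 4)
    """
    for _ in range(max_iter):
        eta = np.array([error_indicator(c) for c in cells])
        if eta.max() < tol:                       # step 5: tolerance met, stop
            break
        marked = np.where(eta > alpha * eta.max())[0]
        for i in marked:                          # new expensive solves happen
            refine(cells[i])                      # only inside marked cells
    return cells
```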

Computational costs are typically dominated by forward/adjoint high-fidelity solves and (when applicable) covariance decompositions or kernel inversions, but local surrogates substantially reduce these counts. For example, in Bayesian inversion of a 2D PDE model with an 8-dimensional parameter space, adaptive local surrogates reduce the number of high-fidelity solves from $10^5$ to fewer than $2000$ at $10^{-3}$ prediction accuracy (Mattis et al., 2018). PGD-based local surrogates demonstrably yield offline speed-ups of up to $100\times$ and online speed-ups of $800\times$ relative to full DD-FEM (Discacciati et al., 2 Aug 2025).

4. Applications and Practical Impact

Local surrogate modeling is deployed across a range of scientific, engineering, and data-driven contexts:

  • Stochastic Inverse Problems and UQ: In Bayesian inference for PDEs/ODEs and forward UQ, local surrogates enable efficient evaluation of integrals over high-dimensional parameter spaces at posterior modes or tails (Mattis et al., 2018).
  • Simulation Optimization: Trust-region frameworks (e.g., STRONG) use local surrogates to iteratively optimize expected-value objectives with noisy, expensive simulators, provably converging to stationary points (Hong et al., 2021); a caricature of one such iteration appears after this list.
  • Aerodynamics and Flow Simulation: Decomposition-based local surrogates (e.g., the Local Decomposition Method) accurately emulate strongly multi-regime phenomena (subsonic, transonic), yielding 20–60% error reduction versus global models on industrial flows (Dupuis et al., 2019).
  • High-Dimensional Stochastic Simulators: Replication- and inducing-point-aware LIGP surrogates scale to tens of thousands of runs with full heteroskedastic uncertainty quantification, enabling robust emulation under strong input-dependent noise (Cole et al., 2021).
  • Atomic Structure Search: In atomistic optimization, local surrogates with smooth overlap descriptors and sparse kernel learning accelerate structure discovery, enable transfer learning across stoichiometries, and offer significant savings in expensive DFT evaluations (Rønne et al., 2022).
  • Explainable AI and Model Compression: Partition-based surrogates (e.g., SLIM, Tree-PCE) allow locally interpretable approximations to complex ML surfaces, providing region-specific variable importance, effect plots, and interaction diagnostics (Hu et al., 2020, Said et al., 16 Sep 2025).
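For the simulation-optimization entry above, the following sketch caricatures a single STRONG-like trust-region iteration: a local linear surrogate is fit from replicated central differences, and the radius is updated by a success/failure test. The replication count, step rule, and radius updates are illustrative assumptions, not the algorithm of (Hong et al., 2021).

```python
import numpy as np

def trust_region_step(f_noisy, x, delta, n_rep=30, delta_max=1.0):
    """One caricature iteration of a STRONG-like trust-region scheme.

    f_noisy : stochastic simulator returning a noisy scalar response
    x       : current iterate; delta : current trust-region radius
    """
    d = x.size
    h = 0.5 * delta
    g = np.zeros(d)
    for i in range(d):                       # replicated central differences
        e = np.zeros(d)
        e[i] = h
        fp = np.mean([f_noisy(x + e) for _ in range(n_rep)])
        fm = np.mean([f_noisy(x - e) for _ in range(n_rep)])
        g[i] = (fp - fm) / (2.0 * h)
    # Candidate step: follow the local surrogate's descent direction to the boundary.
    x_new = x - delta * g / (np.linalg.norm(g) + 1e-12)
    # Ratio test on averaged responses decides acceptance and the radius update.
    f0 = np.mean([f_noisy(x) for _ in range(n_rep)])
    f1 = np.mean([f_noisy(x_new) for _ in range(n_rep)])
    if f1 < f0:
        return x_new, min(2.0 * delta, delta_max)   # success: accept and expand
    return x, 0.5 * delta                           # failure: stay and shrink
```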

5. Error Control, Sensitivity, and Theoretical Guarantees

Rigorous error estimation and refinement are central to the efficacy of local surrogates. Adjoint-based local error indicators measure discrepancies between the surrogate and truth in both function values and gradients, weighted by domain-appropriate measures (e.g., $L^2$ over Voronoi cells or posterior-mass-weighted norms) (Mattis et al., 2018). For GP-based local surrogates, predictive error and uncertainty are directly quantifiable via the posterior covariance structure, with input-dependent noise (nugget) fully handled in the LIGP formulation (Cole et al., 2021).
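A schematic form of such an indicator, written with an explicit gradient-mismatch weight $\lambda$ (a hypothetical constant introduced here for illustration), is

$$\eta_i \;=\; \left( \int_{\mathcal{R}_i} \Big[ \big(Q(\theta) - S_i(\theta)\big)^2 \;+\; \lambda\, \big\| \nabla Q(\theta) - \nabla S_i(\theta) \big\|^2 \Big]\, \pi(\theta)\, d\theta \right)^{1/2},$$

where $\pi$ is the posterior (or other domain-appropriate) density; in practice the adjoint solves of (Mattis et al., 2018) supply computable estimates of these mismatches without evaluating $Q$ and $\nabla Q$ directly.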

Sensitivity analysis is naturally local: Tree-PCE surrogates allow closed-form decomposition of predictive variance into local (per-leaf) Sobol' indices as well as global indices (TreePE) derived from splitting statistics (Said et al., 16 Sep 2025). For local surrogates derived in multi-source transfer learning, latent field GPs (LOL-GP) adaptively switch transfer on/off to avoid negative transfer, providing local error control per region in the target input space (Wang et al., 16 Oct 2024).
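One way to see the local/global split, assuming a hard partition with leaf probabilities $p_i$, leaf means $\mu_i$, and leaf variances $\sigma_i^2$ (notation introduced here, not from the cited work), is the law of total variance:

$$\operatorname{Var}\big[\widehat{Q}(\theta)\big] \;=\; \underbrace{\sum_i p_i\, \sigma_i^2}_{\text{within-leaf variance}} \;+\; \underbrace{\sum_i p_i\, (\mu_i - \mu)^2}_{\text{between-leaf variance}}, \qquad \mu = \sum_i p_i\, \mu_i.$$

Per-leaf Sobol' indices decompose the within-leaf terms, while the TreePE indices of (Said et al., 16 Sep 2025) additionally exploit the splitting statistics.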

Convergence results have been established for local surrogate optimization (e.g., STRONG/SPAS), where every limit point of the sequence of iterates is a stationary point of the true objective under regularity conditions on $f$ and the surrogate accuracy tests (Hong et al., 2021).

6. Limitations, Extensions, and Future Directions

Local surrogate modeling, while highly effective in regimes with strong non-stationarities and expensive evaluations, exhibits certain inherent limitations:

  • Curse of Dimensionality: Purely local/non-hierarchical surrogates may remain sample-inefficient in very high-dimensional input spaces unless coupled with active subspace, sensitivity prewarping, or dimensionality reduction preprocessing (Wycoff et al., 2021).
  • Partitioning and Clustering Heuristics: The choice of partitioning (e.g., number of clusters in LDM, depth in Tree-PCE, kernel lengthscales in local GPs) impacts both fidelity and computational cost, often requiring cross-validation, entropy measures, or silhouette statistics (Dupuis et al., 2019, Said et al., 16 Sep 2025).
  • Non-intrusive vs. Intrusive ROMs: For physics-based problems, non-intrusive approaches avoid code modification but may be restricted to linear problems (as with purely linear PGD/Schwarz). Extension to non-linear PDEs demands hyper-reduction, additional stabilization, or hybrid operator learning (Discacciati et al., 12 Sep 2024, Discacciati et al., 2 Aug 2025).
  • Interface and Boundary Scaling: Hard local partitions (e.g., regime boundaries) can induce discontinuities or artifacts at boundaries; soft blending or mixture-of-experts surrogates may mitigate but risk "regime mixing" (Dupuis et al., 2019, Said et al., 16 Sep 2025).
  • Cost vs. Accuracy Trade-Offs: Increasing refinement, number of surrogates, or expansion degree delivers higher accuracy but raises storage and runtime; optimal configuration is generally found by automated hyperparameter search or budget-constrained refinement (Said et al., 16 Sep 2025).

Emerging research addresses hybridization of local surrogate approaches with global sensitivity analysis (e.g., sensitivity prewarping), multi-fidelity hierarchies, on-the-fly surrogate construction from automatic linearizations, and cross-regime transfer avoidance (LOL-GP). Hierarchical partitioning (Tree-PCE), ensemble blending, and partition-of-unity frameworks are being developed to further exploit locality while ensuring global consistency and continuity.


Cited works:

Mattis et al. (2018); Hong et al. (2021); Cole et al. (2021); Discacciati et al. (12 Sep 2024); Discacciati et al. (2 Aug 2025); Said et al. (16 Sep 2025); Dupuis et al. (2019); Rønne et al. (2022); Wycoff et al. (2021); Wang et al. (16 Oct 2024); Hu et al. (2020); Olucha et al. (31 Mar 2025); Lopez et al. (27 Dec 2024)
