Deep Operator Surrogates
- Deep Operator Surrogates are neural architectures that learn mappings between infinite-dimensional function spaces, enabling prediction of PDE solution fields.
- Representative architectures include DeepONet, which combines branch and trunk networks, and the Fourier Neural Operator, which applies learned spectral convolutions, for efficient operator approximation.
- These surrogates significantly lower computational costs in many-query inference, uncertainty quantification, and real-time control in engineering and physics applications.
Deep operator surrogates are neural architectures designed to learn mappings between infinite-dimensional function spaces, most notably operator-valued maps arising from parametrized and stochastic partial differential equations (PDEs). Their central aim is to construct models that, once trained, can predict the solution field(s) of complex parameterized PDEs at negligible online cost, bypassing traditional, computationally intensive solvers. This operator-centric perspective transcends conventional pointwise regression, enabling rapid many-query inference and empowering new regimes in uncertainty quantification, design, and control for physical and engineering systems.
1. Operator Learning Framework
Classically, many problems in computational science require learning a nonlinear, infinite-dimensional map
$\mathcal{G} : \mathcal{X} \to \mathcal{Y}$,
where $\mathcal{X}, \mathcal{Y}$ are (typically separable) Banach or Hilbert spaces—e.g., inputting a spatially varying coefficient, initial/boundary condition, or geometry $a \in \mathcal{X}$ and outputting the corresponding state field $u = \mathcal{G}(a)$ solving a PDE. The “deep operator surrogate” abstracts this mapping and seeks a parameterized neural network $\mathcal{G}_\theta$ such that $\mathcal{G}_\theta(a) \approx \mathcal{G}(a)$ for all $a$ in a relevant subset of the input space (Herrmann et al., 2022, Goswami et al., 2022).
This paradigm, formalized by the universal operator approximation theorem (Goswami et al., 2022), generalizes classical function approximation: instead of mapping vectors to vectors, the surrogate models function-to-function (i.e., operator) mappings.
2. Architectures and Unified Surrogate Constructions
2.1 Canonical Neural Operator Architectures
DeepONet (Goswami et al., 2022, Santos et al., 4 Nov 2025, Park et al., 15 Sep 2025)
- Branch network handles the discretized (typically high-dimensional) input function, outputting a latent representation.
- Trunk network encodes the query coordinate (or coordinates), providing a basis for pointwise output reconstruction.
- The output is a bilinear combination (inner product) of branch and trunk embeddings:
$G_\theta(u)(y) = \sum_{k=1}^{p} b_k\big(u(x_1), \dots, u(x_m)\big)\, t_k(y)$,
where $x_1, \dots, x_m$ are sensor points and $b_k$, $t_k$ are the branch and trunk network outputs.
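As a concrete sketch of this branch–trunk construction in NumPy — the `mlp` helper, the layer sizes, and the random weights are illustrative stand-ins for a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights):
    """Tiny MLP: affine maps with tanh activations, linear last layer."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.tanh(x)
    return x

def init_mlp(sizes, rng):
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

m, p = 32, 16                        # number of sensors, latent width
branch = init_mlp([m, 64, p], rng)   # branch: sensor values -> p coefficients
trunk  = init_mlp([1, 64, p], rng)   # trunk: query coordinate -> p basis values

def deeponet(u_sensors, y_queries):
    """G_theta(u)(y) = sum_k b_k(u) * t_k(y)  (bias term omitted)."""
    b = mlp(u_sensors[None, :], branch)   # (1, p)
    t = mlp(y_queries[:, None], trunk)    # (n_y, p)
    return (t * b).sum(axis=1)            # (n_y,)

u = np.sin(2 * np.pi * np.linspace(0, 1, m))   # input function at sensors
y = np.linspace(0, 1, 50)                      # query coordinates
s = deeponet(u, y)
print(s.shape)   # (50,)
```

Note that the trunk is evaluated once per query point while the branch is evaluated once per input function, which is what makes many-query inference cheap.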
Fourier Neural Operator (FNO) (Herrmann et al., 2022, Santos et al., 4 Nov 2025, Sahadath et al., 7 Feb 2026)
- Operates directly on spatial fields, employing Fourier layers that globally propagate information in spectral space.
- Admits resolution-invariant surrogates via convolution in Fourier domain followed by a projection.
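The spectral mechanism behind resolution invariance can be sketched in a single linear, single-channel Fourier layer; the weights `R` and truncation `k_max` are illustrative, and a full FNO adds channels, pointwise linear paths, and nonlinearities:

```python
import numpy as np

rng = np.random.default_rng(1)

def fourier_layer(v, R, k_max):
    """FFT, truncate to k_max modes, multiply by learned complex weights R,
    inverse FFT. The same R applies at any grid resolution, which is the
    source of the FNO's resolution invariance."""
    v_hat = np.fft.rfft(v)                 # spectral coefficients
    out_hat = np.zeros_like(v_hat)
    out_hat[:k_max] = R * v_hat[:k_max]    # act only on retained modes
    return np.fft.irfft(out_hat, n=len(v))

k_max = 8
R = rng.normal(size=k_max) + 1j * rng.normal(size=k_max)

# The same layer evaluated on two grid resolutions of the same function
# yields samples of the same continuous output field:
for n in (64, 128):
    x = np.linspace(0, 1, n, endpoint=False)
    w = fourier_layer(np.sin(2 * np.pi * x), R, k_max)
    print(n, w.shape)
```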
Basis–coefficient (separation-of-variables) surrogates (Jeon et al., 26 Mar 2025, Herrmann et al., 2022)
- Expand the output field in a learned basis, with coefficients parameterized as functions of the input.
- Structure: $u_\theta(x; a) = \sum_{k=1}^{K} c_k(a)\, \varphi_k(x)$, with learned basis functions $\varphi_k$ and input-dependent coefficients $c_k$.
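A minimal sketch of the separation-of-variables structure, with a parametric sine family standing in for the learned basis and random weights standing in for a trained coefficient network:

```python
import numpy as np

rng = np.random.default_rng(2)

K, d_in = 6, 4   # number of basis functions, input-parameter dimension

# "Learned" ingredients (random weights stand in for trained ones):
Wc = rng.normal(0, 0.3, (d_in, K))   # coefficient map c(a) = tanh(a @ Wc)
Wb = rng.normal(0, 1.0, (K,))        # frequencies of the basis functions

def basis(x):
    """phi_k(x): here a parametric sine basis; in practice a trained net."""
    return np.sin(np.outer(x, Wb))   # (n_x, K)

def surrogate(a, x):
    """u_theta(x; a) = sum_k c_k(a) * phi_k(x)."""
    c = np.tanh(a @ Wc)              # (K,) coefficients from the input
    return basis(x) @ c              # (n_x,) output field at x

a = rng.normal(size=d_in)            # PDE parameters / input descriptors
x = np.linspace(0, 1, 100)
u = surrogate(a, x)
print(u.shape)   # (100,)
```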
Local-assembly neural surrogates (Kröpfl et al., 2021)
- Neuro-approximation of local surrogate maps for operator compression—mirrors classical upscaling and homogenization, using moderate-sized subnetworks for local operator assembly.
2.2 Hybrid and Advanced Variants
- Hybrid DeepONets: Use FNO, Kolmogorov–Arnold networks (KAN), or MLPs in modular fashion within branch and trunk networks for task-adaptive spatial/temporal representation (Santos et al., 4 Nov 2025).
- Full-field Extended DeepONets (FExD): Simultaneously predict all responses across spatial DoFs for spatio-temporal surrogate modeling, leveraging nonlinear branch–trunk interactions (Tang et al., 13 Jun 2025).
- Physics-enhanced deep surrogates: Integrate neural components with differentiable PDE solvers, imposing sharp physics constraints and enforcing interpretability and data efficiency (Varagnolo et al., 25 Nov 2025, Pestourie et al., 2021).
3. Training Strategies and Theoretical Guarantees
3.1 Training Paradigms
- Data-driven regression: Classical empirical risk minimization using pointwise or fieldwise mean-squared error against labeled simulation data (Goswami et al., 2022, Choubey et al., 2024).
- Physics-informed or energy-based training: Learning via minimization of residuals (CPINN) or discrete energy functionals (e.g. DCRM), enabling surrogate training without labeled data (Fuhg et al., 2022).
- Two-step/truncation-aware training: Decoupling optimization of spatial (trunk) and parametric (branch) subnets to ensure stability, particularly for discontinuous outputs or limited data (Park et al., 15 Sep 2025, Shukla et al., 2024).
- Active learning and uncertainty quantification: Use of heteroskedastic Gaussian ensembles and acquisition-driven experimental design to minimize training dataset size (Varagnolo et al., 25 Nov 2025), and MC-dropout/Bayesian surrogates for calibrated prediction intervals (Jeon et al., 26 Mar 2025).
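The data-driven regression paradigm above can be illustrated end to end on a toy problem where the "simulator" is a known smoothing operator and the surrogate is a linear map trained by empirical risk minimization; the operator, sizes, and learning rate are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# "Simulation" data: the true operator is a fixed Gaussian smoothing matrix.
n = 20
A_true = np.exp(-0.5 * (np.subtract.outer(range(n), range(n)) / 2.0) ** 2)
X = rng.normal(size=(200, n))           # sampled input functions (discretized)
Y = X @ A_true.T                        # labeled solution fields

# Empirical risk minimization on a linear surrogate A via gradient descent
# on the fieldwise mean-squared error.
A = np.zeros((n, n))
lr = 0.05
for step in range(1000):
    resid = X @ A.T - Y                 # fieldwise residual
    grad = resid.T @ X / len(X)         # d(MSE)/dA
    A -= lr * grad

mse = np.mean((X @ A.T - Y) ** 2)
print(f"final training MSE: {mse:.2e}")
```

A neural surrogate replaces the matrix `A` with a DeepONet or FNO, but the loss and training loop keep exactly this shape.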
3.2 Approximation Rates and Error Bounds
Deep operator surrogates admit rigorous expression-rate (approximation error) bounds dictated by the Sobolev/Besov regularity of the input-output map and the size of the surrogate (Herrmann et al., 2022):
- For ReLU network surrogates acting in Sobolev or Besov scales, the worst-case error decays algebraically in the surrogate size $N$,
$\sup_a \| \mathcal{G}(a) - \mathcal{G}_\theta(a) \| \le C\, N^{-r(s,t)}$,
where $s$, $t$ are regularity indices of the input/output spaces and the rate $r(s,t)$ increases with smoothness.
- For mean-square error over a parameter distribution, the exponent improves by ½.
- Spectral surrogates (generalized polynomial chaos, gPC) achieve the same algebraic rates, using deterministic interpolation based on functional evaluations.
4. Data Generation and Efficiency
The dominant cost in operator surrogate training often lies in high-fidelity data generation:
- Classical data workflow: Requires massive labeled simulation datasets (e.g. FEM/PDE solves) (Herrmann et al., 2022, Santos et al., 4 Nov 2025, Sahadath et al., 7 Feb 2026).
- Innovative approaches:
- GPR-based output randomization, reconstructing PDE sources via finite-difference stencils, yielding orders-of-magnitude acceleration in data generation and comparable surrogate accuracy to full-FEM training (Choubey et al., 2024).
- Physics-enhanced surrogates combine neural generators with low-fidelity (e.g., Fourier, finite-difference) solvers, leveraging physical priors for both accuracy and data reduction (up to 70%) (Pestourie et al., 2021, Varagnolo et al., 25 Nov 2025).
- Local-assembly neural compression for multiscale problems, where moderate patchwise networks assemble global surrogates; results in ~100× inference speed-up while controlling solution error (Kröpfl et al., 2021).
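The GPR-based randomization idea above can be sketched in a few lines: sample smooth candidate solution fields from a Gaussian process, then recover the matching sources with a finite-difference stencil, so each labeled pair costs one stencil application instead of one PDE solve (the kernel, lengthscale, and grid here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

n = 64
x = np.linspace(0, 1, n)
h = x[1] - x[0]

# Gaussian-process samples as candidate solution fields u, clamped to zero
# at the boundary (homogeneous Dirichlet conditions).
K = np.exp(-0.5 * (np.subtract.outer(x, x) / 0.1) ** 2)
U = rng.multivariate_normal(np.zeros(n), K + 1e-8 * np.eye(n), size=100)
U[:, [0, -1]] = 0.0

# Recover matching sources f = -u'' with a 3-point stencil: no PDE solve,
# just one cheap stencil application per sample.
F = np.empty_like(U)
F[:, 1:-1] = -(U[:, 2:] - 2 * U[:, 1:-1] + U[:, :-2]) / h**2
F[:, [0, -1]] = F[:, [1, -2]]          # copy nearest interior value

# (F, U) now serve as labeled (source, solution) training pairs.
print(F.shape, U.shape)
```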
5. Application Scope and Performance Analysis
Deep operator surrogates have been validated and benchmarked in a range of domains:
| Domain / Application | Representative Operator | Network Variants | Test Error | Speed-up | Reference |
|---|---|---|---|---|---|
| Elliptic PDEs, Darcy flow | — | DeepONet, FNO, gPC | 1–5% (rel. L2) | — | (Goswami et al., 2022, Santos et al., 4 Nov 2025) |
| Multiphase porous media | permeabilities → saturations | Hybrid DeepONet (FNO+KAN) | 2–3% (SGAS) | — | (Santos et al., 4 Nov 2025) |
| 3D fluid flow, CFD | geometry + Re → velocity field | DeepONet, Geometric-DeepONet | — | — | (Rabeh et al., 21 Mar 2025) |
| Hypersonic aero. (shocks) | AoA → pressure/heat flux fields | DeepONet (2-step, weighted) | — | — | (Shukla et al., 2024) |
| Poroelasticity | permeability → {u, p}(x, t) | DeepONet (KLE-branch) | ~1–5% RMSE | — | (Park et al., 15 Sep 2025) |
| Neutron transport, BTE | — | FNO, DeepONet | <1% | cost <0.3% of baseline | (Sahadath et al., 7 Feb 2026) |
| Cyclic adsorption | — | DeepONet | <0.2% in-domain, <3% OOD | — | (Ceccanti et al., 14 Jan 2026) |
| Dynamical uncertain systems | ground motion → bridge motion | FExD-DeepONet | RRMSE < 8% | vs. VD | (Tang et al., 13 Jun 2025) |
Key points:
- Variational and physics-informed training (e.g. DCRM, PINNs) enables effective surrogates even with zero labeled data, with energy-based approaches surpassing standard residual minimization in convergence and generalization (Fuhg et al., 2022).
- Hybrid architectures (e.g. FNO-branch DeepONet) yield significantly higher parameter efficiency and scaling compared to uniform architectures, enabling surrogate construction even for million-cell 3D problems on commodity hardware (Santos et al., 4 Nov 2025).
- Statistical surrogates with MC-dropout or ensemble methods provide well-calibrated uncertainty bounds, critical for decision-making under uncertainty (Jeon et al., 26 Mar 2025, Varagnolo et al., 25 Nov 2025).
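A minimal sketch of ensemble-based uncertainty bands; the "ensemble members" here are synthetic stand-ins whose spread widens outside the training range, which is the behavior a trained ensemble or MC-dropout surrogate is expected to exhibit:

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in "ensemble": M surrogates that agree in-distribution and
# disagree where training data was absent (here: perturbed sine models).
M, n = 20, 100
x = np.linspace(0, 2, n)                # trained on [0, 1]; (1, 2] is OOD
preds = np.stack([
    np.sin(np.pi * x)
    + rng.normal(0, 0.02 + 0.3 * np.clip(x - 1, 0, None), n)
    for _ in range(M)
])

mean = preds.mean(axis=0)
std = preds.std(axis=0)
lower, upper = mean - 2 * std, mean + 2 * std   # ~95% interval

# Interval width grows in the extrapolation region, flagging low confidence.
print(std[x < 1].mean(), std[x > 1].mean())
```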
6. Generalization, Robustness, and Limitations
Deep operator surrogates demonstrate strong generalization within the sampled data-manifold and, with informed architecture/training, credible extrapolation to unseen parameter regimes (Varagnolo et al., 25 Nov 2025, Qiu et al., 2024). Notable findings include:
- Incorporation of derivative losses (in input or output directions) and dimension reduction boosts data efficiency by up to an order of magnitude and reduces the error on functional sensitivities (Qiu et al., 2024).
- Embedding low-fidelity physics enhances learning in underdetermined, small-data regimes and improves out-of-distribution robustness (Varagnolo et al., 25 Nov 2025, Pestourie et al., 2021).
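A derivative-augmented (Sobolev-type) loss of the kind referenced above can be sketched with finite differences; the weighting `alpha` and the toy fields are illustrative:

```python
import numpy as np

def sobolev_loss(pred, target, h, alpha=0.5):
    """MSE on field values plus an MSE on finite-difference derivatives.
    Matching output sensitivities typically improves data efficiency."""
    value_term = np.mean((pred - target) ** 2)
    dp = np.gradient(pred, h, axis=-1)
    dt = np.gradient(target, h, axis=-1)
    deriv_term = np.mean((dp - dt) ** 2)
    return value_term + alpha * deriv_term

x = np.linspace(0, 1, 50)
h = x[1] - x[0]
target = np.sin(2 * np.pi * x)
smooth = 0.98 * target                            # right shape and slope
wiggly = target + 0.02 * np.sin(40 * np.pi * x)   # similar MSE, wrong slope

# The derivative term penalizes the high-frequency error far more heavily:
print(sobolev_loss(smooth, target, h), sobolev_loss(wiggly, target, h))
```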
Nevertheless, limitations remain:
- Out-of-distribution generalization is contingent on sufficient architectural expressiveness (e.g. geometry encoding, branch/trunk richness) and, in cases such as the ballistic-to-diffusive transition, physically meaningful mixing strategies.
- Training cost (offline phase) remains significant, particularly for high-dimensional stochastic PDEs, although active learning and modularization mitigate this.
- Extensions to evolving, multi-physics, or time-dependent domain topologies require further algorithmic development (Santos et al., 4 Nov 2025).
7. Outlook and Frontiers
Current research is advancing in several directions:
- Scalable multi-input/multi-output networks for multi-physics applications and inclusion of mechanical/material parameters as auxiliary operator inputs (Santos et al., 4 Nov 2025, Park et al., 15 Sep 2025).
- Physics-informed and energy-based surrogates for high-order, nonlinear, or variational PDEs where training data is scarce or costly (Fuhg et al., 2022).
- Multi-fidelity surrogates and transfer learning, leveraging hierarchies of models from low- to high-accuracy for rapid adaptation to new regimes (Varagnolo et al., 25 Nov 2025).
- Integration with optimal experimental design, uncertainty quantification, and closed-loop control workflows in real-time digital twins.
- Theoretical analysis of surrogate expressibility in function space and strategies for adaptive surrogate refinement (Herrmann et al., 2022).
Deep operator surrogates thus provide a mathematically principled and computationally effective route to operator learning, establishing a new foundation for scalable, physically consistent simulation and design across scientific domains. Their ability to directly approximate solution operators, rather than pointwise maps, will remain pivotal as simulation-driven discovery moves toward real-time and many-query paradigms.