Deep Operator Surrogates
- Deep Operator Surrogates are neural architectures that learn mappings between infinite-dimensional function spaces, enabling prediction of PDE solution fields.
- Representative architectures include DeepONet, which combines branch and trunk networks, and the Fourier Neural Operator, which applies learned spectral convolutions, for efficient operator approximation.
- These surrogates significantly lower computational costs in many-query inference, uncertainty quantification, and real-time control in engineering and physics applications.
Deep operator surrogates are neural architectures designed to learn mappings between infinite-dimensional function spaces, most notably operator-valued maps arising from parametrized and stochastic partial differential equations (PDEs). Their central aim is to construct models that, once trained, can predict the solution field(s) of complex parameterized PDEs at negligible online cost, bypassing traditional, computationally intensive solvers. This operator-centric perspective transcends conventional pointwise regression, enabling rapid many-query inference and empowering new regimes in uncertainty quantification, design, and control for physical and engineering systems.
1. Operator Learning Framework
Classically, many problems in computational science require learning a nonlinear, infinite-dimensional map
$\mathcal{G} : \mathcal{X} \to \mathcal{Y}$,
where $\mathcal{X}, \mathcal{Y}$ are (typically separable) Banach or Hilbert spaces—e.g., inputting a spatially varying coefficient, initial/boundary condition, or geometry $a \in \mathcal{X}$ and outputting the corresponding state field $u = \mathcal{G}(a)$ solving a PDE. The “deep operator surrogate” abstracts this mapping and seeks a parameterized neural network $\mathcal{G}_\theta$ such that $\mathcal{G}_\theta(a) \approx \mathcal{G}(a)$ for all $a$ in a relevant subset of the input space (Herrmann et al., 2022, Goswami et al., 2022).
This paradigm, formalized by the universal operator approximation theorem (Goswami et al., 2022), generalizes classical function approximation: instead of mapping vectors to vectors, the surrogate models function-to-function (i.e., operator) mappings.
2. Architectures and Unified Surrogate Constructions
2.1 Canonical Neural Operator Architectures
DeepONet (Goswami et al., 2022, Santos et al., 4 Nov 2025, Park et al., 15 Sep 2025)
- Branch network handles the discretized (typically high-dimensional) input function, outputting a latent representation.
- Trunk network encodes the query coordinate (or coordinates), providing a basis for pointwise output reconstruction.
- The output is a bilinear combination (inner product) of branch and trunk embeddings:
$G_\theta(u)(y) = \sum_{k=1}^{p} b_k\big(u(x_1), \dots, u(x_m)\big)\, t_k(y)$,
where $x_1, \dots, x_m$ are sensor points and $b_k$, $t_k$ are the branch and trunk network outputs.
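As a concrete sketch of this branch–trunk construction in NumPy — the `mlp` helper, the layer sizes, and the random weights are illustrative stand-ins for a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, weights):
    """Tiny MLP: affine maps with tanh activations, linear last layer."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.tanh(x)
    return x

def init_mlp(sizes, rng):
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

m, p = 32, 16                        # number of sensors, latent width
branch = init_mlp([m, 64, p], rng)   # branch: sensor values -> p coefficients
trunk  = init_mlp([1, 64, p], rng)   # trunk: query coordinate -> p basis values

def deeponet(u_sensors, y_queries):
    """G_theta(u)(y) = sum_k b_k(u) * t_k(y)  (bias term omitted)."""
    b = mlp(u_sensors[None, :], branch)   # (1, p)
    t = mlp(y_queries[:, None], trunk)    # (n_y, p)
    return (t * b).sum(axis=1)            # (n_y,)

u = np.sin(2 * np.pi * np.linspace(0, 1, m))   # input function at sensors
y = np.linspace(0, 1, 50)                      # query coordinates
s = deeponet(u, y)
print(s.shape)   # (50,)
```

Note that the trunk is evaluated once per query point while the branch is evaluated once per input function, which is what makes many-query inference cheap.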
Fourier Neural Operator (FNO) (Herrmann et al., 2022, Santos et al., 4 Nov 2025, Sahadath et al., 7 Feb 2026)
- Operates directly on spatial fields, employing Fourier layers that globally propagate information in spectral space.
- Admits resolution-invariant surrogates via convolution in Fourier domain followed by a projection.
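The spectral mechanism behind resolution invariance can be sketched in a single linear, single-channel Fourier layer; the weights `R` and truncation `k_max` are illustrative, and a full FNO adds channels, pointwise linear paths, and nonlinearities:

```python
import numpy as np

rng = np.random.default_rng(1)

def fourier_layer(v, R, k_max):
    """FFT, truncate to k_max modes, multiply by learned complex weights R,
    inverse FFT. The same R applies at any grid resolution, which is the
    source of the FNO's resolution invariance."""
    v_hat = np.fft.rfft(v)                 # spectral coefficients
    out_hat = np.zeros_like(v_hat)
    out_hat[:k_max] = R * v_hat[:k_max]    # act only on retained modes
    return np.fft.irfft(out_hat, n=len(v))

k_max = 8
R = rng.normal(size=k_max) + 1j * rng.normal(size=k_max)

# The same layer evaluated on two grid resolutions of the same function
# yields samples of the same continuous output field:
for n in (64, 128):
    x = np.linspace(0, 1, n, endpoint=False)
    w = fourier_layer(np.sin(2 * np.pi * x), R, k_max)
    print(n, w.shape)
```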
Basis–coefficient (separation-of-variables) surrogates (Jeon et al., 26 Mar 2025, Herrmann et al., 2022)
- Expand the output field in a learned basis, with coefficients parameterized as functions of the input.
- Structure: $u_\theta(x; a) = \sum_{k=1}^{K} c_k(a)\, \varphi_k(x)$, with learned basis functions $\varphi_k$ and input-dependent coefficients $c_k$.
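A minimal sketch of the separation-of-variables structure, with a parametric sine family standing in for the learned basis and random weights standing in for a trained coefficient network:

```python
import numpy as np

rng = np.random.default_rng(2)

K, d_in = 6, 4   # number of basis functions, input-parameter dimension

# "Learned" ingredients (random weights stand in for trained ones):
Wc = rng.normal(0, 0.3, (d_in, K))   # coefficient map c(a) = tanh(a @ Wc)
Wb = rng.normal(0, 1.0, (K,))        # frequencies of the basis functions

def basis(x):
    """phi_k(x): here a parametric sine basis; in practice a trained net."""
    return np.sin(np.outer(x, Wb))   # (n_x, K)

def surrogate(a, x):
    """u_theta(x; a) = sum_k c_k(a) * phi_k(x)."""
    c = np.tanh(a @ Wc)              # (K,) coefficients from the input
    return basis(x) @ c              # (n_x,) output field at x

a = rng.normal(size=d_in)            # PDE parameters / input descriptors
x = np.linspace(0, 1, 100)
u = surrogate(a, x)
print(u.shape)   # (100,)
```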
Local-assembly neural surrogates (Kröpfl et al., 2021)
- Neuro-approximation of local surrogate maps for operator compression—mirrors classical upscaling and homogenization, using moderate-sized subnetworks for local operator assembly.
2.2 Hybrid and Advanced Variants
- Hybrid DeepONets: Use FNO, Kolmogorov–Arnold networks (KAN), or MLPs in modular fashion within branch and trunk networks for task-adaptive spatial/temporal representation (Santos et al., 4 Nov 2025).
- Full-field Extended DeepONets (FExD): Simultaneously predict all responses across spatial DoFs for spatio-temporal surrogate modeling, leveraging nonlinear branch–trunk interactions (Tang et al., 13 Jun 2025).
- Physics-enhanced deep surrogates: Integrate neural components with differentiable PDE solvers, imposing sharp physics constraints and enforcing interpretability and data efficiency (Varagnolo et al., 25 Nov 2025, Pestourie et al., 2021).
3. Training Strategies and Theoretical Guarantees
3.1 Training Paradigms
- Data-driven regression: Classical empirical risk minimization using pointwise or fieldwise mean-squared error against labeled simulation data (Goswami et al., 2022, Choubey et al., 2024).
- Physics-informed or energy-based training: Learning via minimization of residuals (CPINN) or discrete energy functionals (e.g. DCRM), enabling surrogate training without labeled data (Fuhg et al., 2022).
- Two-step/truncation-aware training: Decoupling optimization of spatial (trunk) and parametric (branch) subnets to ensure stability, particularly for discontinuous outputs or limited data (Park et al., 15 Sep 2025, Shukla et al., 2024).
- Active learning and uncertainty quantification: Use of heteroskedastic Gaussian ensembles and acquisition-driven experimental design to minimize training dataset size (Varagnolo et al., 25 Nov 2025), and MC-dropout/Bayesian surrogates for calibrated prediction intervals (Jeon et al., 26 Mar 2025).
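The data-driven regression paradigm above can be illustrated end to end on a toy problem where the "simulator" is a known smoothing operator and the surrogate is a linear map trained by empirical risk minimization; the operator, sizes, and learning rate are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# "Simulation" data: the true operator is a fixed Gaussian smoothing matrix.
n = 20
A_true = np.exp(-0.5 * (np.subtract.outer(range(n), range(n)) / 2.0) ** 2)
X = rng.normal(size=(200, n))           # sampled input functions (discretized)
Y = X @ A_true.T                        # labeled solution fields

# Empirical risk minimization on a linear surrogate A via gradient descent
# on the fieldwise mean-squared error.
A = np.zeros((n, n))
lr = 0.05
for step in range(1000):
    resid = X @ A.T - Y                 # fieldwise residual
    grad = resid.T @ X / len(X)         # d(MSE)/dA
    A -= lr * grad

mse = np.mean((X @ A.T - Y) ** 2)
print(f"final training MSE: {mse:.2e}")
```

A neural surrogate replaces the matrix `A` with a DeepONet or FNO, but the loss and training loop keep exactly this shape.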
3.2 Approximation Rates and Error Bounds
Deep operator surrogates admit rigorous expression-rate (approximation error) bounds dictated by the Sobolev/Besov regularity of the input-output map and the size of the surrogate (Herrmann et al., 2022):
- For ReLU network surrogates acting in Sobolev or Besov scales, the worst-case error decays algebraically in the surrogate size $N$,
$\sup_a \| \mathcal{G}(a) - \mathcal{G}_\theta(a) \| \le C\, N^{-r(s,t)}$,
where $s$, $t$ are regularity indices of the input/output spaces and the rate $r(s,t)$ increases with smoothness.
- For mean-square error over a parameter distribution, the exponent improves by ½.
- Spectral surrogates (generalized polynomial chaos, gPC) achieve the same algebraic rates, using deterministic interpolation based on functional evaluations.
4. Data Generation and Efficiency
The dominant cost in operator surrogate training often lies in high-fidelity data generation:
- Classical data workflow: Requires massive labeled simulation datasets (e.g. FEM/PDE solves) (Herrmann et al., 2022, Santos et al., 4 Nov 2025, Sahadath et al., 7 Feb 2026).
- Innovative approaches:
- GPR-based output randomization, reconstructing PDE sources via finite-difference stencils, yielding orders-of-magnitude acceleration in data generation and comparable surrogate accuracy to full-FEM training (Choubey et al., 2024).
- Physics-enhanced surrogates combine neural generators with low-fidelity (e.g., Fourier, finite-difference) solvers, leveraging physical priors for both accuracy and data reduction (up to 70%) (Pestourie et al., 2021, Varagnolo et al., 25 Nov 2025).
- Local-assembly neural compression for multiscale problems, where moderate patchwise networks assemble global surrogates; results in ~100× inference speed-up while controlling solution error (Kröpfl et al., 2021).
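The GPR-based randomization idea above can be sketched in a few lines: sample smooth candidate solution fields from a Gaussian process, then recover the matching sources with a finite-difference stencil, so each labeled pair costs one stencil application instead of one PDE solve (the kernel, lengthscale, and grid here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

n = 64
x = np.linspace(0, 1, n)
h = x[1] - x[0]

# Gaussian-process samples as candidate solution fields u, clamped to zero
# at the boundary (homogeneous Dirichlet conditions).
K = np.exp(-0.5 * (np.subtract.outer(x, x) / 0.1) ** 2)
U = rng.multivariate_normal(np.zeros(n), K + 1e-8 * np.eye(n), size=100)
U[:, [0, -1]] = 0.0

# Recover matching sources f = -u'' with a 3-point stencil: no PDE solve,
# just one cheap stencil application per sample.
F = np.empty_like(U)
F[:, 1:-1] = -(U[:, 2:] - 2 * U[:, 1:-1] + U[:, :-2]) / h**2
F[:, [0, -1]] = F[:, [1, -2]]          # copy nearest interior value

# (F, U) now serve as labeled (source, solution) training pairs.
print(F.shape, U.shape)
```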
5. Application Scope and Performance Analysis
Deep operator surrogates have been validated and benchmarked in a range of domains:
| Domain / Application | Representative Operator | Network Variants | Test Error | Speed-up | Reference |
|---|---|---|---|---|---|
| Elliptic PDEs, Darcy flow | — | DeepONet, FNO, gPC | 1–5% (rel. L2) | — | (Goswami et al., 2022, Santos et al., 4 Nov 2025) |
| Multiphase porous media | permeabilities → saturations | Hybrid DeepONet (FNO+KAN) | 2–3% (SGAS) | — | (Santos et al., 4 Nov 2025) |
| 3D fluid flow, CFD | geometry + Re → velocity field | DeepONet, Geometric-DeepONet | — | — | (Rabeh et al., 21 Mar 2025) |
| Hypersonic aero. (shocks) | AoA → pressure/heat flux fields | DeepONet (2-step, weighted) | — | — | (Shukla et al., 2024) |
| Poroelasticity | permeability → {u, p}(x, t) | DeepONet (KLE-branch) | ~1–5% RMSE | — | (Park et al., 15 Sep 2025) |
| Neutron transport, BTE | — | FNO, DeepONet | <1% | cost <0.3% of baseline | (Sahadath et al., 7 Feb 2026) |
| Cyclic adsorption | — | DeepONet | <0.2% in-domain, <3% OOD | — | (Ceccanti et al., 14 Jan 2026) |
| Dynamical uncertain systems | ground motion → bridge motion | FExD-DeepONet | RRMSE < 8% | vs. VD | (Tang et al., 13 Jun 2025) |
Key points:
- Variational and physics-informed training (e.g. DCRM, PINNs) enables effective surrogates even with zero labeled data, with energy-based approaches surpassing standard residual minimization in convergence and generalization (Fuhg et al., 2022).
- Hybrid architectures (e.g. FNO-branch DeepONet) yield significantly higher parameter efficiency and scaling compared to uniform architectures, enabling surrogate construction even for million-cell 3D problems on commodity hardware (Santos et al., 4 Nov 2025).
- Statistical surrogates with MC-dropout or ensemble methods provide well-calibrated uncertainty bounds, critical for decision-making under uncertainty (Jeon et al., 26 Mar 2025, Varagnolo et al., 25 Nov 2025).
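A minimal sketch of ensemble-based uncertainty bands; the "ensemble members" here are synthetic stand-ins whose spread widens outside the training range, which is the behavior a trained ensemble or MC-dropout surrogate is expected to exhibit:

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in "ensemble": M surrogates that agree in-distribution and
# disagree where training data was absent (here: perturbed sine models).
M, n = 20, 100
x = np.linspace(0, 2, n)                # trained on [0, 1]; (1, 2] is OOD
preds = np.stack([
    np.sin(np.pi * x)
    + rng.normal(0, 0.02 + 0.3 * np.clip(x - 1, 0, None), n)
    for _ in range(M)
])

mean = preds.mean(axis=0)
std = preds.std(axis=0)
lower, upper = mean - 2 * std, mean + 2 * std   # ~95% interval

# Interval width grows in the extrapolation region, flagging low confidence.
print(std[x < 1].mean(), std[x > 1].mean())
```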
6. Generalization, Robustness, and Limitations
Deep operator surrogates demonstrate strong generalization within the sampled data-manifold and, with informed architecture/training, credible extrapolation to unseen parameter regimes (Varagnolo et al., 25 Nov 2025, Qiu et al., 2024). Notable findings include:
- Incorporation of derivative losses (in input or output directions) and dimension reduction boosts data efficiency by up to an order of magnitude and reduces the error on functional sensitivities (Qiu et al., 2024).
- Embedding low-fidelity physics enhances learning in underdetermined, small-data regimes and improves out-of-distribution robustness (Varagnolo et al., 25 Nov 2025, Pestourie et al., 2021).
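A derivative-augmented (Sobolev-type) loss of the kind referenced above can be sketched with finite differences; the weighting `alpha` and the toy fields are illustrative:

```python
import numpy as np

def sobolev_loss(pred, target, h, alpha=0.5):
    """MSE on field values plus an MSE on finite-difference derivatives.
    Matching output sensitivities typically improves data efficiency."""
    value_term = np.mean((pred - target) ** 2)
    dp = np.gradient(pred, h, axis=-1)
    dt = np.gradient(target, h, axis=-1)
    deriv_term = np.mean((dp - dt) ** 2)
    return value_term + alpha * deriv_term

x = np.linspace(0, 1, 50)
h = x[1] - x[0]
target = np.sin(2 * np.pi * x)
smooth = 0.98 * target                            # right shape and slope
wiggly = target + 0.02 * np.sin(40 * np.pi * x)   # similar MSE, wrong slope

# The derivative term penalizes the high-frequency error far more heavily:
print(sobolev_loss(smooth, target, h), sobolev_loss(wiggly, target, h))
```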
Nevertheless, limitations remain:
- Out-of-distribution generalization is contingent on sufficient architectural expressiveness (e.g. geometry encoding, branch/trunk richness) and, in cases such as the ballistic-to-diffusive transition, physically meaningful mixing strategies.
- Training cost (offline phase) remains significant, particularly for high-dimensional stochastic PDEs, although active learning and modularization mitigate this.
- Extensions to evolving, multi-physics, or time-dependent domain topologies require further algorithmic development (Santos et al., 4 Nov 2025).
7. Outlook and Frontiers
Current research is advancing in several directions:
- Scalable multi-input/multi-output networks for multi-physics applications and inclusion of mechanical/material parameters as auxiliary operator inputs (Santos et al., 4 Nov 2025, Park et al., 15 Sep 2025).
- Physics-informed and energy-based surrogates for high-order, nonlinear, or variational PDEs where training data is scarce or costly (Fuhg et al., 2022).
- Multi-fidelity surrogates and transfer learning, leveraging hierarchies of models from low- to high-accuracy for rapid adaptation to new regimes (Varagnolo et al., 25 Nov 2025).
- Integration with optimal experimental design, uncertainty quantification, and closed-loop control workflows in real-time digital twins.
- Theoretical analysis of surrogate expressibility in function space and strategies for adaptive surrogate refinement (Herrmann et al., 2022).
Deep operator surrogates thus provide a mathematically principled and computationally effective route to operator learning, establishing a new foundation for scalable, physically consistent simulation and design across scientific domains. Their ability to directly approximate solution operators, rather than pointwise maps, will remain pivotal as simulation-driven discovery moves toward real-time and many-query paradigms.