
Hybrid FNO-DeepONet Models

Updated 25 February 2026
  • Hybrid FNO-DeepONet models are neural operator architectures that combine FNOs and DeepONets to map high-dimensional functions, particularly in PDE settings.
  • They leverage a modular branch/trunk design in which an FNO performs spatial encoding and a DeepONet trunk (or an alternative such as a KAN) handles parametric learning, enabling significant reductions in parameters, memory use, and training time.
  • The approach achieves superior generalization and extrapolation in applications like geological carbon sequestration and multiphase flow, offering scalable surrogate modeling for complex physics.

Hybrid FNO-DeepONet models represent a class of neural operator architectures that combine the structural and algorithmic strengths of Fourier Neural Operators (FNOs) and Deep Operator Networks (DeepONets) within a unified framework. These hybrid models are designed for learning mappings between function spaces—particularly the parameter-to-solution maps arising in high-dimensional, nonlinear, and parametric partial differential equations (PDEs)—while addressing the scalability and generalization limitations that often arise when using FNO or DeepONet in isolation. Their foundational innovation lies in decoupling spatial and temporal (or more generally, parametric) learning tasks, enabling efficient surrogate modeling of complex, multiscale, and multiphysics systems.

1. Mathematical and Architectural Foundations

Hybrid FNO-DeepONet architectures instantiate a modular approach in which operator learning is partitioned into branch and trunk subnetworks, corresponding to spatial and temporal/parametric representations, respectively. The canonical DeepONet operator approximation takes the form
$$\mathcal{G}(v)(\xi) = \sum_{k=1}^{r} b_k(v)\, t_k(\xi)$$
where the branch network $b(v)\in\mathbb{R}^r$ encodes input functions or fields (e.g., permeability, initial conditions) and the trunk network $t(\xi)\in\mathbb{R}^r$ encodes the coordinates or auxiliary input (e.g., time, query locations).
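The branch–trunk combination above can be sketched in a few lines of NumPy. This is a minimal illustration with random (untrained) weights standing in for learned subnetworks; the layer sizes, sensor count, and two-layer MLP structure are illustrative assumptions, not a specific published configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 32          # latent dimension shared by branch and trunk
m = 64          # number of sensor points sampling the input function v
n_query = 100   # number of query coordinates xi

def mlp(x, W1, b1, W2, b2):
    """Two-layer MLP with tanh, standing in for branch/trunk subnets."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

# Random (untrained) parameters -- illustrative only.
Wb1, bb1 = rng.normal(size=(m, 64)), np.zeros(64)
Wb2, bb2 = rng.normal(size=(64, r)), np.zeros(r)
Wt1, bt1 = rng.normal(size=(1, 64)), np.zeros(64)
Wt2, bt2 = rng.normal(size=(64, r)), np.zeros(r)

v = rng.normal(size=(m,))                  # sampled input function
xi = np.linspace(0, 1, n_query)[:, None]   # query coordinates

b = mlp(v, Wb1, bb1, Wb2, bb2)             # branch output: (r,)
t = mlp(xi, Wt1, bt1, Wt2, bt2)            # trunk output:  (n_query, r)

# G(v)(xi) = sum_k b_k(v) * t_k(xi), i.e., a dot product over the latent index
G = t @ b                                  # (n_query,)
print(G.shape)  # (100,)
```

The key structural point is that the input function and the query coordinate never meet inside a network; they interact only through the final contraction over the shared latent dimension $r$.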

In a hybrid scheme, the branch network is augmented or replaced by a Fourier Neural Operator, which performs spectral convolutions in the spatial domain:
$$z_{n+1}(x) = \sigma\left(\mathcal{F}^{-1}\bigl(R \cdot \mathcal{F}(z_n)\bigr)(x) + W\,z_n(x)\right)$$
where $R$ comprises learnable weights acting on the dominant Fourier modes, $W$ is a channel-mixing matrix, and $\sigma$ is a nonlinearity (typically GELU or Tanh). The trunk network may retain the MLP structure or be further replaced with alternatives such as Kolmogorov–Arnold Networks (KANs), which use univariate spline expansions.
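One Fourier layer of this form can be sketched with NumPy's FFT. This is a 1D toy with random weights; the grid size, channel count, and retained-mode count are arbitrary choices for illustration, and ReLU is substituted for GELU purely to keep the sketch dependency-free.

```python
import numpy as np

rng = np.random.default_rng(1)
n, c, modes = 128, 8, 20   # grid points, channels, retained Fourier modes

def fno_layer(z, R, W):
    """One Fourier layer: spectral conv on low modes + pointwise channel mix.

    z: (n, c) real field; R: (modes, c, c) complex; W: (c, c) real.
    """
    z_hat = np.fft.rfft(z, axis=0)                 # (n//2+1, c), complex
    out_hat = np.zeros_like(z_hat)
    # Multiply only the retained low modes by the learnable complex weights R
    out_hat[:modes] = np.einsum('kij,ki->kj', R, z_hat[:modes])
    spectral = np.fft.irfft(out_hat, n=n, axis=0)  # back to physical space
    # Residual channel-mixing path W z_n(x), then nonlinearity
    return np.maximum(spectral + z @ W, 0)         # ReLU in place of GELU

R = rng.normal(size=(modes, c, c)) + 1j * rng.normal(size=(modes, c, c))
W = rng.normal(size=(c, c))
z = rng.normal(size=(n, c))
z_next = fno_layer(z, R, W)
print(z_next.shape)  # (128, 8)
```

Truncating `out_hat` to the first `modes` frequencies is what makes the layer resolution-independent in principle: the same $R$ applies regardless of how finely the field is discretized.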

The merge step is realized by the elementwise Hadamard product (pointwise multiplication) of the spatially encoded output $b(v)$ and the parametric representation $t(\xi)$, followed by suitable reshaping to reconstruct the predicted solution on the spatiotemporal grid:
$$\widehat{u}(\mathbf{x}, t) = \bigl[b(v) \odot t(\xi)\bigr]_{(\mathbf{x}, t)}$$
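A minimal sketch of this merge, assuming the common realization in which the Hadamard product is followed by contraction over the latent index (the DeepONet inner product) to lay the prediction out on the space–time grid; the exact reshape convention varies by implementation.

```python
import numpy as np

rng = np.random.default_rng(2)
r = 16            # latent width shared by branch and trunk
nx, nt = 32, 10   # spatial grid points and time steps

b = rng.normal(size=(nx, r))   # branch: one latent vector per spatial point
t = rng.normal(size=(nt, r))   # trunk:  one latent vector per time value

# Hadamard product broadcast over (x, t), then contracted over the latent
# index r -- equivalent to np.einsum('xr,tr->xt', b, t)
u_hat = (b[:, None, :] * t[None, :, :]).sum(axis=-1)   # (nx, nt)
print(u_hat.shape)  # (32, 10)
```

Because the spatial and temporal factors are computed independently, evaluating the surrogate at extra time steps only requires new trunk evaluations; the expensive spatial encoding is reused.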

This modular design admits “drop-in” substitution of FNO, MLP, or KAN architectures within both branch and trunk networks (Santos et al., 4 Nov 2025).

2. Computational Efficiency and Memory Scaling

Empirical benchmarks in large-scale PDE surrogate tasks, including three-dimensional geological carbon sequestration and multiphase porous-media flow, demonstrate that hybrid FNO-DeepONet models drastically reduce trainable parameter counts, memory requirements, and wall-clock training times compared to pure FNO approaches. For example, in nested multi-resolution reservoir modeling, the Fourier-DeepONet achieves:

Model/Level                  Params (M)   GPU Mem (GiB)   Wall Time (h)
FNO (global)                 80.3         33.1            37.6
Fourier-DeepONet (global)    13.1         4.9             14.9
FNO (local)                  150.5        18.5–25.4       41.7–61.3
Fourier-DeepONet (local)     20.8         3.3–5.0         20.5–28.3

Fourier-DeepONet consistently reduces resource usage by at least 80% for equivalent or superior predictive performance (Lee et al., 2024). In 3D porous media, only the hybrid branch/trunk approach enables scaling to tens of millions of parameters on a single high-memory GPU, as in oil–water flow models where naïve MLP/KAN branches fail due to memory bottlenecks (Santos et al., 4 Nov 2025).

3. Generalization and Extrapolation Properties

A defining advantage of the hybrid FNO-DeepONet architecture is its capability for interpolation and extrapolation with respect to time, physical parameters, and system configurations. Pairing the FNO branch with DeepONet trunk subnets, which naturally encode continuous, smooth dependence on parametric or temporal coordinates, yields:

  • Significantly lower out-of-sample errors for extrapolation in time (e.g., >50% lower error than nested FNO when predicting pressure and gas saturation in geological carbon sequestration (GCS) beyond the training time horizon).
  • Stable performance when extrapolating to unseen values in the space of permeability, injection rates, or number of wells; e.g., median reservoir pressure error remains <2% when trained on 1–3 wells and validated on 4-well scenarios.
  • Attenuated error increases in extrapolation regimes, attributed to the trunk network’s explicit, low-dimensional time/parameter encoding—contrasting FNO’s independent treatment of time slices as separate input channels (Lee et al., 2024).

A plausible implication is that the branch-trunk decoupled architecture is inherently more robust to distributional shift in the parametric subspace compared to pure FNO or DeepONet.

4. Applications in High-Dimensional and Multiphysics Systems

Hybrid FNO-DeepONet surrogates demonstrate accelerated and accurate PDE solution emulation in applications including:

  • Geological carbon sequestration: Prediction of pressure buildup and CO$_2$ saturation fields in large-scale 3D reservoirs, with local grid refinement and nested multi-resolution modeling around injection wells (Lee et al., 2024).
  • Multiphase and compositional flow in porous media: Surrogates for the 10th Comparative Solution Project (SPE10), yielding orders-of-magnitude speedups in spatiotemporal field and scalar (e.g., oil/water production) predictions. Inference runs in seconds compared to hours for high-fidelity finite-volume solvers, enabling efficient history matching, uncertainty quantification (UQ), and optimization (Santos et al., 4 Nov 2025).
  • Nonlinear structural response under stochastic hazards: The DeepFNOnet architecture employs a standard DeepONet for the initial solution and a correction FNO to model the solution–prediction discrepancy, enabling fast, accurate surrogate response prediction under earthquake and wind loading (Goswami et al., 16 Feb 2025).

These applications leverage the modularity of spatial and parametric learning, as well as the ability to integrate physics-informed constraints (e.g., PDE residuals, boundary conditions) in the loss functional for further improvement in data-sparse regimes (Lee et al., 2024).

5. Implementation and Training Protocols

Key elements and recommended practices for practical deployment include:

  • Branch/trunk subnet design: Employ a shallow MLP or FNO in the branch for spatial encoding and a compact MLP or KAN in the trunk for time/parameter inputs. Typical architectures use two 32-unit linear layers in both subnets, with GELU or SiLU nonlinearities; the FNO branch uses four spectral layers with 20–30 retained modes per dimension (Lee et al., 2024, Santos et al., 4 Nov 2025).
  • Multi-resolution/nested hierarchies: Construct a hierarchy of hybrid surrogates operating over global-to-local domains, passing predicted fields as extra inputs to finer levels. This design mirrors adaptive mesh refinement in physics solvers and mitigates error accumulation through residual correction and noise perturbation during training (Lee et al., 2024).
  • Loss and optimization: Use relative $L^2$ errors per domain, Adam or AdamW optimizers with scheduled learning-rate decay, and cross-validation to tune activation and normalization choices. No batch/layer normalization is required.
  • Data and input preparation: Normalize input fields and target solutions; partition parameter space into interpolation/extrapolation bins for robust evaluation; train using datasets covering realistic variability in all critical physical parameters.
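The relative $L^2$ loss and z-score normalization mentioned above can be sketched as follows. This is a generic NumPy illustration, not the exact loss code of the cited studies; the `eps` stabilizer and batch-mean reduction are common-practice assumptions.

```python
import numpy as np

def relative_l2(pred, target, eps=1e-8):
    """Relative L2 error per sample, averaged over the batch.

    pred, target: (batch, ...) arrays on the same grid.
    """
    batch = pred.shape[0]
    diff = (pred - target).reshape(batch, -1)
    ref = target.reshape(batch, -1)
    # ||pred - target||_2 / ||target||_2, per sample, then batch mean
    return np.mean(np.linalg.norm(diff, axis=1) /
                   (np.linalg.norm(ref, axis=1) + eps))

def zscore(x, mean, std, eps=1e-8):
    """Normalize inputs/targets with statistics from the training set."""
    return (x - mean) / (std + eps)

rng = np.random.default_rng(3)
target = rng.normal(size=(4, 32, 32))             # 4 sample fields
pred = target + 0.01 * rng.normal(size=target.shape)
err = relative_l2(pred, target)
print(err < 0.05)  # small perturbation -> small relative error
```

Normalizing by the per-sample target norm keeps fields with very different magnitudes (e.g., pressure vs. saturation) on comparable footing in a multi-output loss.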

6. Performance Evaluation and Comparative Metrics

Benchmark results in published studies confirm:

  • Hybrid FNO-DeepONet models consistently outperform pure MLP, KAN, or DeepONet surrogates in relative $L_2$ and $L_\infty$ error, especially for high-dimensional and strongly coupled problems.
  • For 2D Darcy flow, hybrid FNO+MLP and FNO+KAN errors are $\|p\|_{\text{rel}} \approx 0.048$–$0.050$, well below MLP ($0.0635$) or KAN ($0.0545$) (Santos et al., 4 Nov 2025).
  • In 3D reservoir models, hybrids achieve relative errors down to $6.7\%$ for main field variables at a parameter budget one or two orders of magnitude lower than pure FNOs, and achieve wall times of $<30$ hours for problems inaccessible to MLP/KAN-based surrogates due to memory limits.

A plausible implication is that the hybrid approach extends the practical reach of neural operator surrogates to domains requiring both rapid many-query evaluation and scalability to industrially relevant spatial-temporal resolutions.

7. Extensions and Generalization to Other Physics

The hybrid FNO-DeepONet approach generalizes to problems characterized by local mesh refinement (e.g., regions near singularities or solid–fluid interfaces), vector/tensor-valued outputs (via modified output projections), and additional parameter sets beyond time (by extending the trunk subnet's input space). Incorporating physics-constrained loss formulations further enhances extrapolation and data efficiency. The trunk network can naturally be extended to accept physical scalars such as temperature gradients or rock compressibility, accommodating domain shifts not encountered during training (Lee et al., 2024).

These features identify hybrid FNO-DeepONet frameworks as a foundational tool for accelerated, high-fidelity surrogate modeling in computational physics, geosciences, engineering, and related disciplines.
