Hybrid FNO-DeepONet: Neural Operator Models
- Hybrid FNO-DeepONet is a neural operator architecture that combines FNO for capturing global spatial correlations with DeepONet’s branch/trunk framework for learning parametric PDEs.
- It embeds spectral convolution layers within the branch or post-merge networks to reduce parameter count and memory usage while improving predictive accuracy in complex multiphysics problems.
- The approach supports applications such as 3D carbon sequestration and porous media flow, demonstrating robust extrapolation capabilities and computational efficiency compared to traditional FNOs.
Hybrid FNO-DeepONet models are neural operator architectures designed to learn solution operators for parametric partial differential equations (PDEs), with a particular focus on systems coupling complex spatial fields and temporal evolution. These models combine two leading neural operator approaches: the Fourier Neural Operator (FNO) for efficient global spatial correlation modeling, and Deep Operator Networks (DeepONet) for general operator learning via decoupled branch and trunk networks. By integrating FNO blocks—typically in the branch, trunk, or post-merge layers—Hybrid FNO-DeepONet models achieve reduced parameter count, lowered memory usage, and strong predictive accuracy for challenging multiphysics problems.
1. Architectural Fundamentals
Hybrid FNO-DeepONet architectures are constructed upon the DeepONet framework, wherein an unknown operator is approximated by two neural networks:
- Branch network: Encodes high-dimensional (often infinite-dimensional) input functions $u$, such as permeability maps or other physical fields.
- Trunk network: Encodes the coordinates or query points $y$ at which the solution is evaluated (e.g., spatial location or time).
The classical DeepONet computes the output via an inner product: $G(u)(y) \approx \sum_{k=1}^{p} b_k(u)\, t_k(y)$, where $b_k$ and $t_k$ are the branch and trunk network outputs, respectively.
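As a concrete reference, the branch/trunk inner product above can be sketched with small randomly initialized MLPs. This is a minimal NumPy illustration; the layer sizes, sensor count, and initialization are arbitrary choices, not the published configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random MLP parameters for the given layer sizes (illustrative init)."""
    weights = [rng.standard_normal((m, n)) / np.sqrt(m)
               for m, n in zip(sizes[:-1], sizes[1:])]
    biases = [np.zeros(n) for n in sizes[1:]]
    return weights, biases

def mlp(x, weights, biases):
    """MLP with tanh hidden activations and a linear output layer."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = np.tanh(x @ W + b)
    return x @ weights[-1] + biases[-1]

p = 16                            # number of shared basis functions
m = 32                            # sensor points sampling the input function u
branch = init_mlp([m, 64, p])     # u at m sensors -> coefficients b_k(u)
trunk = init_mlp([1, 64, p])      # query coordinate y -> basis values t_k(y)

u = rng.standard_normal(m)                # an input function sampled at sensors
y = np.linspace(0.0, 1.0, 50)[:, None]    # 50 query points

b = mlp(u, *branch)               # shape (p,)
t = mlp(y, *trunk)                # shape (50, p)
G_u_y = t @ b                     # G(u)(y) ~ sum_k b_k(u) t_k(y), shape (50,)
```

In the hybrid variants discussed below, the branch MLP is replaced by FNO layers while the inner-product merge stays the same.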
Hybridization is achieved by embedding FNO convolutional layers into one or both of these networks. For example, the branch may be replaced with several FNO layers to capture nonlocal spatial correlations, while the trunk retains a multi-layer perceptron (MLP) or a Kolmogorov–Arnold Network (KAN) to model time. Alternative variants augment the post-merge stage with FNO spectral convolutions (as in the nested Fourier-DeepONet) (Lee et al., 2024, Santos et al., 4 Nov 2025).
2. Mathematical Formulation and Network Topologies
Different variants employ the FNO at distinct stages:
- FNO-embedded branch: The branch processes spatial fields through a lifting layer $P$, several FNO layers (each combining a pointwise linear map with a convolution in Fourier space, $v_{l+1} = \sigma\left(W v_l + \mathcal{F}^{-1}(R_l \cdot \mathcal{F} v_l)\right)$), and a final projection $Q$ to produce the branch coefficients $b_k(u)$.
- Trunk network: Encodes time or query coordinates $t$ and outputs basis functions $t_k(t)$ via either an MLP or a KAN with learnable spline activations.
- Post-merge spectral convolution: In nested Fourier-DeepONet, after merging the branch and trunk outputs via an element-wise product $b \odot t$, spectral FNO layers act on the merged features, with FFTs restricted to the three spatial coordinates only; this maintains temporal flexibility and reduces computational burden.
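A spatially restricted spectral convolution of this kind can be sketched in NumPy as follows. The shapes, mode count, and the toy channel-mixing weight tensor are illustrative assumptions (a real FNO layer also handles the symmetric negative-frequency modes, typically via a real FFT); the point is that the FFT runs over the three spatial axes while the time axis passes through untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_conv_spatial(v, R, modes):
    """One simplified spectral convolution over the spatial axes only.

    v : merged features, shape (nx, ny, nz, nt, c) -- c channels on a 3D
        spatial grid with nt time steps; the FFT is 3D, over (nx, ny, nz).
    R : complex channel-mixing weights, shape (modes, modes, modes, c, c).
    modes : number of retained low Fourier modes per spatial dimension.
    """
    v_hat = np.fft.fftn(v, axes=(0, 1, 2))        # 3D FFT over space only
    out_hat = np.zeros_like(v_hat)
    # Mix channels on the lowest `modes` frequencies; higher modes are dropped.
    low = v_hat[:modes, :modes, :modes]           # (modes, modes, modes, nt, c)
    out_hat[:modes, :modes, :modes] = np.einsum("xyztc,xyzcd->xyztd", low, R)
    return np.fft.ifftn(out_hat, axes=(0, 1, 2)).real

nx = ny = nz = 8; nt = 4; c = 3; modes = 4
v = rng.standard_normal((nx, ny, nz, nt, c))
R = (rng.standard_normal((modes, modes, modes, c, c))
     + 1j * rng.standard_normal((modes, modes, modes, c, c)))
out = spectral_conv_spatial(v, R, modes)          # same grid shape as input
```

Because the transform never touches the time axis, the trunk remains free to query arbitrary (including unseen) times, which is the source of the temporal flexibility noted above.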
The architectures enable hierarchical (nested) refinement by feeding the outputs of a coarser network into the next-finer scale operator (Lee et al., 2024).
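The nesting idea can be illustrated schematically. The operators below are stand-ins for trained surrogates, and nearest-neighbor upsampling is one plausible way to pass a coarse prediction to the finer level; all function names and the 2D setting are hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

def upsample_nearest(field, factor):
    """Nearest-neighbor upsampling of a 2D field by an integer factor."""
    return np.kron(field, np.ones((factor, factor)))

def coarse_operator(k_coarse):
    """Stand-in for the trained coarse-level surrogate (hypothetical)."""
    return np.tanh(k_coarse)

def fine_operator(k_fine, coarse_guess):
    """Stand-in for the finer-level surrogate: it receives the fine-grid
    input plus the upsampled coarse prediction as an extra channel."""
    stacked = np.stack([k_fine, coarse_guess], axis=-1)
    return stacked.mean(axis=-1)          # placeholder for the learned map

k_fine = rng.standard_normal((16, 16))    # fine-grid input field
k_coarse = k_fine[::2, ::2]               # restriction to the coarse grid
p_coarse = coarse_operator(k_coarse)      # coarse prediction, shape (8, 8)
p_fine = fine_operator(k_fine, upsample_nearest(p_coarse, 2))  # (16, 16)
```

Each level thus refines the previous one's output rather than solving from scratch, which is why the nested design trains one model per grid level.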
3. Training Protocols and Hyperparameter Selection
Typical training procedures use mean squared error (MSE) or relative loss over the spatial-temporal grid for supervised surrogate modeling of physics-based simulation data.
- Loss function: For pressure, the nested Fourier-DeepONet employs a relative $L^2$-type error over the spatio-temporal grid; similar formulations are used for other quantities (e.g., saturation), with indicator functions restricting the loss to relevant regions (Lee et al., 2024).
- Datasets: Models are trained on large physics-driven datasets, such as 3,009 full-physics ECLIPSE-300 simulations for 3D CO₂ sequestration (Lee et al., 2024), or black-oil reservoir simulations for 2D/3D SPE10 benchmarks (Santos et al., 4 Nov 2025). The input fields typically include spatially heterogeneous characteristics (permeability, porosity, temperature) and parametric details (injection rates, well locations).
- Optimization: Adam or AdamW optimizers with cosine decay or exponential learning-rate schedules are standard. Batch sizes range from single field samples (spatial batch) to small time-step minibatches (temporal batch).
- Hyperparameters: The nested Fourier-DeepONet fixes the channel width and number of Fourier layers per grid level and typically retains 12 Fourier modes per spatial dimension. The largest FNO-DeepONet models reach up to 20 million parameters, compared with roughly 80 million for pure FNOs.
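For reference, the relative-error training objective and a cosine learning-rate schedule can be sketched as follows; the numeric values are illustrative, not the published hyperparameters.

```python
import numpy as np

def relative_l2(pred, true, eps=1e-8):
    """Relative L2 error over the full spatio-temporal grid."""
    return np.linalg.norm(pred - true) / (np.linalg.norm(true) + eps)

def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Cosine-decay learning-rate schedule from lr_max down to lr_min."""
    frac = min(step / total_steps, 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + np.cos(np.pi * frac))

true = np.ones((8, 8, 8, 4))        # toy target field on an (x, y, z, t) grid
pred = true + 0.01                  # a prediction with 1% uniform error
err = relative_l2(pred, true)       # ~0.01, i.e., 1% relative error
lr_start = cosine_lr(0, 1000)       # lr_max at the start of training
lr_end = cosine_lr(1000, 1000)      # decays to lr_min at the end
```

Saturation losses of the kind described above would additionally multiply the residual by an indicator mask before taking the norm, restricting the error to the relevant region.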
4. Computational Performance and Predictive Accuracy
Hybrid FNO-DeepONet architectures demonstrate substantial improvements in memory footprint, training time, and extrapolation accuracy relative to monolithic FNO models.
- Memory and parameter efficiency: Nested Fourier-DeepONet trained for 3D carbon sequestration achieves 80–85% reductions in memory and parameters, and a 60% decrease in training time, compared to pure FNOs on the same NVIDIA A100 GPU (Lee et al., 2024). For instance, global-level models require 13.1 million parameters (vs. 80.3 million for FNO) and 4.9 GiB GPU memory (vs. 33.1 GiB).
- Predictive accuracy: On CO₂ pressure prediction, nested Fourier-DeepONet attains 0.57% relative error after fine-tuning, outperforming FNO (0.63%). For saturation, errors are 1.46% (vs. 1.62% for FNO). In multiphase reservoir modeling, FNO-DeepONet achieves few-percent errors across pressure, saturation, and production rates; e.g., oil saturation errors of 6.75–21.4%, with low field-level WOPR/WWCT errors for the largest networks (Santos et al., 4 Nov 2025).
- Extrapolation: Hybrid models exhibit robust extrapolation capability in time, well counts, permeability magnitude, and injection rates. For temporal extrapolation to 30 years, nested Fourier-DeepONet error rises by less than 1% for pressure (vs. 7% for FNO), and around 10% for saturation, credited to continuous time modeling in the trunk (Lee et al., 2024).
5. Advantages and Limitations
Hybrid FNO-DeepONet models present several strengths and weaknesses:
Advantages
- Significant reduction in parameter count and GPU memory by restricting FFTs to spatial dimensions and decoupling time via trunk networks.
- Efficient learning of global spatial correlations via FNO; continuous and super-resolved temporal query capability via MLP/KAN trunks.
- Automatic conservation of key physical constraints.
- Robustness under nested refinement, enabling hierarchical mesh resolution.
- Orders-of-magnitude faster inference than conventional numerical simulators, on the order of seconds per sample on high-resolution grids (Santos et al., 4 Nov 2025).
Limitations
- FNO layers typically assume rectangular/periodic domains; handling irregular boundaries may require domain adaptation or augmentation.
- Trunk network's expressivity limits long-horizon temporal extrapolation; may necessitate autoregressive or state-space enhancements for extended dynamics.
- Data-driven surrogates require representative training across all operational regimes; the nested design requires individually trained models per grid level.
- Incorporating physics-informed losses (PINN-style) and direct enforcement of boundary/interface conditions may further enhance reliability, especially near shocks or discontinuities.
6. Applications and Theoretical Significance
Hybrid FNO-DeepONet models have been validated in computationally intensive settings including:
- 3D geological carbon sequestration, predicting CO₂ migration and pressure distribution under large-scale multiphase flow conditions (Lee et al., 2024).
- Multiphase flow in porous media, including 2D/3D SPE10 black-oil models, steady-state Darcy flow, and associated production metrics (Santos et al., 4 Nov 2025).
- Stochastic nonlinear structural system response to natural hazards such as earthquakes and wind, with FNO-DeepONet (DeepFNOnet) used to learn discrepancies between ground truth and baseline DeepONet predictions (Goswami et al., 16 Feb 2025).
Performance benchmarks demonstrate the hybrid operators’ generalization across training ranges and strong extrapolation for key physical variables. The framework’s flexibility in decoupling spatial and temporal learning tasks supports efficient surrogate modeling for parametric, high-dimensional PDEs arising in Earth science, energy engineering, and structural dynamics.
The central theoretical insight is that operator learning via branch/trunk separation and FNO spectral modeling reduces the dimensionality bottleneck faced by conventional neural operators, while retaining sufficient expressivity for real-world scale settings. Hybrid FNO-DeepONet architectures represent a modular, memory-efficient, and generalizable class of neural surrogates for nonlinear operator mapping tasks.