Operator Neural Networks Overview

Updated 17 December 2025
  • Operator neural networks are deep models that approximate mappings between families of functions, crucial for solving PDEs and dynamical systems.
  • They utilize branch and trunk subnetworks to learn nonlinear functionals and basis functions, achieving universal approximation with parameter efficiency.
  • Applications include fluid dynamics, climate modeling, and inverse problems, providing robust surrogate solutions across varying conditions.

Operator neural networks are a class of deep learning models designed to approximate nonlinear operators, which are mappings between infinite-dimensional function spaces such as those arising in partial differential equations, dynamical systems, and scientific computing. Unlike classical neural networks that model finite-dimensional functions, operator neural networks can learn the relationship between entire families of input functions and their associated outputs, enabling efficient surrogate modeling, generalization over solution families, and new approaches to high-dimensional scientific problems.

1. Universal Approximation Principles and Operator Network Formulations

Operator neural networks generalize the universal approximation concept to operator mappings, as formalized by the Chen–Chen theorem: any continuous operator $G : \mathcal{U} \to \mathcal{V}$ between Banach spaces $\mathcal{U}$ and $\mathcal{V}$ can be approximated uniformly on compacts by neural operator architectures using finite sensor evaluations and nonlinear functionals (Goswami et al., 2022, Yu et al., 2021). The canonical operator neural network structure (e.g., DeepONet) takes as input the discretized values of a function at $m$ sensor points, processes these with a "branch" subnetwork, and combines the result with evaluations from a "trunk" subnetwork that maps output coordinates (space, time, parameters) to basis functions. The generic network output for operator $G$ is then

$$\widehat{G}(u)(y) = \sum_{k=1}^{p} b_k\big(u(x_1), \ldots, u(x_m)\big)\, t_k(y)$$

where $b_k$ are neural representations of nonlinear functionals, $t_k$ are the output basis functions generated by a neural network, and $y$ is any point in the output domain.
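
A minimal PyTorch sketch of this branch-trunk construction is given below; the layer widths, activations, and the number of basis functions $p$ are illustrative assumptions, not a reference implementation.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Minimal branch-trunk operator network: G(u)(y) ~ sum_k b_k(u) t_k(y)."""
    def __init__(self, m_sensors: int, y_dim: int, p: int = 64, width: int = 128):
        super().__init__()
        # Branch net: nonlinear functionals of the sensor values u(x_1), ..., u(x_m)
        self.branch = nn.Sequential(
            nn.Linear(m_sensors, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, p),
        )
        # Trunk net: basis functions t_k evaluated at output coordinates y
        self.trunk = nn.Sequential(
            nn.Linear(y_dim, width), nn.Tanh(),
            nn.Linear(width, width), nn.Tanh(),
            nn.Linear(width, p),
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u_sensors: (batch, m_sensors), y: (batch, n_query, y_dim)
        b = self.branch(u_sensors)          # (batch, p)
        t = self.trunk(y)                   # (batch, n_query, p)
        # Inner product over the p basis coefficients
        return torch.einsum("bp,bqp->bq", b, t) + self.bias

# Example: 100 sensor points, 1-D output coordinate, 32 query points per sample
model = DeepONet(m_sensors=100, y_dim=1)
u = torch.randn(8, 100)
y = torch.rand(8, 32, 1)
out = model(u, y)                           # (8, 32)
```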

Universal approximation results for operator neural networks show that arbitrary-depth, bounded-width architectures are sufficient to approximate any continuous nonlinear operator, with explicit construction for nonpolynomial and even polynomial activation functions (Yu et al., 2021). This depth–width trade-off further suggests the parameter efficiency of deep, narrow operator neural networks in scientific machine learning.

2. Architectures and Variants: DeepONet, FNO, GNO, RBON, ORNN

The leading operator neural network architectures are:

  • Deep Operator Network (DeepONet): Encodes the input function at fixed sensors through a branch net and outputs basis functions via a trunk net; these are multiplied to produce the solution field at arbitrary output points. DeepONet strictly realizes the nonlinear universal approximation theorem for operators and generalizes to arbitrary domains and parametric families (Goswami et al., 2022, Zhang et al., 2023, Yu et al., 2021).
  • Fourier Neural Operator (FNO): Employs a spectral convolution in Fourier space, learning a parametric kernel for the operator mapping; this provides efficient resolution independence for problems on structured grids (Goswami et al., 2022). A spectral-convolution sketch is given after this list.
  • Graph Neural Operator (GNO): Implements message-passing and kernel-based nonlocal interactions on unstructured meshes and graphs, generalizing operator learning to arbitrary geometries (Goswami et al., 2022).
  • Radial Basis Operator Network (RBON): Uses a single radial-basis hidden layer with branch and trunk subnets built on Gaussian kernels, producing a Kronecker product of features (Kurz et al., 6 Oct 2024). RBON is analytically solvable by direct linear least-squares and matches or improves upon DeepONet and FNO in both in-distribution and out-of-distribution generalization.
  • Operator Recurrent Neural Network (ORNN): Recursively applies linear operators multiplicatively within hidden state updates, promoting low-rank structure and analytic regularization for inverse problems (Hoop et al., 2019).

Recent extensions include dual-path operator blocks (DPNO) that parallelize the flow of features along both additive (ResNet) and concatenative (DenseNet) pathways, boosting expressivity for multi-scale problems and providing parameter-efficient compositions (Wang et al., 17 Jul 2025).

3. Training Methodologies, Loss Strategies, and Derivative Information

Training operator neural networks typically involves minimizing data-driven losses between predicted outputs and ground-truth samples across families of input functions. For parametric PDE surrogates, this requires sampling initial condition (or coefficient) functions, discretizing them at sensors, and evaluating solutions at query points. Derivative-informed approaches—such as DINO (O'Leary-Roseberry et al., 2022) and DE-DeepONet (Qiu et al., 29 Feb 2024)—incorporate Jacobian or Fréchet derivative losses into the objective, enabling higher accuracy in both value and sensitivity predictions and crucially supporting applications in optimization, inverse problems, and optimal experimental design.
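
A hedged sketch of such an objective, combining an output loss with a directional-derivative (Jacobian-vector product) matching term; the weighting and the jvp-based formulation below are illustrative, not the specific DINO or DE-DeepONet loss.

```python
import torch

def derivative_informed_loss(model, u, y, s_true, dsdu_true_dir, v, w_jac=1.0):
    """Data loss on outputs plus a directional-derivative matching loss.

    u:             (batch, m) sensor values of the input function
    y:             (batch, q, d) output query points
    s_true:        (batch, q) reference solution values
    dsdu_true_dir: (batch, q) reference directional derivative dG/du applied to v
    v:             (batch, m) direction in input space along which derivatives are matched
    """
    # Standard data-driven output loss
    s_pred = model(u, y)
    loss_data = torch.mean((s_pred - s_true) ** 2)

    # Directional derivative of the surrogate via a Jacobian-vector product
    _, jvp_pred = torch.autograd.functional.jvp(
        lambda uu: model(uu, y), (u,), (v,), create_graph=True
    )
    loss_jac = torch.mean((jvp_pred - dsdu_true_dir) ** 2)

    return loss_data + w_jac * loss_jac
```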

Dimension reduction methods such as active subspaces, principal component analysis, and Karhunen–Loève expansions are used to encode high-dimensional input functions in a computationally tractable subspace, reducing the number of required samples and enabling efficient derivative computation. Matrix-free methods, randomized SVD, and compressed Jacobian representations allow scalable training even with millions of input/output degrees of freedom.
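
As a concrete illustration of the reduced input encoding, a PCA/Karhunen–Loève style basis can be built from snapshots of the input functions with a randomized SVD; the rank and snapshot layout below are assumptions.

```python
import torch

def build_reduced_basis(U_snapshots: torch.Tensor, r: int = 32):
    """PCA/Karhunen-Loeve style reduction of sampled input functions.

    U_snapshots: (n_samples, m) matrix whose rows are input functions at m sensors.
    Returns the snapshot mean and the first r principal directions (m, r).
    Assumes r <= min(n_samples, m).
    """
    mean = U_snapshots.mean(dim=0, keepdim=True)
    # Randomized low-rank SVD of the centered snapshot matrix
    _, _, V = torch.svd_lowrank(U_snapshots - mean, q=r)
    return mean, V                       # V: (m, r)

def encode(u: torch.Tensor, mean: torch.Tensor, V: torch.Tensor) -> torch.Tensor:
    """Project a batch of input functions onto the reduced subspace: (batch, m) -> (batch, r)."""
    return (u - mean) @ V
```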

Physics-informed operator neural networks augment the standard data loss with PDE residuals, variational energies, or constraint penalties, enforcing thermodynamic, dynamical, or conservation laws in the learned surrogates (Zhang et al., 2023, Goswami et al., 2022).
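
A minimal sketch of such an augmentation for an illustrative 1-D residual, $-s_{yy} = f$, computed by automatic differentiation through the output coordinates; the equation and weighting are chosen for illustration only.

```python
import torch

def physics_informed_loss(model, u_sensors, y, f_vals, s_true, w_pde=1.0):
    """Data loss plus a PDE residual penalty for -d2s/dy2 = f (illustrative 1-D example).

    y: (batch, q, 1) collocation points; gradients w.r.t. y are taken via autograd.
    """
    y = y.clone().requires_grad_(True)
    s = model(u_sensors, y)                                            # (batch, q)

    # First and second derivatives of the predicted field w.r.t. the output coordinate
    ds = torch.autograd.grad(s.sum(), y, create_graph=True)[0]         # (batch, q, 1)
    d2s = torch.autograd.grad(ds.sum(), y, create_graph=True)[0]       # (batch, q, 1)

    residual = -d2s.squeeze(-1) - f_vals                               # enforce -s_yy = f
    loss_pde = torch.mean(residual ** 2)
    loss_data = torch.mean((s - s_true) ** 2)
    return loss_data + w_pde * loss_pde
```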

4. Operator Learning for Parametric and Time-Dependent PDE Families

Operator neural networks can learn solution operators for entire families of PDEs—across variations in boundary conditions, coefficients, initial data, or even governing equations—using a single trained surrogate that generalizes well to unseen input instances (Zhang et al., 2023, Zhang, 3 Apr 2024, Berner et al., 12 Jun 2025). For time-dependent operators, evolutionary extensions such as Energy-Dissipative Evolutionary DeepONet propagate network parameters through time via least-squares ODE flows, ensuring discrete energy dissipation and accurate long-term integration for classes of gradient-flow PDEs (Zhang et al., 2023).

Neural ODE operator networks (NODE-ONet) formalize the time evolution of the solution in a latent ODE, with physics-encoded vector fields that preserve key PDE structure and allow generalization over longer time horizons and related dynamical systems (Li et al., 17 Oct 2025).
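
A generic latent-ODE sketch of this idea: encode the solution state, evolve it with a learned vector field using a fixed-step RK4 integrator, and decode. The encoder, decoder, and vector-field choices are assumptions and not the NODE-ONet specification.

```python
import torch
import torch.nn as nn

class LatentODEOperator(nn.Module):
    """Evolve an encoded solution state z(t) with a learned vector field dz/dt = f_theta(z)."""
    def __init__(self, state_dim: int, latent_dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(state_dim, latent_dim)
        self.vector_field = nn.Sequential(
            nn.Linear(latent_dim, latent_dim), nn.Tanh(),
            nn.Linear(latent_dim, latent_dim),
        )
        self.decoder = nn.Linear(latent_dim, state_dim)

    def rk4_step(self, z: torch.Tensor, dt: float) -> torch.Tensor:
        # Classical fourth-order Runge-Kutta step for the latent dynamics
        k1 = self.vector_field(z)
        k2 = self.vector_field(z + 0.5 * dt * k1)
        k3 = self.vector_field(z + 0.5 * dt * k2)
        k4 = self.vector_field(z + dt * k3)
        return z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

    def forward(self, s0: torch.Tensor, n_steps: int, dt: float) -> torch.Tensor:
        # s0: (batch, state_dim) initial solution snapshot; returns the state after n_steps * dt
        z = self.encoder(s0)
        for _ in range(n_steps):
            z = self.rk4_step(z, dt)
        return self.decoder(z)
```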

5. Resolution Independence, Multi-Operator Models, and Compression

Resolution-independent operator learning frameworks—such as RI-DeepONet and RINO—use dictionary learning algorithms to discover continuous basis functions (parameterized by neural implicit representations) for both input and output spaces, enabling the operator network to process arbitrarily sampled functions and entirely unstructured data without architectural changes (Bahmani et al., 17 Jul 2024). This is crucial for robustness in scientific applications, mesh refinement, and transfer across domains.
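
The core mechanism can be illustrated by least-squares projection of an arbitrarily sampled input function onto continuous neural basis functions; the small MLP dictionary below is a stand-in for the learned bases, not the RI-DeepONet/RINO algorithm itself.

```python
import torch
import torch.nn as nn

class NeuralBasis(nn.Module):
    """Continuous basis functions phi_1, ..., phi_r parameterized by a small MLP (illustrative)."""
    def __init__(self, coord_dim: int = 1, n_basis: int = 16, width: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(coord_dim, width), nn.Tanh(),
            nn.Linear(width, n_basis),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_points, coord_dim) -> basis evaluations (n_points, n_basis)
        return self.net(x)

def project_onto_basis(basis: NeuralBasis, x_samples: torch.Tensor, u_samples: torch.Tensor):
    """Least-squares coefficients c so that Phi(x) c ~ u at arbitrarily located samples."""
    # Inference-time projection of a new sample onto the frozen dictionary
    with torch.no_grad():
        Phi = basis(x_samples)                          # (n_points, n_basis)
        # Solve min_c ||Phi c - u||_2; works for any number and location of samples
        c = torch.linalg.lstsq(Phi, u_samples.unsqueeze(-1)).solution
    return c.squeeze(-1)                                # (n_basis,)
```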

Multi-operator learning (MOL) distributes the learning capacity across multiple related operators by decoupling input function encoding from operator-specific output bases, allowing efficient training of compact models that generalize across analogous solution families (Zhang, 3 Apr 2024).
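
A hedged sketch of this decoupling, with one shared branch encoder for the input functions and per-operator output bases; the head structure is an assumption, not the MODNO architecture.

```python
import torch
import torch.nn as nn

class MultiOperatorNet(nn.Module):
    """Shared input-function encoder with operator-specific output bases."""
    def __init__(self, m_sensors: int, n_operators: int, p: int = 64, width: int = 128):
        super().__init__()
        # One branch encoder shared by all operators
        self.shared_branch = nn.Sequential(
            nn.Linear(m_sensors, width), nn.Tanh(),
            nn.Linear(width, p),
        )
        # One lightweight trunk (output basis) per operator
        self.trunks = nn.ModuleList([
            nn.Sequential(nn.Linear(1, width), nn.Tanh(), nn.Linear(width, p))
            for _ in range(n_operators)
        ])

    def forward(self, u: torch.Tensor, y: torch.Tensor, op_id: int) -> torch.Tensor:
        # u: (batch, m_sensors), y: (batch, q, 1); op_id selects which operator to evaluate
        b = self.shared_branch(u)                 # (batch, p)
        t = self.trunks[op_id](y)                 # (batch, q, p)
        return torch.einsum("bp,bqp->bq", b, t)
```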

Operator compression techniques train neural networks to emulate the mapping from multiscale PDE coefficients to low-dimensional surrogate operators, vastly accelerating simulation compared to classical upscaling and enabling direct surrogate assembly for rapid finite element solution (Kröpfl et al., 2021).

6. Theoretical Foundations, Stability, and Iterative Operator Methods

Operator neural networks can be rigorously analyzed using Banach-space fixed point theory, contraction mappings, and iterative methods. Continuous architectures that iterate operator blocks (Picard-style) inherit geometric convergence guarantees, unique fixed-point solutions, and the ability to stabilize deep learning models via explicit iteration (Zappala et al., 2023). This perspective has been applied to diffusion models (as reverse SDE operator iterations), protein structure recycling (AlphaFold as compositional operator iteration), and iterative graph neural networks for long-range dependency problems.
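
The Picard-style iteration can be sketched as repeatedly applying an operator block until a fixed point is reached; the convergence claim below assumes the block is a contraction, in which case Banach's fixed-point theorem gives geometric convergence to a unique fixed point.

```python
import torch
import torch.nn as nn

def picard_iterate(block: nn.Module, x0: torch.Tensor,
                   max_iters: int = 50, tol: float = 1e-6) -> torch.Tensor:
    """Iterate x_{k+1} = block(x_k) until the update norm is small."""
    x = x0
    for _ in range(max_iters):
        x_next = block(x)
        # Relative residual as a simple stopping criterion
        if torch.norm(x_next - x) < tol * (1 + torch.norm(x)):
            return x_next
        x = x_next
    return x

# Example with a deliberately contractive affine block (spectral norm scaled to 0.5)
block = nn.Linear(32, 32)
with torch.no_grad():
    block.weight.mul_(0.5 / torch.linalg.matrix_norm(block.weight, ord=2))
x_star = picard_iterate(block, torch.randn(4, 32))
```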

Regularization, orthonormalization, and sparsity penalties are essential for generalization and stability in high-dimensional operator neural networks, with explicit error bounds, concentration results, and covering number estimates (Hoop et al., 2019, Lee et al., 2023).

7. Applications and Empirical Performance Across Scientific Domains

Operator neural networks have demonstrated state-of-the-art performance in computational mechanics, fluid dynamics, solid mechanics, climate modeling, symbolic regression, and inverse boundary-value problems. Key empirical performance highlights:

  • DeepONet and FNO achieve sub-1% relative errors across Darcy flow, Navier-Stokes, and Burgers’ equation benchmarks, generalizing to high Reynolds numbers, complex geometries, and varying coefficient fields (Goswami et al., 2022, Zhang et al., 2023, Wang et al., 17 Jul 2025).
  • Resolution-independent operator methods perform robustly under mesh adaptation, point cloud sampling, and out-of-distribution evaluation, matching classical surrogates while handling arbitrarily sampled input/output functions (Bahmani et al., 17 Jul 2024).
  • Multi-operator models (MODNO) deliver significant computational savings while maintaining or exceeding the accuracy of single-operator surrogates, especially under data scarcity scenarios (Zhang, 3 Apr 2024).
  • Radial Basis Operator Networks maintain $L^2$ relative errors as low as $10^{-5}$ to $10^{-7}$ on both in-distribution and OOD PDE benchmarks, despite extremely compact network sizes (Kurz et al., 6 Oct 2024).
  • Operator-feature neural networks (OF-Net) support symbolic regression with high skeleton recovery rates and stability across expression lengths via learned operator embeddings (Deng et al., 14 Aug 2024).
  • Operator recurrent neural networks and compressive surrogates replicate boundary control algorithms for wave-speed inversion, providing analytic guarantees and efficient implementation for inverse problems (Hoop et al., 2019, Kröpfl et al., 2021).

Table: Representative Operator Neural Network Architectures

| Architecture | Key Principle | Input Representation |
| --- | --- | --- |
| DeepONet (Goswami et al., 2022) | Branch/trunk split, basis expansion | Fixed sensor samples |
| FNO (Goswami et al., 2022) | Fourier integral layers | Uniform grid, FFT |
| GNO (Goswami et al., 2022) | Graph kernel message passing | Unstructured mesh |
| RBON (Kurz et al., 6 Oct 2024) | Radial basis hidden layer + direct solve | Gaussian kernel centers |
| ORNN (Hoop et al., 2019) | Operator-multiplicative recurrent cell | Linear operator acting on states |
| RI-DeepONet/RINO (Bahmani et al., 17 Jul 2024) | Dictionary learning of neural bases | Arbitrary point clouds |

Outlook and Open Problems

The field of operator neural networks continues to innovate along multiple directions: scalable derivative-informed training for high-dimensional settings (O'Leary-Roseberry et al., 2022), stability guarantees for deep operator stacks (Zappala et al., 2023), distributed architectures for multi-operator/foundation models (Zhang, 3 Apr 2024), physics-informed losses for thermodynamic constraints (Zhang et al., 2023), and robust surrogate modeling for scientific and engineering applications. Open challenges include theoretical analysis of expressive power in dual-path architectures (Wang et al., 17 Jul 2025), efficient handling of complex and adaptive geometries, uncertainty quantification, and extension of operator learning techniques to fully generative and symbolic domains.
