Operator Learning Frameworks
- Operator learning frameworks are data-driven models that approximate mappings between infinite-dimensional function spaces, essential for solving parametric PDEs and dynamical systems.
- They span methods such as DeepONet, Fourier Neural Operators, kernel approaches, and transformer-based architectures, many of which target discretization invariance and universal approximation.
- Physics-informed training, optimal recovery techniques, and uncertainty quantification are used to improve robustness, data efficiency, and scalability.
Operator learning frameworks are architectures and methodologies for approximating operators—mappings between infinite-dimensional function spaces—using data-driven models, typically to solve parametric partial differential equations (PDEs), dynamical systems, or related functional relationships. Unlike classical machine learning, which focuses on function approximation in finite-dimensional settings, operator learning tackles the fundamentally infinite-dimensional nature of mappings such as solution operators for PDEs: 𝒢:𝒰→𝒱, where 𝒰 and 𝒱 are function spaces. Recent research has produced a diverse ecosystem of operator-learning frameworks spanning neural networks, kernel methods, probabilistic approaches, and hybrid systems, each with distinct mathematical properties, expressivity, computational trade-offs, and theoretical guarantees.
1. Operator Learning Problem Formulation and Mathematical Foundations
The goal of operator learning is to construct a parametric map 𝒢_θ that approximates a target operator 𝒢:𝒰→𝒱 (e.g., a PDE solution map), given a set of input-output function pairs {(u_m, v_m = 𝒢(u_m))}, m = 1, ..., M. The spaces 𝒰 and 𝒱 are commonly Banach or Hilbert spaces of functions, and supervised training seeks to minimize an empirical loss, typically of the form:
$$\mathcal{L}(\theta) = \frac{1}{M} \sum_{m=1}^{M} \left\| \mathcal{G}_\theta(u_m) - v_m \right\|_{\mathcal{V}}^{2}$$
Key settings include:
- Input/output discretization: Training and prediction are performed via functional evaluations at sensor points; architectures must generalize across discretizations.
- Problem classes: Operator learning applies to forward/solution operators of PDEs, inverse problems, stochastic processes, and optimal control (Mollaali et al., 2023, Boullé et al., 2023, Yamazaki et al., 2024, Nelsen et al., 27 Aug 2025, Hwang et al., 2021).
The mathematical structure of 𝒰 and 𝒱, e.g., their RKHS or Sobolev space properties, strongly informs the selection of approximation tools (kernel methods, neural nets) and the error analysis (Batlle et al., 2023, Yang et al., 14 Sep 2025).
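In practice the empirical loss is evaluated on discretized functions. The following is a minimal sketch, assuming outputs sampled on a shared uniform grid; the toy operator (pointwise squaring), the one-parameter stand-in model, and all names are hypothetical and chosen only for illustration.

```python
import numpy as np

def relative_l2_loss(pred, target, eps=1e-12):
    """Mean over samples of ||pred_m - target_m||_2 / ||target_m||_2.

    pred, target: arrays of shape (M, n_points), one discretized output
    function per row. On a uniform grid the discrete ratio approximates
    the ratio of continuous L2 norms.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    num = np.linalg.norm(pred - target, axis=1)
    den = np.linalg.norm(target, axis=1) + eps  # eps guards against zero targets
    return float(np.mean(num / den))

# Hypothetical toy setup (not from the source): G(u) = u**2 pointwise.
x = np.linspace(0.0, 1.0, 64)
U = np.stack([np.sin(2 * np.pi * k * x) for k in range(1, 5)])  # inputs u_m
V = U**2                                                          # outputs v_m

def model(theta, U):
    """One-parameter stand-in for G_theta; theta = 1 recovers the toy operator."""
    return theta * U**2

loss_exact = relative_l2_loss(model(1.0, U), V)  # essentially zero
loss_off = relative_l2_loss(model(0.9, U), V)    # about 0.1, i.e. 10% relative error
```

A relative rather than absolute norm keeps samples with small and large output amplitudes on a comparable footing, which is one reason it is commonly reported in operator learning benchmarks.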
2. Principal Operator Learning Frameworks and Architectures
Deep Neural Operator Networks
- DeepONet: Uses a branch-trunk architecture. The branch network encodes the input function u sampled at sensor points, and the trunk network encodes the spatial or temporal query coordinates. The output is the inner product of the branch coefficients with the trunk basis functions, yielding a flexible universal operator approximator (Mollaali et al., 2023, Boullé et al., 2023).
- Fourier Neural Operator (FNO): Applies convolution in the spectral domain, leveraging the translation-invariant structure of many PDE kernels. Layers alternate between global spectral convolutions and local pointwise nonlinearity (Boullé et al., 2023, Mollaali et al., 2023).
- MONet/MNO: Multi-operator networks generalize DeepONet to families of operators, employing an explicit branch for parameterized operator descriptors and achieving universal approximation for operator-valued maps; quantitative scaling laws and parameterization order are rigorously analyzed (Weihs et al., 29 Oct 2025).
- Transformer-based frameworks: Σ-Attention for quantum self-energy and geometry-independent cardiac electrophysiology models use transformer blocks to aggregate patch-wise features or system observables for operator prediction (Zhu et al., 20 Apr 2025, Zhou et al., 1 Dec 2025).
- Physics-informed operator learning: Embeds PDE structure, weak forms, or fractional calculus into the loss or architecture, as with the physics-guided bi-fidelity Fourier-featured DeepONet (Mollaali et al., 2023), FEM-informed operator learning (Yamazaki et al., 2024), or fPINN-DeepONet for time-fractional PDEs (Lu et al., 15 May 2026).
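The branch-trunk structure of DeepONet described above can be sketched as a forward pass. This is an untrained illustration under stated assumptions: the layer sizes, tanh activations, random initialization, and helper names (`init_mlp`, `mlp`, `deeponet_forward`) are hypothetical choices, not the configuration of any cited paper.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the network: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def deeponet_forward(branch, trunk, u_sensors, y):
    """Evaluate G_theta(u)(y) for a batch of inputs and query points.

    u_sensors: (batch, n_sensors) input functions sampled at fixed sensors.
    y:         (n_query, d) query coordinates.
    Returns:   (batch, n_query) predicted output values.
    """
    b = mlp(branch, u_sensors)  # (batch, p) branch coefficients
    t = mlp(trunk, y)           # (n_query, p) trunk basis values
    return b @ t.T              # inner product over the p latent dimensions

rng = np.random.default_rng(0)
n_sensors, p = 32, 16
branch = init_mlp([n_sensors, 64, p], rng)
trunk = init_mlp([1, 64, p], rng)

u = rng.normal(size=(5, n_sensors))       # five sampled input functions
y = np.linspace(0.0, 1.0, 50)[:, None]    # fifty query points in 1D
out = deeponet_forward(branch, trunk, u, y)
out_subset = deeponet_forward(branch, trunk, u, y[:7])
```

Because the trunk network takes coordinates as input, the output can be queried at any set of points without retraining; the input side, in this basic form, still assumes a fixed sensor layout.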
Kernel-Based and Probabilistic Operator Learning
- Kernel operator learning: Constructs operator approximations in vector-valued RKHS using optimal recovery theory. Kernel ridge regression is used for training, and operator-valued interpolation theory provides a priori and a posteriori error bounds (Batlle et al., 2023, Yang et al., 14 Sep 2025).
- Gaussian process operator learning: Approximates the real-valued bilinear form associated with the operator using a GP, allowing analytic uncertainty quantification and straightforward incorporation of prior mean functions, including neural operator means (Mora et al., 2024).
- Polynomial chaos expansion (PCE): Represents solution operators as expansions in stochastic polynomial bases, with analytic formulas for mean and variance; highly efficient for moderate input dimension (Sharma et al., 28 Aug 2025).
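As a rough illustration of the kernel approach, the sketch below applies kernel ridge regression to sensor-sampled inputs with a scalar Gaussian kernel shared across all output components, which is the simplest operator-valued kernel. It is an assumption-laden simplification and not the optimal recovery construction of the cited papers; the toy target operator (a running integral), the lengthscale, and the regularization value are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * lengthscale**2))

def fit_kernel_operator(U, V, lengthscale, reg):
    """Solve (K + reg * I) alpha = V for the coefficient matrix alpha."""
    K = rbf_kernel(U, U, lengthscale)
    return np.linalg.solve(K + reg * np.eye(len(U)), V)

def predict_kernel_operator(U_train, alpha, U_new, lengthscale):
    """Predict discretized outputs for new discretized inputs."""
    return rbf_kernel(U_new, U_train, lengthscale) @ alpha

rng = np.random.default_rng(0)
n_train, n_sensors = 20, 16
h = 1.0 / n_sensors
U_train = rng.normal(size=(n_train, n_sensors))
V_train = np.cumsum(U_train, axis=1) * h  # toy target: running integral of u

alpha = fit_kernel_operator(U_train, V_train, lengthscale=4.0, reg=1e-8)
V_fit = predict_kernel_operator(U_train, alpha, U_train, lengthscale=4.0)

U_test = rng.normal(size=(3, n_sensors))
V_test_pred = predict_kernel_operator(U_train, alpha, U_test, lengthscale=4.0)
```

With a very small regularization parameter the fit nearly interpolates the training pairs; larger values trade interpolation for stability when the data are noisy.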
Representation-Equivalence and Discretization-Invariance
- Representation Equivalent Neural Operators (ReNO): Enforces layerwise analysis–discrete–synthesis commutativity via frames, ensuring that the discrete model realizes the same continuous operator regardless of grid or sensor choices, thereby eliminating aliasing errors (Bartolucci et al., 2023).
- Domain-Unification-Free Operator (UFO) Framework: Achieves discretization decoupling—arbitrary input and output grids—via cross-domain representations and phase-modulated coupling between spectral and spatial representations (Qiao et al., 12 May 2026).
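The idea that one set of parameters should define the same operator on different grids can be illustrated with a much simpler object than ReNO or UFO: an FNO-style spectral multiplier acting on a fixed number of Fourier modes. The sketch below assumes a periodic 1D domain and uniform grids, and the function name and test signal are hypothetical. For a band-limited input, the outputs on a coarse and a fine grid agree at shared points.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Multiply the lowest len(weights) Fourier modes of u by weights.

    u: real samples on a uniform periodic grid (last axis).
    weights: complex multipliers for modes 0..k-1; higher modes are dropped.
    norm="forward" makes the coefficients independent of the grid size.
    """
    n = u.shape[-1]
    k = len(weights)
    u_hat = np.fft.rfft(u, norm="forward")
    out_hat = np.zeros_like(u_hat)
    out_hat[..., :k] = u_hat[..., :k] * weights
    return np.fft.irfft(out_hat, n=n, norm="forward")

def signal(x):
    """Band-limited test input containing only modes 1 and 3."""
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(2 * np.pi * 3 * x)

rng = np.random.default_rng(0)
weights = rng.normal(size=4) + 1j * rng.normal(size=4)  # same parameters on both grids

x_coarse = np.arange(32) / 32
x_fine = np.arange(64) / 64
out_coarse = spectral_conv_1d(signal(x_coarse), weights)
out_fine = spectral_conv_1d(signal(x_fine), weights)

# Every second fine-grid point coincides with a coarse-grid point.
max_gap = float(np.max(np.abs(out_fine[::2] - out_coarse)))
```

If the input contained modes above the coarse grid's resolution, the two outputs would generally differ, which is the kind of aliasing discrepancy that representation-equivalence is meant to control.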
3. Training Methodologies, Data Regimes, and Loss Functions
- Supervised (data-driven) operator learning: Relies on labeled solution data; losses include (relative) L2/MSE, physics constraints, and Sobolev norms (Boullé et al., 2023, Mollaali et al., 2023).
- Physics-informed or unsupervised operator learning: Incorporates weak-form PDE discretizations in the loss (FEM-based operator learning (Yamazaki et al., 2024)), reducing training data requirements and obviating the need for automatic differentiation through stiff spatial operators.
- Bi-fidelity and transfer learning frameworks: Leverage combinations of low- and high-fidelity data, or adapt operators across source and target domains/subspaces using fusion frames, subspace-wise POD, or residual operator learning (Mollaali et al., 2023, Jiang et al., 2024).
- Stochastic and probabilistic regimes: Losses may be defined over function space distributions (KL divergence, Bayesian variational principles), with explicit treatment for function-valued posteriors and plug-and-play denoising (Nelsen et al., 27 Aug 2025).
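To make the weak-form, label-free loss concrete, here is a minimal sketch for the 1D Poisson problem -u'' = f on (0, 1) with homogeneous Dirichlet conditions. It assumes piecewise-linear (P1) elements on a uniform mesh and a lumped load vector; the problem choice and function names are hypothetical and not taken from the cited FEM-based work. The loss is the squared discrete residual, so it needs neither labeled solutions nor automatic differentiation through the differential operator.

```python
import numpy as np

def poisson_fem_system(n_interior, f):
    """Assemble the P1 stiffness matrix and lumped load for -u'' = f, u(0)=u(1)=0."""
    h = 1.0 / (n_interior + 1)
    x = np.linspace(h, 1.0 - h, n_interior)  # interior nodes
    K = (2.0 * np.eye(n_interior)
         - np.eye(n_interior, k=1)
         - np.eye(n_interior, k=-1)) / h
    F = h * f(x)                             # lumped (nodal) load vector
    return K, F, x

def weak_form_loss(u_pred, K, F):
    """Mean squared discrete residual of the weak form, K u - F."""
    r = K @ u_pred - F
    return float(np.mean(r**2))

# Manufactured problem: exact solution sin(pi x), so f = pi^2 sin(pi x).
K, F, x = poisson_fem_system(31, lambda x: np.pi**2 * np.sin(np.pi * x))

u_star = np.linalg.solve(K, F)          # discrete solution, zero residual
loss_at_solution = weak_form_loss(u_star, K, F)
loss_at_zero = weak_form_loss(np.zeros_like(x), K, F)
max_error = float(np.max(np.abs(u_star - np.sin(np.pi * x))))
```

In an operator learning setting, `u_pred` would be the network output for a given input function f, and the loss would be averaged over a batch of such inputs.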
4. Theoretical Guarantees and Approximation Theory
- Universal approximation theorems: Most major architectures (DeepONet, FNO, MONet, MNO) possess universal operator approximation properties for broad operator classes—continuous, integrable, or Lipschitz—quantified in Lp or uniform norms (Weihs et al., 29 Oct 2025, Bayraktar et al., 10 Nov 2025).
- Explicit scaling laws: Rates have been established as functions of operator smoothness, input/output discretization, and neural or polynomial network size. Reported examples include double-log scaling for general multi-operator learning, polynomial scaling for single operators, and quantified curse-of-dimensionality effects (Weihs et al., 29 Oct 2025, Batlle et al., 2023, Sharma et al., 28 Aug 2025).
- Kernel and RKHS theory: Error decompositions via optimal recovery and Mercer kernels yield convergence rates tied to kernel eigenvalue decay and measurement fill distance; dimension-independence is provable under appropriate conditions (Batlle et al., 2023, Yang et al., 14 Sep 2025).
- Representation-equivalence: Provides algebraic and operator-norm guarantees that the learned finite-resolution discrete operator genuinely represents the underlying continuous map, ensuring discretization-invariance and structural fidelity (Bartolucci et al., 2023).
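For orientation, universal approximation statements for operators usually take one of the two generic forms below. This is a schematic sketch only: the set K ⊂ 𝒰 is assumed compact, μ is assumed to be a probability measure on 𝒰, and the exact hypotheses on 𝒢 and on the architecture differ between the cited results.

```latex
% Uniform version on a compact set K \subset \mathcal{U}:
\forall \varepsilon > 0 \;\; \exists\, \theta : \quad
\sup_{u \in K} \left\| \mathcal{G}(u) - \mathcal{G}_\theta(u) \right\|_{\mathcal{V}} \le \varepsilon

% L^p version with respect to a probability measure \mu on \mathcal{U}:
\forall \varepsilon > 0 \;\; \exists\, \theta : \quad
\left( \int_{\mathcal{U}} \left\| \mathcal{G}(u) - \mathcal{G}_\theta(u) \right\|_{\mathcal{V}}^{p} \, d\mu(u) \right)^{1/p} \le \varepsilon
```

Such statements guarantee existence of suitable parameters but say nothing about how large the network must be; that question is what the scaling laws address.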
5. Empirical Benchmarks and Application Domains
Operator learning frameworks have demonstrated competitive or superior accuracy and computational efficiency on a range of PDE benchmarks. Representative reported results (relative L2 errors unless stated otherwise) include:
- 1D Burgers: Kernel (2.15%), DeepONet (2.15%), FNO (1.93%) (Batlle et al., 2023)
- 2D Navier–Stokes: Kernel (0.12%), DeepONet (3.63%), FNO (0.26%) (Batlle et al., 2023)
- Cardiac electrophysiology: Vision-transformer operator ≤5.1 ms error vs. FNO (10.1 ms) and DeepONet (7.1 ms) (Zhou et al., 1 Dec 2025)
- Parametric PDEs: MONet/MNO reduce OOD errors compared to DeepONet baselines across conservation, diffusion-reaction, Klein–Gordon, and wave equations (Weihs et al., 29 Oct 2025)
- Fractional PDEs, stochastic parameterized PDEs, and inverse design and control illustrate versatility (Lu et al., 15 May 2026, Mora et al., 2024, Hwang et al., 2021, Shatarah et al., 7 Jul 2025).
Summary of reported benchmark figures (relative L2 error unless marked otherwise; UQ = uncertainty quantification):
| Method | Relative Error (Burgers) | Relative Error (Navier–Stokes) | UQ Provided |
|---|---|---|---|
| DeepONet | 2.15% | 3.63% | No |
| FNO | 1.93% | 0.26% | No |
| Kernel | 2.15% | 0.12% | Yes |
| PCE | 3.4e-7 (MSE) | — | Yes |
| GP (with NN mean) | 0.08% | — | Yes |
6. Extensions, Strengths, and Open Challenges
- Discretization and mesh invariance: Methods such as ReNO, UFO, and kernel-based approaches with hierarchical/nested kernels support flexible discretization and transfer across grids or irregular domains (Bartolucci et al., 2023, Qiao et al., 12 May 2026, Batlle et al., 2023).
- Transferability and generalization: Fusion frames and subspace-wise POD facilitate robust transfer learning, reweighting subspaces under domain shift (Jiang et al., 2024).
- Uncertainty quantification: Kernel, GP, and PCE frameworks provide analytic UQ, essential for safety-critical or data-scarce applications (Mora et al., 2024, Sharma et al., 28 Aug 2025, Batlle et al., 2023).
- Physics and structure preservation: Embedding physical knowledge—PDE structure, conservation laws, periodicity—yields sample-efficient and physically consistent surrogates (Mollaali et al., 2023, Yamazaki et al., 2024).
- Computational scalability and efficiency: PCE and kernel methods excel in moderate-dimensional settings or with limited data; deep operator networks scale better for large, high-dimensional domains, at the cost of more complex training (Batlle et al., 2023, Sharma et al., 28 Aug 2025, Mora et al., 2024).
- Limitations and open directions: Neural operator frameworks may require substantial architecture/hyperparameter tuning, and aliasing can degrade generalization if not controlled (Bartolucci et al., 2023). Extensions to fully high-dimensional, highly nonlinear, spatially heterogeneous, or stochastic operator families continue to push the boundaries of existing theory and implementation strategies (Lu et al., 15 May 2026, Weihs et al., 29 Oct 2025, Yang et al., 14 Sep 2025).
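The analytic uncertainty quantification attributed to polynomial chaos expansions follows from orthogonality of the basis: the mean is the zeroth coefficient and the variance is a weighted sum of squares of the remaining ones. The sketch below assumes a single standard normal input and probabilists' Hermite polynomials, with coefficients obtained by Gauss-Hermite projection; the test function and helper names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def pce_coefficients(f, order, n_quad=16):
    """Project f(xi), xi ~ N(0, 1), onto He_0..He_order by Gauss-Hermite quadrature."""
    x, w = He.hermegauss(n_quad)
    w = w / math.sqrt(2.0 * math.pi)  # rescale weights to the standard normal density
    coeffs = np.empty(order + 1)
    for k in range(order + 1):
        basis = He.hermeval(x, np.eye(order + 1)[k])  # values of He_k at the nodes
        # E[He_k^2] = k!, so divide the projection by k!.
        coeffs[k] = np.sum(w * f(x) * basis) / math.factorial(k)
    return coeffs

def pce_mean_variance(coeffs):
    """Mean and variance read directly from the expansion coefficients."""
    norms = np.array([math.factorial(k) for k in range(len(coeffs))], dtype=float)
    mean = float(coeffs[0])
    variance = float(np.sum(coeffs[1:] ** 2 * norms[1:]))
    return mean, variance

# Toy response: f(xi) = xi^2 + 2 xi + 1 = 2 He_0 + 2 He_1 + He_2.
coeffs = pce_coefficients(lambda xi: xi**2 + 2.0 * xi + 1.0, order=3)
mean, variance = pce_mean_variance(coeffs)  # mean 2, variance 2^2 * 1! + 1^2 * 2! = 6
```

No sampling is needed once the coefficients are known, which is why this route is efficient when the input dimension is moderate.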
7. Practical Implementation and Guidelines
- Architecture selection: Balance branch, trunk, and parameter subnetwork complexity according to the dominant approximation difficulty in the operator (Weihs et al., 29 Oct 2025).
- Training: Employ regularization, early stopping, data augmentation, and, where necessary, cross-domain or representation-equivalence penalties to target generalization and robustness (Bartolucci et al., 2023, Mollaali et al., 2023).
- Data regime matching: Kernel/Gaussian process/PCE frameworks favor data-scarce, smooth operator regimes, while neural operator and transformer-based methods target large-scale, complex, and high-dimensional domains (Batlle et al., 2023, Mora et al., 2024, Sharma et al., 28 Aug 2025, Zhou et al., 1 Dec 2025).
- Open research areas: Adaptive discretization, mesh-invariant representations, multi-fidelity and federated operator learning, and principled UQ and generalization bounds remain active topics (Mollaali et al., 2023, Qiao et al., 12 May 2026, Jiang et al., 2024).
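Of the training guidelines above, early stopping is the easiest to show in isolation. The helper below is a generic, framework-independent sketch; the class name, the `patience` and `min_delta` parameters, and the validation loss values are hypothetical.

```python
class EarlyStopping:
    """Stop training when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Hypothetical validation losses, one per epoch.
val_losses = [1.0, 0.8, 0.7, 0.71, 0.72, 0.69, 0.68]

stopper = EarlyStopping(patience=2)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(loss):
        stopped_at = epoch  # two consecutive epochs without improvement
        break
```

In a real training loop the model parameters at the best validation loss would typically be checkpointed and restored after stopping.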
Operator learning frameworks thus provide a powerful, theoretically grounded, and increasingly versatile toolkit for scientific machine learning, with demonstrated impact across physical modeling, control, uncertainty quantification, and design automation (Mollaali et al., 2023, Jiang et al., 2024, Weihs et al., 29 Oct 2025, Batlle et al., 2023, Sharma et al., 28 Aug 2025).