Physics-Informed DeepONet
- Physics-Informed DeepONet is a neural operator architecture that incorporates PDE constraints as soft or hard penalties in the loss function to learn mappings between infinite-dimensional spaces.
- It utilizes a branch-trunk network design to approximate operators, enabling rapid, generalizable predictions without requiring paired training data.
- By embedding physical laws in its loss formulation, the method achieves high accuracy and efficiency for solving nonlinear PDEs in scientific and engineering applications.
Physics-Informed Deep Operator Networks (PI-DeepONets) are a class of neural operator architectures that embed known physical laws within the DeepONet framework to approximate nonlinear solution operators of parametric partial differential equations (PDEs) without requiring paired training data. By explicitly incorporating the underlying PDE constraints as a soft or hard penalty in the loss function, PI-DeepONets can learn mappings between infinite-dimensional Banach spaces with strong generalization to new input conditions and enforce physical consistency in their predictions. This approach is particularly suitable for scientific and engineering problems where solution gradients, boundary behaviors, and operator family parameterizations are difficult to capture via conventional data-driven machine learning models.
1. Operator Learning Framework and Network Architecture
In PI-DeepONet, the objective is to approximate the solution operator that maps a source function to the corresponding solution of a nonlinear parabolic PDE with homogeneous initial and boundary conditions. The source is a parametrized function, and the PDE contains a prescribed nonlinear diffusion function.
The PI-DeepONet employs the standard DeepONet architecture:
- Branch net: Input is the vector of source-term values at fixed sensor points; output is a $p$-dimensional latent embedding.
- Trunk net: Input is the query spatio-temporal location $(x, t)$; output is a $p$-dimensional latent embedding.
- The operator approximation is the inner product of the branch and trunk embeddings.
Experimental configurations typically use a fixed set of branch sensors, fully connected MLPs with ReLU activations for both nets, and 2-dimensional trunk inputs (Sevcovic et al., 2023).
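The branch-trunk evaluation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the cited authors' implementation: the layer widths, the number of sensors `m`, the latent dimension `p`, and the helper names `init_mlp`, `mlp`, and `deeponet` are all hypothetical, and the weights are random rather than trained.

```python
import numpy as np

def init_mlp(rng, sizes):
    """Return a list of (W, b) pairs for a fully connected network."""
    return [(rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP: ReLU on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)
    W, b = params[-1]
    return x @ W + b

def deeponet(branch, trunk, f_sensors, xt):
    """Evaluate the operator approximation.

    f_sensors: (n_funcs, m) source values at the m fixed sensor points
    xt:        (n_points, 2) query locations (x, t)
    returns:   (n_funcs, n_points) predicted solution values
    """
    b = mlp(branch, f_sensors)  # (n_funcs, p) branch embeddings
    t = mlp(trunk, xt)          # (n_points, p) trunk embeddings
    return b @ t.T              # inner product over the latent dimension

# Hypothetical sizes chosen only for illustration.
m, p = 50, 32
rng = np.random.default_rng(0)
branch = init_mlp(rng, [m, 64, p])
trunk = init_mlp(rng, [2, 64, p])

f_sensors = rng.normal(size=(3, m))  # three example source functions
xt = rng.uniform(size=(10, 2))       # ten query points (x, t)
u_pred = deeponet(branch, trunk, f_sensors, xt)
print(u_pred.shape)  # (3, 10)
```

Because the branch embedding depends only on the source and the trunk embedding only on the query point, one branch evaluation can be reused for any number of query locations.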
2. Physics-Informed Loss Formulation
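Instead of conventional supervised losses relying on paired training data, PI-DeepONet training enforces the PDE physics and the boundary/initial conditions through a composite loss that combines two terms:
- Physics loss: the squared residual of the PDE, evaluated on the network output and averaged over collocation points inside the domain.
- Operator loss (enforcing initial and boundary conditions): the squared mismatch between the network output and the prescribed (here zero) values, averaged over points on the domain boundary or initial slice.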
Automatic differentiation is leveraged to compute all necessary partial derivatives directly through the neural network representations (Sevcovic et al., 2023, Wang et al., 2021).
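Written out, a composite loss of this kind might take the following form. The notation is assumed for illustration and is not taken from the cited papers: $G_\theta$ denotes the network, $f^{(i)}$ the $i$-th of $N$ sampled sources, $\mathcal{R}$ the residual operator of the PDE, $(x_j, t_j)$ the $Q$ interior collocation points, and $(\hat{x}_j, \hat{t}_j)$ the $P$ boundary or initial points. Mean-squared averaging and equal weighting of the two terms are likewise assumptions.

```latex
\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{physics}}(\theta) + \mathcal{L}_{\mathrm{operator}}(\theta),

\mathcal{L}_{\mathrm{physics}}(\theta)
  = \frac{1}{NQ} \sum_{i=1}^{N} \sum_{j=1}^{Q}
    \left| \mathcal{R}\!\left[ G_\theta(f^{(i)});\, f^{(i)} \right](x_j, t_j) \right|^2,

\mathcal{L}_{\mathrm{operator}}(\theta)
  = \frac{1}{NP} \sum_{i=1}^{N} \sum_{j=1}^{P}
    \left| G_\theta(f^{(i)})(\hat{x}_j, \hat{t}_j) - 0 \right|^2 .
```

The explicit zero target in the operator term reflects the homogeneous initial and boundary conditions; for non-homogeneous conditions it would be replaced by the prescribed data.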
3. Data Generation, Training Strategies, and Hyperparameters
PI-DeepONet training leverages a sampled family of input source functions, typically drawn from zero-mean Gaussian processes with a specified kernel (e.g., squared-exponential) and evaluated at discrete sensor locations. The training set consists of:
- the sampled input functions,
- boundary/initial sensor points for each source realization,
- uniformly sampled interior collocation points for each source realization.
There is no dependency on labeled solution data in the interior: the only labels are the zero initial and boundary values. The network is trained using Adam with a prescribed learning rate for up to 10,000 full-batch gradient steps, until the loss stabilizes (Sevcovic et al., 2023). This structure allows efficient enforcement of the underlying physics without requiring extensive paired datasets (Wang et al., 2021).
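The data-generation step in this section can be sketched as follows. This is an illustrative NumPy version under stated assumptions: the unit domain for both `x` and `t`, the kernel length scale, the jitter term, the sample counts, and the function names `sample_gp_sources` and `sample_points` are hypothetical and not taken from the cited papers.

```python
import numpy as np

def sample_gp_sources(n_funcs, sensors, length_scale=0.2, jitter=1e-6, seed=0):
    """Draw zero-mean GP samples with a squared-exponential kernel.

    Returns an array of shape (n_funcs, len(sensors)).
    """
    rng = np.random.default_rng(seed)
    diff = sensors[:, None] - sensors[None, :]
    K = np.exp(-0.5 * (diff / length_scale) ** 2)
    # Small diagonal jitter keeps the Cholesky factorization numerically stable.
    L = np.linalg.cholesky(K + jitter * np.eye(len(sensors)))
    z = rng.standard_normal((len(sensors), n_funcs))
    return (L @ z).T

def sample_points(n_interior, n_bc, seed=0):
    """Sample (x, t) points on an assumed unit domain [0, 1] x [0, 1]."""
    rng = np.random.default_rng(seed)
    interior = rng.uniform(0.0, 1.0, size=(n_interior, 2))
    n_ic = n_bc // 2
    # Initial slice: t = 0, x uniform.
    ic = np.column_stack([rng.uniform(0.0, 1.0, n_ic), np.zeros(n_ic)])
    # Spatial boundary: x = 0 or x = 1, t uniform.
    side = rng.integers(0, 2, size=n_bc - n_ic).astype(float)
    bc = np.column_stack([side, rng.uniform(0.0, 1.0, n_bc - n_ic)])
    return interior, np.vstack([ic, bc])

sensors = np.linspace(0.0, 1.0, 50)
f_train = sample_gp_sources(100, sensors)
interior, bc_ic = sample_points(200, 40)
print(f_train.shape, interior.shape, bc_ic.shape)  # (100, 50) (200, 2) (40, 2)
```

No solution values are generated here: the only targets are the zeros attached to the boundary and initial points.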
4. Performance Metrics and Generalization
The evaluation of PI-DeepONet centers on the norm of the difference between predicted solutions and reference finite-difference (FDM) or other standard numerical solutions:
- On held-out test sources, PI-DeepONet attains small relative errors, reproducing even highly transient features in close agreement with FDM results, at drastically reduced runtime (fractions of a second vs. many seconds for FDM).
- PI-DeepONet demonstrates "amortized" prediction: the trained network generalizes to previously unseen source functions without retraining or additional labels.
- These empirical results confirm strong operator-learning and out-of-distribution generalization, enabling rapid and robust solution queries for new parameters (Sevcovic et al., 2023, Wang et al., 2021).
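A relative error of the kind used in such comparisons can be computed as below. The choice of the discrete L2 norm is an assumption made for illustration, and the arrays are synthetic stand-ins for a network prediction and an FDM reference, not results from the cited work.

```python
import numpy as np

def relative_l2_error(u_pred, u_ref):
    """Relative discrete L2 error between a prediction and a reference solution."""
    return np.linalg.norm(u_pred - u_ref) / np.linalg.norm(u_ref)

# Synthetic stand-ins: a reference profile and a prediction that is 1% too large.
u_ref = np.sin(np.pi * np.linspace(0.0, 1.0, 101))
u_pred = 1.01 * u_ref
err = relative_l2_error(u_pred, u_ref)
print(err)  # approximately 0.01
```

In practice both arrays would be sampled on the same space-time grid, so the reference solver's grid determines where the trained network is queried.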
5. Advantages, Limitations, and Extensions
The table below summarizes the core contributions, strengths, limitations, and suggested extensions of PI-DeepONet as articulated in (Sevcovic et al., 2023):
| Aspect | Main Points |
|---|---|
| Contribution | First PI-DeepONet for fully nonlinear parabolic PDE from stochastic control; demonstrates single-network operator inference |
| Strengths | No need for labeled pairs; strong physical regularization; fast inference for new source functions |
| Limitations | Accuracy depends on sensor placement and collocation strategy; nonlinear PDE error bounds remain empirical; architecture tuning is problem-specific |
| Key Extensions | Non-homogeneous/complex BC and IC; alternative nonlinearities; higher spatial dimensions; hybrid supervised+PI training; adaptive collocation |
PI-DeepONet can be adapted to wide classes of nonlinear and higher-dimensional PDEs, and can be used in hybrid regimes that combine a small set of supervised training samples with physics-based loss terms (Sevcovic et al., 2023, Wang et al., 2021).
6. Relationship to Broader Operator Learning and Scientific ML
PI-DeepONet is situated at the intersection of operator learning and physics-informed machine learning, offering:
- "Operator amortization": once trained, the solver rapidly produces solutions for arbitrary input parameterizations, solving entire families of PDEs with minimal additional computation (Wang et al., 2021).
- Data efficiency: even in the extreme case of zero interior solution data, physics-informed regularization yields errors 1–2 orders of magnitude lower (in some benchmarks) than purely data-driven DeepONets, and requires an order of magnitude fewer training samples for comparable accuracy.
- Extensibility: the approach accommodates boundary and initial condition enforcement, complex nonlinearities, and blends with traditional numerical and scientific computing methodologies for improved robustness and fidelity (Wang et al., 2021).
7. Outlook and Future Work
Open research directions include rigorous a priori error estimation for nonlinear operators, dynamic or learnable sensor/collocation schemes, integration with adaptive curriculum learning strategies, and application to higher-dimensional or multiphysics problems (e.g., stochastic control, fluid-structure interaction). Empirical observations indicate substantial potential for hybrid strategies, using small amounts of labeled data to further enhance accuracy and transferability of physics-informed operator networks (Sevcovic et al., 2023, Wang et al., 2021).