Fractional Physics-Informed Neural Networks

Updated 6 May 2026

fPINNs are neural networks that incorporate fractional calculus to solve fractional differential equations exhibiting nonlocality and memory effects.
They embed discretized fractional operators, such as Caputo and Riesz derivatives, directly into the loss function to ensure accurate PDE constraint satisfaction.
fPINNs facilitate inverse problems, uncertainty quantification, and domain-specific adaptations through specialized architectures and advanced computational strategies.

Fractional Physics-Informed Neural Networks (fPINNs) generalize the physics-informed neural network (PINN) framework to fractional calculus, enabling data-driven solution of, and parameter inference for, fractional differential equations (FDEs) that encode nonlocality and memory effects in physical, biological, and engineering systems. fPINNs extend classical PINNs by embedding discretized or otherwise efficiently computable fractional operators (typically of Caputo or Riesz/fractional-Laplacian type) directly into the loss functional, so that neural network predictions are trained to satisfy the FDE constraints even in high-dimensional, data-limited, and ill-posed or inverse settings.

1. Mathematical Foundations: Fractional Operators and Their Discretization

Fractional derivatives model phenomena with anomalous transport, nonlocality, or temporal memory, necessitating noninteger-order differentiation operators in both time and space. The Caputo time-fractional derivative of order $0 < \alpha < 1$,

${}^C D_t^\alpha u(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t (t-\tau)^{-\alpha} u'(\tau)\,d\tau,$

and the Riesz fractional Laplacian (space-fractional derivative) in $\mathbb{R}^d$,

$(-\Delta)^{\alpha/2} u(x) = C_{d,\alpha}\;\mathrm{P.V.}\int_{\mathbb{R}^d} \frac{u(x) - u(y)}{\|x-y\|^{d+\alpha}}\,dy,$

define the two principal nonlocal operators in fPINN frameworks (Pang et al., 2018, Guo et al., 2022). Accurate neural imposition of these constraints requires discretizations tailored to each operator:

Caputo time-fractional derivatives are approximated via L1-type finite difference quadratures, e.g.,

$D_t^\alpha u^n \approx \frac{1}{(\Delta t)^\alpha\, \Gamma(2-\alpha)}\sum_{j=0}^{n-1} b_j\,(u^{n-j} - u^{n-j-1}),$

with $b_j = (j+1)^{1-\alpha} - j^{1-\alpha}$ (Thakur et al., 2024, Zinihi, 26 Sep 2025).
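As a hedged illustration of the L1 formula above, the following pure-Python sketch evaluates the approximation at the final point of a uniform grid and compares it with the closed-form Caputo derivative of $u(t) = t^2$, namely $2t^{2-\alpha}/\Gamma(3-\alpha)$. The function name `caputo_l1` and the test function are assumptions made for illustration and are not taken from the cited works.

```python
import math


def caputo_l1(u_values, dt, alpha):
    """L1 approximation of the Caputo derivative (0 < alpha < 1) at the last grid point.

    u_values holds u(t_0), ..., u(t_n) on a uniform grid with spacing dt.
    """
    n = len(u_values) - 1
    total = 0.0
    for j in range(n):
        # Weights b_j = (j+1)^(1-alpha) - j^(1-alpha) from the L1 scheme
        b_j = (j + 1) ** (1 - alpha) - j ** (1 - alpha)
        total += b_j * (u_values[n - j] - u_values[n - j - 1])
    return total / (dt ** alpha * math.gamma(2 - alpha))


# Illustrative check on u(t) = t^2 at t = 1 (assumed test problem)
alpha, dt, n = 0.5, 0.01, 100
grid = [k * dt for k in range(n + 1)]
approx = caputo_l1([t ** 2 for t in grid], dt, alpha)
exact = 2 * grid[-1] ** (2 - alpha) / math.gamma(3 - alpha)
```

In an fPINN the entries of `u_values` would be network outputs at the history points, so the same weighted sum enters the PDE residual and stays differentiable with respect to the network parameters.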

Riesz/Laplacian operators employ either shifted Grünwald-Letnikov (GL) differences, Monte Carlo integral estimators, or pseudo-differential (Fourier) representations (Pang et al., 2018, Guo et al., 2022, Gracyk, 16 Feb 2026).
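Of these options, the Fourier representation is the simplest to illustrate: on a periodic domain the fractional Laplacian multiplies each Fourier mode with wavenumber $k$ by $|k|^\alpha$. The sketch below applies that symbol in one dimension with a naive discrete Fourier transform written in pure Python. The function name, the grid size, and the choice of a $2\pi$-periodic domain are assumptions for illustration only.

```python
import cmath
import math


def fractional_laplacian_periodic(values, alpha, length=2 * math.pi):
    """Apply (-Laplacian)^(alpha/2) to samples of a periodic function on a uniform grid."""
    n = len(values)
    # Forward DFT (naive O(n^2) sum, adequate for a small illustrative grid)
    coeffs = [
        sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
        for k in range(n)
    ]
    result = []
    for j in range(n):
        total = 0j
        for k in range(n):
            # Map DFT index to a signed integer mode, then to a physical wavenumber
            mode = k if k <= n // 2 else k - n
            wavenumber = 2 * math.pi * mode / length
            total += abs(wavenumber) ** alpha * coeffs[k] * cmath.exp(2j * math.pi * k * j / n)
        result.append(total.real)
    return result


# Illustrative check: sin(3x) is an eigenfunction with eigenvalue 3^alpha
n, alpha = 32, 1.5
xs = [2 * math.pi * j / n for j in range(n)]
result = fractional_laplacian_periodic([math.sin(3 * x) for x in xs], alpha)
expected = [3 ** alpha * math.sin(3 * x) for x in xs]
```

A practical implementation would use an FFT library instead of the quadratic-cost sums shown here.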

Alternative approaches include conformable derivatives with local structure (Ye et al., 2021), or operational-matrix discretizations for efficient matrix-vector evaluation in temporally nonuniform or delay/fractional DAE systems (Taheri et al., 2024).

2. fPINN Architectures, Loss Functionals, and Training Procedures

In fPINNs, the physical field $u(x,t)$ is parameterized via a neural network $u_{NN}(x,t;\theta)$, with architecture choices ranging from standard fully connected MLPs to spectral or Legendre Neural Block (LNB) enhancements (Taheri et al., 2024):

Inputs: physical coordinates and/or time
Hidden layers: smooth activations (tanh, swish, sine), preferred because they better capture the regularity of solutions to nonlocal problems
Output: scalar (field value) or vector (multistate models) predictions

The composite loss functional is constructed as

$\mathcal{L}(\theta) = \mathcal{L}_{\text{PDE}} + \mathcal{L}_{\text{IC/BC}} + \mathcal{L}_{\text{data}},$

where

$\mathcal{L}_{\text{PDE}}$ penalizes residuals of the FDE at collocation points, where fractional derivatives are evaluated by discretization or stochastic/quadrature approximations,
$\mathcal{L}_{\text{IC/BC}}$ enforces initial/boundary conditions, often via hard constraints or penalty terms,
$\mathcal{L}_{\text{data}}$ (if present) fits empirical observations for state and/or parameters.
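The three-term loss can be sketched as follows. Each term is taken here to be a mean of squared residuals, which is a common choice but an assumption on our part; the function names and the sample residual values are likewise hypothetical.

```python
def mean_square(residuals):
    """Mean of squared residuals over a batch of points."""
    return sum(r * r for r in residuals) / len(residuals)


def composite_loss(pde_residuals, bc_residuals, data_residuals=None):
    """Sum of the PDE, IC/BC, and optional data terms of the loss."""
    loss = mean_square(pde_residuals) + mean_square(bc_residuals)
    if data_residuals:
        # The data term is only present when observations are available
        loss += mean_square(data_residuals)
    return loss


# Hypothetical residual values, for illustration only
forward_loss = composite_loss([1.0, -1.0], [0.5])
inverse_loss = composite_loss([1.0, -1.0], [0.5], [2.0, 0.0])
```

In practice the PDE residuals would contain the discretized fractional derivative of the network output, and the individual terms are often weighted against each other.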

For inverse problems or parameter discovery, the fractional orders (such as $\alpha$) and physical coefficients are embedded as trainable scalars, with constraints (e.g., positivity via softplus, fractional order in $(0,1)$ via sigmoid reparameterization) ensuring physical meaning (Zinihi, 26 Sep 2025, Daryakenari et al., 2024, Thakur et al., 2024).
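A minimal sketch of these reparameterizations is given below. An unconstrained trainable scalar `raw` is mapped to a bounded interval through a sigmoid or to the positive reals through a softplus. The helper names `bounded` and `softplus` and the sample values are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bounded(raw, low=0.0, high=1.0):
    """Map an unconstrained scalar to the open interval (low, high)."""
    return low + (high - low) * sigmoid(raw)


def softplus(raw):
    """Map an unconstrained scalar to a positive value, log(1 + exp(raw))."""
    return max(raw, 0.0) + math.log1p(math.exp(-abs(raw)))


alpha = bounded(0.0)              # fractional order constrained to (0, 1)
alpha_wave = bounded(0.0, 1, 2)   # same idea for an order constrained to (1, 2)
diffusivity = softplus(-3.0)      # positive coefficient from a negative raw value
```

The optimizer updates `raw` freely, while the quantity that enters the fractional operator always stays inside its admissible range.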

Common optimizers are Adam or L-BFGS, with staged or hybrid schedules to mitigate local minima and balance data-fidelity versus PDE residuals.

3. Computational Strategies for Nonlocal Fractional Operators

Computation of nonlocal operators in high dimensions presents the principal challenge for fPINNs, addressed by a range of strategies:

Finite Difference Quadratures: Suitable for low-dimensional or regular-grid settings, but they require many history/auxiliary points per collocation point, which raises computational and memory demands (Pang et al., 2018, Taheri et al., 2024).
Monte Carlo Estimation (MC-fPINN): Mesh-free unbiased estimators of the Riesz Laplacian (and Caputo derivatives) are constructed by splitting integrals and drawing i.i.d. samples of direction, radius, and time (Guo et al., 2022, Sheng et al., 2024, Hu et al., 2024). The computational cost per collocation point scales with the number of Monte Carlo samples rather than with the spatial dimension $d$, enabling very high-dimensional problems (Hu et al., 2024). However, high variance necessitates large sample sizes or variance-reduction techniques.
Quadrature-Enhanced MC-fPINN: Recent improvements replace the radial/time MC integrals with deterministic (e.g., Gauss–Jacobi, Gauss–Laguerre) quadrature, lowering estimator variance, eliminating delicate radius cutoffs and improving convergence rate and speed (Hu et al., 2024, Li et al., 13 Jun 2025).
Fourier/Pseudo-Differential Enhancement: For periodic or regular domains, fPINN losses are augmented with residuals in the Fourier domain, leveraging the diagonalization of the fractional operators and improving learning of high-frequency modes (Gracyk, 16 Feb 2026). Monte Carlo Fourier approximations allow mesh-agnostic implementation.

These advances collectively address the curse of dimensionality, a critical bottleneck in traditional mesh-based FDE solvers (Hu et al., 2024, Sheng et al., 2024, Guo et al., 2022, Taheri et al., 2024).
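To make the sampling idea concrete, here is a minimal sketch of one unbiased Monte Carlo estimator for the Caputo derivative with $0 < \alpha < 1$. Substituting $\tau = t(1-s)$ turns the definition into $t^{1-\alpha}/\Gamma(2-\alpha)$ times the expectation of $u'(t(1-s))$ under the density $(1-\alpha)s^{-\alpha}$ on $(0,1)$, which can be sampled by inverse transform. This particular change of variables, the name `caputo_monte_carlo`, and the use of an explicit derivative `du` (which automatic differentiation would supply in a PINN) are assumptions for illustration and may differ from the estimators in the cited papers.

```python
import math
import random


def caputo_monte_carlo(du, t, alpha, num_samples, rng):
    """Unbiased Monte Carlo estimate of the Caputo derivative of order 0 < alpha < 1.

    du is a callable returning u'(tau); rng is a random.Random instance.
    """
    total = 0.0
    for _ in range(num_samples):
        # Inverse-transform sample from the density (1 - alpha) * s^(-alpha) on (0, 1)
        s = rng.random() ** (1.0 / (1.0 - alpha))
        total += du(t * (1.0 - s))
    return t ** (1 - alpha) / math.gamma(2 - alpha) * total / num_samples


# Illustrative check on u(t) = t^2, whose Caputo derivative is 2 t^(2-alpha) / Gamma(3-alpha)
rng = random.Random(0)
alpha, t = 0.5, 1.0
estimate = caputo_monte_carlo(lambda tau: 2.0 * tau, t, alpha, 20000, rng)
exact = 2 * t ** (2 - alpha) / math.gamma(3 - alpha)
```

Because the singular kernel is absorbed into the sampling density, the integrand that remains is bounded whenever $u'$ is bounded, which keeps the variance finite in this simple setting.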

4. Specialized fPINN Extensions and Domain-Specific Adaptations

Recent work has developed domain-specific fPINN variants and workflows:

Score-fPINN: For high-dimensional Fokker-Planck-Lévy equations, the introduction of a fractional score function enables transformation to a second-order PDE system without an explicit fractional Laplacian, facilitating mesh-free high-dimensional PINN training. Both fractional score matching (when conditional densities are known) and PINN-based score estimation (when not) are supported (Hu et al., 2024).
Stochastic Fractional PDEs (BO-fPINN): Integration of the bi-orthogonal expansion with PINN learning provides a robust mechanism for representing randomness in SPDEs with fractional operators, handling eigenvalue crossings and inverse settings efficiently (Ma et al., 2023).
Transformed fPINNs for Diffusion-Wave Equations: Integration-by-parts transformations of the Caputo derivative enable efficient training for time-fractional diffusion-wave PDEs with $1 < \alpha < 2$, reducing the computational cost of fractional derivative evaluation (Li et al., 13 Jun 2025).
Adaptive/Multiprecision/Mesh Strategies: Alikhanov-XfPINNs combine high-order temporal discretization, nonuniform meshes, and adaptive activations to resolve initial singularities and audit temporal discretization error in nonlinear fPDEs (Dwivedi et al., 2 May 2026); multiprecision and multistage training counteract the numerical loss-of-significance and optimization plateaus of deep fractional residuals (Xue et al., 28 May 2025).
Laplace-fPINNs: Formulation in the Laplace-transformed domain enables the solution of subdiffusion problems in high dimensions, avoiding direct time discretization and employing only integer-derivative AD (Yan et al., 2023).

5. Inverse Problems, Parameter Estimation, and Uncertainty Quantification

fPINNs handle inverse problems and parameter estimation within the same training procedure as the forward solve, covering fractional orders, transport/diffusion rates, source terms, kinetic parameters, and even functional dependencies:

The fractional order (e.g., $\alpha$) is treated as a differentiable, trainable parameter, often with bounded support via nonlinear parametrization (Zinihi, 26 Sep 2025, Thakur et al., 2024, Daryakenari et al., 2024).
Compartmental pharmacokinetics, viscoelasticity, and epidemiology are modeled by embedding the respective FDE system, with fPINNs jointly learning the governing rates and memory effects from (possibly noisy, sparse) time-series data (Daryakenari et al., 2024, Thakur et al., 2024, Zinihi, 26 Sep 2025).
Inverse source problems utilize dual networks for state and source, with MC-based fractional derivative evaluation and provable error bounds in high dimension (Sheng et al., 2024).
Uncertainty quantification is addressed either via stochastic expansions (BO-fPINN) or parameter-space extensions (Ma et al., 2023, Guo et al., 2022).

Robustness under noise and misspecification is consistently observed, with parameter errors typically below 1–2% in benchmark cases, and networks outperforming classical two-step or optimization-based procedures for joint recovery (Thakur et al., 2024, Daryakenari et al., 2024).
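The idea of treating the fractional order as a trainable parameter can be illustrated on a toy problem. The sketch below recovers $\alpha$ from noise-free synthetic observations of the Caputo derivative of $u(t) = t^2$, using a sigmoid reparameterization and plain gradient descent with a finite-difference gradient. All of these choices (the test function, the optimizer, the step size, and the function names) are assumptions for illustration; the cited works use neural networks, automatic differentiation, and Adam or L-BFGS.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def caputo_of_t_squared(t, alpha):
    """Closed-form Caputo derivative of u(t) = t^2 for 0 < alpha < 1."""
    return 2.0 * t ** (2.0 - alpha) / math.gamma(3.0 - alpha)


def misfit(raw, times, observations):
    """Mean squared misfit as a function of the unconstrained parameter raw."""
    alpha = sigmoid(raw)  # keeps the order inside (0, 1)
    return sum(
        (caputo_of_t_squared(t, alpha) - y) ** 2 for t, y in zip(times, observations)
    ) / len(times)


def fit_order(times, observations, raw=0.0, lr=5.0, steps=500, h=1e-5):
    """Gradient descent on raw using a central finite-difference gradient."""
    for _ in range(steps):
        grad = (misfit(raw + h, times, observations)
                - misfit(raw - h, times, observations)) / (2.0 * h)
        raw -= lr * grad
    return sigmoid(raw)


# Synthetic, noise-free observations generated with a known order (illustrative)
times = [0.2, 0.4, 0.6, 0.8, 1.0]
true_alpha = 0.6
observations = [caputo_of_t_squared(t, true_alpha) for t in times]
alpha_hat = fit_order(times, observations)
```

In a full fPINN the closed-form model would be replaced by the discretized fractional derivative of the network output, and the order would be updated jointly with the network weights.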

6. Performance Scaling, Numerical Results, and Best Practices

Numerical evidence across multiple studies demonstrates the scalability and versatility of fPINNs:

MC-fPINN and quadrature-enhanced variants have been validated on Poisson/tempered fractional diffusion problems up to 100,000 dimensions, showing small relative errors and only sublinear slow-down with increasing dimension (Hu et al., 2024).
For 1D–3D FDEs, standard fPINN schemes deliver accurate solutions, with multistage, multiprecision, and adaptive-activation strategies reducing the relative error further (Xue et al., 28 May 2025, Dwivedi et al., 2 May 2026).
Special architectures (pseudo-differential enhancement, LNB) accelerate convergence, mitigate spectral bias, and aid in learning high-frequency solution components (Gracyk, 16 Feb 2026, Taheri et al., 2024).
Weighted PINN losses and data scaling improve fitting near singularities, and adaptive collocation or residual-based point redistribution enhance convergence for complex dynamics (Ye et al., 2021, Li et al., 13 Jun 2025).
For stochastic or inverse settings, fPINN-based uncertainty quantification and transfer learning deliver efficiency and robustness across varying FDE parameters (Ma et al., 2023).
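Residual-based point redistribution, mentioned in the list above, can be sketched generically as sampling new collocation points with probability proportional to the magnitude of the current PDE residual. This is a common pattern rather than the specific algorithm of the cited papers, and the function name, the one-dimensional points, and the residual values are hypothetical.

```python
import random


def resample_collocation(points, residuals, num_new, rng):
    """Draw num_new collocation points with probability proportional to |residual|.

    Raises ValueError if every residual is zero, because the weights then sum to zero.
    """
    weights = [abs(r) for r in residuals]
    return rng.choices(points, weights=weights, k=num_new)


# Hypothetical residuals: large near t = 0 (e.g., an initial singularity), small elsewhere
points = [0.0, 0.25, 0.5, 0.75, 1.0]
residuals = [0.9, 0.0, 0.1, 0.0, 0.0]
rng = random.Random(0)
new_points = resample_collocation(points, residuals, 200, rng)
```

Points where the residual is already zero are never selected, so training effort concentrates on the regions the network currently resolves worst.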

7. Limitations, Open Problems, and Future Directions

Despite their generality and mesh-free capacity, fPINNs face several challenges:

Computational burden of the many auxiliary/history points required by classical (non-MC) fPINNs in high dimensions.
Variance of MC estimators in high-dimension/low-collocation regimes, necessitating advanced variance-reduction, adaptive quadrature, or hybrid techniques (Guo et al., 2022, Hu et al., 2024).
Stability and convergence guarantees are largely empirical; theoretical underpinnings for generalization, error decomposition (approximation vs. statistical), and efficiency remain active research areas (Sheng et al., 2024).
Scalability of architectures to high dimensions and to complex coupled multiphysics FDEs; further integration with operator-learning, graph-based, or domain-specific network designs may enhance representation capacity.
Ill-posedness in inverse or data-driven FDEs, especially with non-Gaussian noise or sparse observation, may require explicit regularization, advanced uncertainty quantification, and physically informed priors.

A plausible implication is that future developments will combine mesh-free fractional operator evaluation, operator learning frameworks, and scalable uncertainty-quantified inference, positioning fPINNs as a central tool in scientific machine learning for nonlocal dynamical systems.