Score-fPINN: Physics-Informed Score Learning
- Score-fPINN is a framework that integrates score-based statistical modeling with physical priors from Fokker-Planck and Fokker-Planck–Lévy equations.
- It employs hybrid objectives combining score matching and PINN residual supervision to robustly approximate time-dependent stochastic dynamics in high dimensions.
- The method mitigates issues like the curse of dimensionality and numerical underflow, advancing likelihood estimation and uncertainty quantification in complex SDEs.
Score-fPINN (Score-based Physics-Informed Neural Network) is a framework that builds physical priors derived from the Fokker-Planck (FP) and Fokker-Planck–Lévy (FPL) equations into the training objectives of score-based neural networks. By enforcing PDE residuals on the evolution of log-densities or their gradients ("scores"), Score-fPINN achieves numerically robust and scalable approximation of high-dimensional, time-dependent stochastic dynamics, mitigating the curse of dimensionality and the numerical underflow typical of direct density-based methods (Lai et al., 2022, Hu et al., 2024, Hu et al., 2024).
1. Mathematical Foundations
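Score-fPINN is built on stochastic differential equations (SDEs) whose solutions' time-dependent densities, $p_t(x)$, evolve according to FP or FPL-type PDEs. For a standard Itô SDE on $\mathbb{R}^d$:
$$dX_t = f(X_t, t)\,dt + G(X_t, t)\,dW_t,$$
the Fokker-Planck equation reads
$$\partial_t p_t(x) = -\nabla_x \cdot \big[f(x,t)\,p_t(x)\big] + \frac{1}{2}\sum_{i,j=1}^{d} \partial_{x_i}\partial_{x_j}\big[(GG^\top)_{ij}(x,t)\,p_t(x)\big].$$
For $\alpha$-stable Lévy-driven processes,
$$dX_t = f(X_t, t)\,dt + G(X_t, t)\,dW_t + \sigma(t)\,dL_t^{\alpha},$$
the forward FPL equation incorporates the fractional Laplacian:
$$\partial_t p_t(x) = -\nabla_x \cdot \big[f(x,t)\,p_t(x)\big] + \frac{1}{2}\sum_{i,j=1}^{d} \partial_{x_i}\partial_{x_j}\big[(GG^\top)_{ij}(x,t)\,p_t(x)\big] - \sigma(t)^{\alpha}\,(-\Delta)^{\alpha/2} p_t(x).$$
Defining the fractional score,
$$S_t^{(\alpha)}(x) = \frac{(-\Delta)^{\frac{\alpha-2}{2}}\,\nabla_x p_t(x)}{p_t(x)},$$
allows a transformation of the FPL into a standard second-order PDE for $\log p_t(x)$, circumventing direct computation of exponentially small or nonlocal densities (Hu et al., 2024). For $\alpha = 2$ (pure Brownian), this reduces to the classical score $\nabla_x \log p_t(x)$.
The score evolution is governed by a score PDE, derived by differentiating the Fokker-Planck equation in space and recasting in terms of the score function $s_t(x) = \nabla_x \log p_t(x)$:
$$\partial_t s_t(x) = \nabla_x\Big[\tfrac{1}{2}\,\nabla_x \cdot \big(GG^\top s_t\big) + \tfrac{1}{2}\,\big\|G^\top s_t\big\|^2 - \langle A, s_t\rangle - \nabla_x \cdot A\Big]$$
with $A(x,t) = f(x,t) - \tfrac{1}{2}\,\nabla_x \cdot \big[G(x,t)\,G(x,t)^\top\big]$ (Hu et al., 2024).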
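The step from the Fokker-Planck equation to the score PDE can be spelled out in three lines. This is a sketch under stated assumptions: the SDE is taken as $dX_t = f\,dt + G\,dW_t$, and the symbols $q_t = \log p_t$, $s_t = \nabla_x q_t$, and $A = f - \tfrac{1}{2}\nabla_x \cdot (GG^\top)$ are notation chosen here for illustration.

```latex
\begin{align*}
% 1. FP equation in divergence form, using
%    \sum_j \partial_{x_j}[(GG^\top)_{ij} p_t] = [\nabla_x \cdot (GG^\top)]_i \, p_t + (GG^\top \nabla_x p_t)_i
\partial_t p_t
  &= -\nabla_x \cdot (A\, p_t)
     + \tfrac{1}{2}\, \nabla_x \cdot \big(GG^\top \nabla_x p_t\big), \\
% 2. Substitute \nabla_x p_t = p_t \nabla_x q_t and divide by p_t
\partial_t q_t
  &= \tfrac{1}{2}\, \nabla_x \cdot \big(GG^\top \nabla_x q_t\big)
     + \tfrac{1}{2}\, \big\| G^\top \nabla_x q_t \big\|^2
     - \langle A, \nabla_x q_t \rangle
     - \nabla_x \cdot A, \\
% 3. Apply \nabla_x to both sides and write s_t = \nabla_x q_t
\partial_t s_t
  &= \nabla_x \Big[ \tfrac{1}{2}\, \nabla_x \cdot \big(GG^\top s_t\big)
     + \tfrac{1}{2}\, \big\| G^\top s_t \big\|^2
     - \langle A, s_t \rangle
     - \nabla_x \cdot A \Big].
\end{align*}
```

The second line is the log-density PDE; every term in it is on the scale of $q_t$ rather than $p_t$, which is why it remains well conditioned when the density itself becomes very small.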
2. Score Learning and Physics-Informed Losses
Score-fPINN interlaces two core concepts:
- Score-based modeling: neural networks parameterize the family of time-conditional scores and are trained to approximate $\nabla_x \log p_t(x)$.
- PINN-residual supervision: physical constraints are imposed by penalizing violations of the score PDE by the network outputs.
Multiple loss functions are used:
- Score Matching (SM): minimize the MSE between the score network $s_\theta(x, t)$ and the conditional score $\nabla_x \log p_{0t}(x \mid x_0)$, using known transition kernels $p_{0t}(x \mid x_0)$ where available.
- Sliced Score Matching (SSM): directly minimize a divergence-regularized loss, estimating the divergence of the score network with Hutchinson's trace estimator; this requires only samples from $p_t(x)$.
- Score-fPINN residual: penalize deviation from the score PDE at collocation points; in the fractional case, the network is trained to satisfy the transformed score PDE, with derivatives computed by automatic differentiation.
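The sliced score matching idea can be illustrated with a small NumPy experiment. This is a minimal sketch, not the authors' implementation: the Gaussian target, the `ssm_loss` helper, and the use of central differences in place of an autodiff Jacobian-vector product are all assumptions made for illustration. The loss is the Hutchinson estimate of the divergence plus half the squared norm of the score.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 50_000

# Assumed toy target: zero-mean Gaussian with diagonal covariance (illustrative values).
variances = np.linspace(0.5, 2.0, d)
x = rng.normal(size=(n, d)) * np.sqrt(variances)

def true_score(y):
    # Exact score of the toy Gaussian: grad log p(y) = -y / variances.
    return -y / variances

def ssm_loss(score_fn, y, rng, eps=1e-4):
    # One Rademacher probe v per sample; v^T J v is an unbiased estimate of tr(J) = div(s).
    v = rng.choice([-1.0, 1.0], size=y.shape)
    # Directional derivative J v by central differences (stand-in for an autodiff JVP).
    jv = (score_fn(y + eps * v) - score_fn(y - eps * v)) / (2.0 * eps)
    div_est = np.sum(v * jv, axis=1)
    s = score_fn(y)
    return np.mean(div_est + 0.5 * np.sum(s**2, axis=1))

loss_true = ssm_loss(true_score, x, rng)
loss_scaled = ssm_loss(lambda y: 1.5 * true_score(y), x, rng)
print("SSM loss, exact score     :", loss_true)
print("SSM loss, mis-scaled score:", loss_scaled)
```

Only samples `x` enter the loss; no transition kernel or density value is needed, which is the property that makes SSM usable when the law of the SDE is unknown.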
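Hybrid objectives combine the data-driven loss (e.g., denoising score matching) with the PINN-style PDE residual, typically:
$$\mathcal{L}(\theta) = \mathcal{L}_{\mathrm{SM}}(\theta) + \lambda\,\mathcal{L}_{\mathrm{PDE}}(\theta),$$
where $\lambda > 0$ balances fit and physics.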
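A hybrid objective of this kind, a data term plus a weighted PDE residual, can be sketched for a 1D Ornstein-Uhlenbeck process, where both the transition kernel and the marginal score are known in closed form. Everything below is an illustrative assumption: the parameter values, the weight `lam`, the helper names, and the use of finite differences where a real implementation would use automatic differentiation.

```python
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, m0, v0 = 1.0, 1.0, 1.0, 1.0   # illustrative OU parameters
n, lam, h = 100_000, 1.0, 1e-3              # sample count, loss weight, FD step

def marginal_score(x, t):
    # Exact score of the OU marginal N(m(t), v(t)); plays the role of a perfect model.
    decay = np.exp(-2.0 * theta * t)
    m = m0 * np.exp(-theta * t)
    v = v0 * decay + sigma**2 / (2.0 * theta) * (1.0 - decay)
    return -(x - m) / v

def sample_pairs(n):
    # Draw (x_t, t) and the conditional score from the known Gaussian transition kernel.
    x0 = m0 + np.sqrt(v0) * rng.normal(size=n)
    t = rng.uniform(0.1, 1.0, size=n)
    cond_var = sigma**2 / (2.0 * theta) * (1.0 - np.exp(-2.0 * theta * t))
    xt = x0 * np.exp(-theta * t) + np.sqrt(cond_var) * rng.normal(size=n)
    cond_score = -(xt - x0 * np.exp(-theta * t)) / cond_var
    return xt, t, cond_score

def pde_residual(model, x, t):
    # Score PDE residual for drift -theta*x and constant diffusion sigma:
    # s_t - d/dx [0.5*sigma^2*(s_x + s^2) + theta*x*s + theta].
    def bracket(y):
        s = model(y, t)
        s_x = (model(y + h, t) - model(y - h, t)) / (2.0 * h)
        return 0.5 * sigma**2 * (s_x + s**2) + theta * y * s + theta
    s_t = (model(x, t + h) - model(x, t - h)) / (2.0 * h)
    return s_t - (bracket(x + h) - bracket(x - h)) / (2.0 * h)

def hybrid_loss(model, xt, t, cond_score):
    sm = np.mean((model(xt, t) - cond_score) ** 2)    # data term
    pde = np.mean(pde_residual(model, xt, t) ** 2)    # physics term
    return sm + lam * pde, sm, pde

xt, t, cond_score = sample_pairs(n)
exact = hybrid_loss(marginal_score, xt, t, cond_score)
scaled = hybrid_loss(lambda y, u: 1.5 * marginal_score(y, u), xt, t, cond_score)
print("exact score      (total, SM, PDE):", exact)
print("mis-scaled score (total, SM, PDE):", scaled)
```

The exact score drives the physics term to (numerically) zero, while the score matching term stays positive because the conditional score is a noisy target; a mis-scaled score is penalized by both terms.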
3. Implementation: Network Architecture and Training
Score-fPINN utilizes feedforward neural networks:
- Score network: $s_\theta(x, t)$ or, in the fractional case, $S^{(\alpha)}_\theta(x, t)$, with 4 layers, width 128–512, and a smooth activation (typically tanh).
- Log-likelihood network: $q_\phi(x, t)$ approximates $\log p_t(x)$, with a similar structure.
- Input encoding: $x$ and $t$ are concatenated; outputs are vector-valued (score) or scalar (LL).
- Hard-constraint initialization: networks may be constructed so that $s_\theta(x, 0) = \nabla_x \log p_0(x)$ and $q_\phi(x, 0) = \log p_0(x)$ by design.
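One way to build the initial condition into the architecture is to add a time-scaled network output to the known initial quantity. The NumPy sketch below assumes a standard Gaussian initial law, reads "4 layers" as four hidden layers, and uses the form `initial + t * NN([x, t])`; these choices and all function names are illustrative assumptions rather than the published architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d, width, depth = 8, 128, 4   # input dimension, hidden width, number of hidden layers

def init_mlp(in_dim, out_dim):
    # Scaled Gaussian initialization for a tanh MLP (illustrative choice).
    sizes = [in_dim] + [width] * depth + [out_dim]
    return [(rng.normal(size=(m, n)) * np.sqrt(1.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, z):
    # tanh on hidden layers, linear output layer.
    for W, b in params[:-1]:
        z = np.tanh(z @ W + b)
    W, b = params[-1]
    return z @ W + b

def initial_score(x):
    # Assumed initial law: standard Gaussian, so grad log p_0(x) = -x.
    return -x

def initial_logp(x):
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * x.shape[1] * np.log(2.0 * np.pi)

def score_net(params, x, t):
    # s(x, t) = s_0(x) + t * NN([x, t]) matches the initial score exactly at t = 0.
    z = np.concatenate([x, t[:, None]], axis=1)
    return initial_score(x) + t[:, None] * mlp(params, z)

def logp_net(params, x, t):
    # q(x, t) = log p_0(x) + t * NN([x, t]) matches the initial log-density at t = 0.
    z = np.concatenate([x, t[:, None]], axis=1)
    return initial_logp(x) + t * mlp(params, z)[:, 0]

score_params = init_mlp(d + 1, d)
logp_params = init_mlp(d + 1, 1)
x = rng.normal(size=(5, d))
t0 = np.zeros(5)
print(np.allclose(score_net(score_params, x, t0), initial_score(x)))
print(logp_net(logp_params, x, np.full(5, 0.3)).shape)
```

With this construction no loss term is needed for the initial condition, so the optimizer only has to balance the data and PDE-residual terms.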
Sampling and collocation details:
- Residual points $(x, t)$ are generated by simulating SDE trajectories (Euler–Maruyama) or by sampling from known transition distributions.
- The Huber (smooth $L_1$) loss is often used instead of the $L_2$ loss to robustify training against rare events in heavy-tailed processes (Hu et al., 2024).
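The two points above can be sketched together: trajectories are simulated with Euler–Maruyama, random space-time pairs are drawn from them as collocation points, and residuals are scored with a Huber loss. The drift, diffusion, step count, and helper names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def euler_maruyama(drift, diffusion, x0, t_end, n_steps):
    # Simulate dX = drift(X, t) dt + diffusion(t) dW and keep every time slice.
    dt = t_end / n_steps
    xs, ts = [x0], [0.0]
    x = x0
    for k in range(n_steps):
        t = k * dt
        x = x + drift(x, t) * dt + diffusion(t) * np.sqrt(dt) * rng.normal(size=x.shape)
        xs.append(x)
        ts.append(t + dt)
    return np.stack(xs), np.array(ts)   # shapes (n_steps + 1, n_paths, d) and (n_steps + 1,)

def sample_collocation(xs, ts, batch_size):
    # Pick random (time index, path index) pairs from the simulated trajectories.
    k = rng.integers(0, len(ts), size=batch_size)
    i = rng.integers(0, xs.shape[1], size=batch_size)
    return xs[k, i], ts[k]

def huber(residual, delta=1.0):
    # Quadratic near zero, linear in the tails (smooth L1).
    a = np.abs(residual)
    return np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta))

d, n_paths = 10, 2000
x0 = rng.normal(size=(n_paths, d))
xs, ts = euler_maruyama(lambda x, t: -x, lambda t: np.sqrt(2.0), x0, t_end=1.0, n_steps=100)
x_col, t_col = sample_collocation(xs, ts, batch_size=1000)
print(x_col.shape, t_col.shape)
print(huber(np.array([0.5, 10.0])))
```

Because the Huber loss grows only linearly for large residuals, a few extreme collocation points from heavy tails contribute bounded gradients instead of dominating the batch.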
Typical hyperparameters include the Adam optimizer (learning rate 1e-3, decayed by a factor of 0.9 every 10k epochs), batches of 1,000–10,000 residual points, and training for 10k–100k epochs.
4. Fractional Score-fPINN for FPL Equations
Extending to FPL equations, Score-fPINN incorporates the fractional score $S_t^{(\alpha)}(x)$ (Hu et al., 2024):
- FSM (Fractional Score Matching): When transition distributions are known, $S_t^{(\alpha)}$ can be learned by matching to conditional draws; this is computationally cheap but limited in applicability.
- Score-fPINN (fractional): Uses a two-stage process. First, the vanilla score $\nabla_x \log p_t(x)$ is estimated; second, the fractional score is fit by enforcing the relevant PDE via PINN residuals, allowing application to general SDEs where transitions are unknown.
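When transitions are unknown, training samples have to come from simulating the Lévy-driven SDE itself. The sketch below draws symmetric $\alpha$-stable increments with the Chambers-Mallows-Stuck method and takes Euler steps. It is restricted to one dimension as a loud assumption: applying it component-wise does not produce the isotropic $\alpha$-stable noise associated with the fractional Laplacian in higher dimensions, which needs a different construction not shown here. Function names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def symmetric_stable(alpha, size):
    # Chambers-Mallows-Stuck sampler for a standard symmetric alpha-stable variable
    # with characteristic function exp(-|k|^alpha), for 0 < alpha <= 2.
    u = rng.uniform(-np.pi / 2.0, np.pi / 2.0, size=size)
    w = rng.exponential(1.0, size=size)
    return (np.sin(alpha * u) / np.cos(u) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))

def levy_euler_step(x, t, dt, drift, sigma, alpha):
    # One Euler step of dX = drift dt + sigma(t) dL^alpha (1D only);
    # stable increments over a step dt scale as dt^(1/alpha).
    xi = symmetric_stable(alpha, x.shape)
    return x + drift(x, t) * dt + sigma(t) * dt ** (1.0 / alpha) * xi

# Illustrative run: linear drift toward zero with alpha = 1.5 noise.
x = np.zeros(10_000)
dt, alpha = 0.01, 1.5
for k in range(100):
    x = levy_euler_step(x, k * dt, dt, lambda y, s: -y, lambda s: 1.0, alpha)
print("median |x| at t = 1:", np.median(np.abs(x)))
```

Two sanity checks follow from the formula: at $\alpha = 2$ the sampler returns Gaussian draws with variance 2, and at $\alpha = 1$ it returns standard Cauchy draws.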
After learning $S_t^{(\alpha)}$, the log-likelihood PDE, which is now a standard second-order PDE without a fractional Laplacian, is solved using a dedicated PINN. This separation allows explicit, mesh-free, dimension-robust learning, bypassing the exponential decay of $p_t(x)$ in high dimensions.
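The reason the fractional Laplacian disappears can be written out in a few lines. The derivation below is a sketch under stated assumptions: the fractional score is taken as $S_t^{(\alpha)} = (-\Delta)^{\frac{\alpha-2}{2}} \nabla_x p_t / p_t$, the SDE as $dX_t = f\,dt + G\,dW_t + \sigma(t)\,dL_t^{\alpha}$, and $q_t = \log p_t$; the symbol $A^{(\alpha)}$ is introduced here for illustration, and sign conventions may differ from those in the cited papers.

```latex
\begin{align*}
% Fourier multipliers commute with derivatives, and -\Delta = -\nabla_x \cdot \nabla_x
(-\Delta)^{\alpha/2} p_t
  &= -\nabla_x \cdot \Big[ (-\Delta)^{\frac{\alpha-2}{2}} \nabla_x p_t \Big]
   = -\nabla_x \cdot \big( p_t \, S_t^{(\alpha)} \big), \\
% Substitute into the FPL equation: the nonlocal term becomes a drift correction
\partial_t p_t
  &= -\nabla_x \cdot \Big[ \big( f - \sigma(t)^{\alpha} S_t^{(\alpha)} \big)\, p_t \Big]
     + \tfrac{1}{2} \sum_{i,j} \partial_{x_i} \partial_{x_j}
       \big[ (GG^\top)_{ij} \, p_t \big], \\
% Log-likelihood PDE, with A^{(\alpha)} = f - \sigma(t)^{\alpha} S_t^{(\alpha)} - \tfrac{1}{2} \nabla_x \cdot (GG^\top)
\partial_t q_t
  &= \tfrac{1}{2}\, \nabla_x \cdot \big( GG^\top \nabla_x q_t \big)
     + \tfrac{1}{2}\, \big\| G^\top \nabla_x q_t \big\|^2
     - \big\langle A^{(\alpha)}, \nabla_x q_t \big\rangle
     - \nabla_x \cdot A^{(\alpha)}.
\end{align*}
```

Once a network approximates $S_t^{(\alpha)}$, the last line involves only local first- and second-order derivatives of $q_t$, so it can be handled by a standard PINN with automatic differentiation.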
5. Comparative Analysis: Efficiency, Accuracy, and Scaling
A comparison of approaches for training the score function and log-likelihood reveals the following (Hu et al., 2024, Hu et al., 2024):
| Method | Applicability | Computational Cost | Fastest Regime | Key Limitation |
|---|---|---|---|---|
| SM / FSM | Transition kernel known | Low | High-dim OU/GBM etc. | Not general SDEs with unknown law |
| SSM | Transition kernel unknown | Moderate | High-dim, moderately general | Slower than SM in simple cases |
| Score-fPINN | Arbitrary SDE, fractional/Brownian | High | Nontrivial drifts, FPL | Higher automatic-differentiation and residual-evaluation costs |
- In the reported high-dimensional experiments, Score-fPINN achieves small relative errors, with cost and error growing sublinearly in the dimension $d$. FSM is typically 3–50× faster when applicable.
- Score-fPINN mitigates the "curse of dimensionality" by working with $O(1)$-scale score functions and mesh-free residuals. Direct density or log-likelihood-based PINN solvers experience error blow-up due to numerical underflow and the degeneracy of $p_t(x)$ as $d \to \infty$ (Hu et al., 2024).
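The underflow problem is easy to reproduce. In the sketch below, a standard Gaussian in 1000 dimensions is used as an assumed stand-in for $p_t$: the density evaluates to exactly zero in double precision, while the log-density stays finite and the score components stay of order one.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000
x = rng.normal(size=d)                       # a typical sample from N(0, I_d)

log_p = -0.5 * np.sum(x**2) - 0.5 * d * np.log(2.0 * np.pi)
p = np.exp(log_p)                            # underflows to exactly 0.0 in float64
score = -x                                   # grad log p(x) for the standard Gaussian

print(f"log p(x) = {log_p:.1f}")
print(f"p(x)     = {p}")
print(f"max |score component| = {np.max(np.abs(score)):.2f}")
```

A solver that regresses on `p` directly has no signal left to fit at this dimension, whereas targets built from `score` or `log_p` remain representable.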
6. Theoretical Properties and Empirical Results
Score-fPINN has several theoretical and practical advantages:
- Self-consistency: Enforcing the score PDE ensures the learned family of scores is temporally consistent across the entire solution trajectory, not just a collection of marginal fits at each $t$.
- Likelihood and KL bounds: Reducing the score-PDE residual tightens an upper bound on the KL divergence between joint and learned distributions, implying improved log-likelihood (see Lemma 4.1 and Theorem 4.2 in Lai et al., 2022).
- Conservativity: Small PDE residual implies the learned score field is nearly conservative, paralleling properties of true Fokker-Planck solutions.
- Empirical validation: Experiments on high-dimensional SDEs, for both Brownian and Lévy-driven systems, demonstrate stable, accurate estimation of log-likelihoods and densities; Score-fPINN remains effective in heavy-tailed and nonlinear drift regimes (e.g., nonlinear drift/OU process), where FSM is inapplicable.
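The conservativity property can be checked numerically: a field that is the gradient of a scalar function has a symmetric Jacobian. The sketch below measures Jacobian asymmetry by finite differences for two hand-written 2D fields; the fields, the `asymmetry` helper, and the step size are illustrative assumptions, and a trained network would normally be probed with automatic differentiation instead.

```python
import numpy as np

def jacobian_fd(field, x, h=1e-5):
    # Central-difference Jacobian J[i, j] = d field_i / d x_j at a single point x.
    d = x.size
    J = np.zeros((d, d))
    for j in range(d):
        e = np.zeros(d)
        e[j] = h
        J[:, j] = (field(x + e) - field(x - e)) / (2.0 * h)
    return J

def asymmetry(field, x):
    # Frobenius norm of J - J^T; zero for a conservative (gradient) field.
    J = jacobian_fd(field, x)
    return np.linalg.norm(J - J.T)

P = np.array([[2.0, 0.5], [0.5, 1.0]])    # precision matrix of a toy Gaussian
R = np.array([[0.0, 1.0], [-1.0, 0.0]])   # generator of a planar rotation

def gaussian_score(x):
    # Gradient of -0.5 * x^T P x, hence conservative.
    return -P @ x

def rotated_field(x):
    # Same field plus a rotational component, hence not a gradient.
    return -P @ x + 0.3 * R @ x

x = np.array([0.7, -1.2])
print("asymmetry, Gaussian score:", asymmetry(gaussian_score, x))
print("asymmetry, rotated field :", asymmetry(rotated_field, x))
```

A diagnostic of this kind can be averaged over collocation points to monitor how close a learned score field is to a true gradient field during training.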
7. Limitations and Extensions
Despite its scalability, Score-fPINN is constrained by the computational expense of automatic differentiation for high-order residuals and by the stochastic simulation of $\alpha$-stable Lévy increments, which is the computational bottleneck in FPL settings. Hybrid techniques, e.g., combining trace estimation and surrogate modeling, are avenues for cost reduction (Hu et al., 2024). Extension to inverse problems, where the drift $f$ or the diffusion $G$ is unknown, remains an open topic. Robust loss designs improve precision and stability in the presence of rare-event tails, but further work is needed on efficient high-dimensional sampling and on identifying solutions of inverse problems.
Score-fPINN and its variants formalize a unifying, physics-informed, dimension-robust paradigm for machine learning solutions of the high-dimensional PDEs that govern stochastic dynamics, with practical applications in generative modeling, uncertainty quantification, and forward uncertainty propagation in complex, time-dependent systems (Lai et al., 2022, Hu et al., 2024, Hu et al., 2024).