
Curriculum Learning via PINNs (CLIP)

Updated 31 January 2026
  • The paper introduces a physics-guided curriculum that sequentially trains PINNs from reaction-dominated regimes to full spatiotemporal PDEs, improving inference accuracy.
  • It employs an anchored widening transfer to preserve learned reaction dynamics while gradually incorporating diffusion effects for robust parameter and state recovery.
  • Empirical assessments show significant MRAE reductions over baseline methods in canonical systems such as λ–ω, Gray–Scott, and Lotka–Volterra RD models.

Curriculum Learning Identification via PINNs (CLIP) is a physics-guided framework for parameter identification and state reconstruction in partially observed reaction–diffusion (RD) systems. CLIP leverages the physical separability inherent in RD models by structuring neural network training as a curriculum—progressing from reaction-dominated regimes to the full spatiotemporal PDE, and utilizing an anchored widening transfer strategy to enhance convergence and robustness. The method is implemented using physics-informed neural networks (PINNs) and achieves substantial accuracy improvements over baseline techniques in canonical and high-dimensional biological applications (Zhou et al., 24 Jan 2026).

1. Reaction–Diffusion System Identification: Formulation

For a system of $N$ reaction–diffusion components on a domain $\Omega\subset\mathbb{R}^d$ over $t\in[0,T]$, the governing PDE is

$$\frac{\partial \mathbf{u}}{\partial t} = \boldsymbol{D}\,\Delta \mathbf{u} + \mathcal{F}\bigl(\mathbf{u};\boldsymbol{\kappa}\bigr), \quad (\mathbf{x},t)\in\Omega\times[0,T],$$

where $\mathbf{u}=(u_1,\dots,u_N)^\top$ denotes the state vector, $\boldsymbol{D} = \operatorname{diag}(D_1,\dots,D_N)$ the unknown diffusion coefficients, and $\mathcal{F}$ the nonlinear reaction term parameterized by rates $\boldsymbol{\kappa}\in\mathbb{R}^P$.

Identification is complicated by partial observation: only a subset $I_{\rm obs}$ of state variables is measured at sensor points $\{(\mathbf{x}_k,t_k)\}_{k=1}^{N_k}$,

$$u_{\rm obs}^i(\mathbf{x}_k,t_k) = u_i(\mathbf{x}_k,t_k) + \varepsilon_k^i, \quad i\in I_{\rm obs},$$

where $\varepsilon_k^i$ denotes additive noise. The joint recovery task is to infer the hidden fields $u_j$ ($j\in I_{\rm hid}$) as well as the unknown parameters $(\boldsymbol{D}, \boldsymbol{\kappa})$.
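The observation model above can be sketched in a few lines of numpy. Everything below is an illustrative assumption (grid sizes, a placeholder two-component field, 2% sensor coverage, 5% Gaussian noise), not the paper's setup:

```python
import numpy as np

# Hedged sketch of the partial-observation model: sample a two-component
# field on a space-time grid, keep only component 0 as "observed", and add
# i.i.d. Gaussian noise at the sensor points.
rng = np.random.default_rng(0)

nx, nt = 64, 50
x = np.linspace(0.0, 1.0, nx)
t = np.linspace(0.0, 1.0, nt)
X, T = np.meshgrid(x, t, indexing="ij")

# Placeholder ground-truth fields u_1, u_2 (not solutions of a specific RD model).
u = np.stack([np.sin(np.pi * X) * np.exp(-T), np.cos(np.pi * X) * np.exp(-T)])

# Partial observation: sensors at ~2% of space-time points, component 0 only.
n_sensors = int(0.02 * nx * nt)
idx = rng.choice(nx * nt, size=n_sensors, replace=False)
xk, tk = np.unravel_index(idx, (nx, nt))

noise_level = 0.05  # 5% additive noise, scaled by the field's std
obs = u[0, xk, tk] + noise_level * u[0].std() * rng.normal(size=n_sensors)
```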

2. Physics-Informed Neural Network Architecture

CLIP employs a PINN to approximate the full state $\mathbf{u}(\mathbf{x},t)$ by a neural map

$$\hat{\mathbf{u}}_\theta(\mathbf{x},t) = \left(\hat u_\theta^1, \dotsc, \hat u_\theta^N\right)^\top,$$

where $\theta$ denotes the trainable parameters. The architecture is defined by:

  • A shared trunk multilayer perceptron (MLP) of depth 3 and width $W$, using smooth activations (e.g., $\sin$ or mixed sine–ReLU functions).
  • $N$ dedicated branches (1–2 layers each) mapping trunk features to each output variable.
  • An auxiliary MLP surrogate that smooths the observed data to produce robust Laplacian estimates for curriculum masking; this surrogate is used only in the mask calculation, not in the PINN losses.
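The trunk-and-branch layout can be sketched in plain numpy. Widths, the single-layer branches, and random initialization are illustrative assumptions; a real implementation would use an autodiff framework:

```python
import numpy as np

# Minimal sketch of the trunk-and-branch PINN layout: a shared 3-layer trunk
# with sin activations feeds N small per-variable output branches.
rng = np.random.default_rng(4)

d_in, W, N = 3, 16, 2  # (x, y, t) input, trunk width, number of components

trunk = [(rng.normal(size=(W, d_in)), rng.normal(size=W)),
         (rng.normal(size=(W, W)), rng.normal(size=W)),
         (rng.normal(size=(W, W)), rng.normal(size=W))]
branches = [(rng.normal(size=(1, W)), rng.normal(size=1)) for _ in range(N)]

def u_hat(xt):
    h = xt
    for A, b in trunk:  # shared trunk, depth 3, sin activation
        h = np.sin(A @ h + b)
    return np.concatenate([A @ h + b for A, b in branches])  # N outputs

y = u_hat(rng.normal(size=d_in))  # one state estimate of shape (N,)
```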

PDE residuals for each component $i$ are computed via automatic differentiation,

$$\mathcal{R}_i(\mathbf{x},t) = \partial_t \hat u^i_\theta - D_i\,\Delta \hat u^i_\theta - \mathcal{F}^i\left(\hat{\mathbf{u}}_\theta;\boldsymbol{\kappa}\right).$$
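A self-contained way to see the residual at work is to evaluate it on a known analytic solution, substituting finite differences for the paper's automatic differentiation (an assumption made purely to keep the sketch dependency-free):

```python
import numpy as np

# Residual R = u_t - D*u_xx - F(u; kappa) for one component with linear
# reaction F(u) = kappa*u, checked on the analytic solution
#   u(x,t) = exp((kappa - D*pi^2) t) * sin(pi x)
# of u_t = D u_xx + kappa*u, for which the residual vanishes identically.
D, kappa = 0.1, 0.5

x = np.linspace(0.0, 1.0, 201)
t = np.linspace(0.0, 1.0, 201)
X, T = np.meshgrid(x, t, indexing="ij")
u = np.exp((kappa - D * np.pi**2) * T) * np.sin(np.pi * X)

u_t = np.gradient(u, t, axis=1)                            # d/dt
u_xx = np.gradient(np.gradient(u, x, axis=0), x, axis=0)   # d^2/dx^2

residual = u_t - D * u_xx - kappa * u  # ~0 up to discretization error
interior = np.abs(residual[2:-2, 2:-2]).max()
```

In CLIP itself $\hat u_\theta$ replaces the analytic $u$, and the derivatives come from autodiff, so the residual is exact in the derivatives and nonzero only where the network misfits the PDE.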

3. Curriculum Learning: Multi-stage Training Workflow

CLIP divides training into three sequential stages:

3.1 Reaction-Dominated Initialization (Stage 0)

A reaction-dominated mask $M(\mathbf{x},t)$ is constructed via Laplacian thresholding on the surrogate-smoothed observed fields. Only points with negligible diffusion (as measured by normalized Laplacian magnitude) are sampled, so that the local dynamics are effectively ODE-driven. Stage 0 sets the diffusion coefficients to zero (or applies a small PDE weight) and restricts optimization to the reaction kinetics and the initialization of the hidden state fields.

The loss in Stage 0 is

$$\mathcal{L} = \mathcal{L}_\text{data}^{(M)} + \eta_\text{pde}\, \mathcal{L}_\text{pde}^{(M)} + \mathcal{L}_\text{ic},$$

where the superscript $(M)$ indicates masking by $M(\mathbf{x},t)$.
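A 1-D sketch of the mask construction, assuming an illustrative field (flat plateau plus a sharp bump) and an arbitrary 5% threshold rather than the paper's settings:

```python
import numpy as np

# Stage-0 reaction-dominated mask: threshold the normalized Laplacian
# magnitude of a (pre-smoothed) observed field, keeping only points where
# the diffusion term is negligible.
x = np.linspace(0.0, 1.0, 101)
dx = x[1] - x[0]

# Surrogate-smoothed field: flat plateau (reaction-dominated) plus a sharp
# Gaussian bump (diffusion-dominated) centered at x = 0.5.
u = 1.0 + np.exp(-((x - 0.5) ** 2) / (2 * 0.02**2))

lap = np.zeros_like(u)
lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2  # 1-D Laplacian stencil

# Normalize and keep points where the Laplacian (hence diffusion) is small.
lap_norm = np.abs(lap) / (np.abs(lap).max() + 1e-12)
mask = lap_norm < 0.05  # illustrative threshold
```

On this example the plateau is kept while the bump's core is masked out, which is exactly the set of points where the local dynamics are effectively ODE-driven.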

3.2 Anchored Widening Transfer (Stage 1)

After reaction-only pre-training, diffusion terms are re-introduced. To preserve previously learned reaction dynamics, the network width is enlarged by adding new neurons (anchored widening), and two optimizers are employed: inherited parameters (from Stage 0) are updated with tiny learning rates, while new parameters and diffusion coefficients train with standard rates. This anchoring ensures that coupling dynamics are absorbed by the increased network capacity without destroying the reaction sub-solutions.
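The key invariant of anchored widening is that, immediately after widening, the network computes exactly the same function as before. A minimal sketch for one hidden layer (sizes and initialization are illustrative assumptions):

```python
import numpy as np

# Anchored widening: append new hidden neurons whose OUTGOING weights start
# at zero, so the widened network initially reproduces the Stage-0 function.
rng = np.random.default_rng(1)

d_in, width, d_out, extra = 2, 8, 1, 4

# Stage-0 (inherited) parameters.
W1 = rng.normal(size=(width, d_in));  b1 = rng.normal(size=width)
W2 = rng.normal(size=(d_out, width)); b2 = rng.normal(size=d_out)

def forward(x, W1, b1, W2, b2):
    return W2 @ np.sin(W1 @ x + b1) + b2  # sin activation, as in the paper

# Widen: new hidden rows get fresh incoming weights, outgoing weights are 0.
W1_wide = np.vstack([W1, rng.normal(size=(extra, d_in))])
b1_wide = np.concatenate([b1, rng.normal(size=extra)])
W2_wide = np.hstack([W2, np.zeros((d_out, extra))])

x = rng.normal(size=d_in)
y_old = forward(x, W1, b1, W2, b2)
y_new = forward(x, W1_wide, b1_wide, W2_wide, b2)  # identical to y_old
```

During Stage 1 the inherited block (`W1`, `b1`, `W2`, `b2`) would be assigned to an optimizer with a tiny learning rate and the new block to one with a standard rate, matching the two-optimizer scheme described above.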

3.3 Global Fine-Tuning with Adaptive Sampling (Stage 2)

All network and parameter weights are then jointly fine-tuned at a moderate learning rate. To resolve sharp spatiotemporal features, residual-based adaptive distribution (RAD) sampling augments the training set: new collocation points are selected in proportion to the normalized residual magnitudes $\varepsilon_k^i = |\mathcal{R}_i(\mathbf{x}_k,t_k)|$, improving effective coverage of interface and steep-gradient regions.
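The proportional selection step amounts to sampling candidate points from a categorical distribution built from the residuals. A sketch with synthetic stand-in residual values:

```python
import numpy as np

# RAD sampling sketch: draw new collocation points with probability
# proportional to residual magnitude; one point has a much larger residual
# (a stand-in for a steep-gradient or interface region).
rng = np.random.default_rng(2)

residuals = np.array([0.01, 0.02, 5.0, 0.03, 0.01])  # synthetic |R_i| values
p = residuals / residuals.sum()                       # normalized distribution

new_points = rng.choice(len(residuals), size=1000, p=p)
frac_hot = np.mean(new_points == 2)  # fraction drawn at the high-residual site
```

Most of the new collocation budget concentrates on the high-residual site, which is the intended effect of RAD.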

4. Training Objective and Loss Functions

CLIP utilizes a composite loss function incorporating data matching, PDE residuals, initial condition enforcement, and (optionally) anchoring:

$$\mathcal{L} = \mathcal{L}_\text{data} + \eta_\text{pde}\, \mathcal{L}_\text{pde} + \mathcal{L}_\text{ic}$$

Definitions:

  • Data mismatch:

$$\mathcal{L}_\text{data} = \frac{1}{N_m N_k} \sum_{i\in I_\text{obs}} \sum_{k=1}^{N_k} \left| \hat u_\theta^i(\mathbf{x}_k, t_k) - u_\text{obs}^i(\mathbf{x}_k, t_k) \right|$$

  • Physics residual:

$$\mathcal{L}_\text{pde} = \frac{1}{N N_k} \sum_{i=1}^N \sum_{k=1}^{N_k} \left| \mathcal{R}_i(\mathbf{x}_k,t_k) \right|$$

  • Initial condition:

$$\mathcal{L}_\text{ic} = \frac{1}{N N_{k0}} \sum_{i=1}^{N} \sum_{k=1}^{N_{k0}} \left| \hat u_\theta^i(\mathbf{x}_k, t_0) - u_\text{obs}^i(\mathbf{x}_k, t_0) \right|$$

  • (Optional) Anchor regularization:

$$\mathcal{L}_\text{anchor} = \lambda_\text{anc} \sum_{\theta_i\in\theta_\text{inh}} \left\| \theta_i - \theta_i^\text{pre} \right\|^2$$

Loss-weight scheduling ramps $\eta_\text{pde}$ across stages to adaptively balance the physics and data terms.
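The composite loss and the staged ramp can be assembled as below. The mean-absolute-error form mirrors the L1-style losses above, but the ramp endpoints per stage are invented for illustration, not taken from the paper:

```python
import numpy as np

# Composite loss L = L_data + eta_pde * L_pde + L_ic, with a staged ramp on
# eta_pde (Stage 0: physics nearly off; Stage 2: physics fully weighted --
# the specific values here are assumptions).
def eta_pde(stage):
    return {0: 1e-3, 1: 1e-1, 2: 1.0}[stage]

def total_loss(pred, obs, residuals, pred_ic, obs_ic, stage):
    l_data = np.mean(np.abs(pred - obs))          # data mismatch
    l_pde = np.mean(np.abs(residuals))            # physics residual
    l_ic = np.mean(np.abs(pred_ic - obs_ic))      # initial condition
    return l_data + eta_pde(stage) * l_pde + l_ic

pred = np.array([1.0, 2.0]); obs = np.array([1.1, 1.9])
res = np.array([0.5, 0.5]);  ic_p = np.array([0.0]); ic_o = np.array([0.0])
loss0 = total_loss(pred, obs, res, ic_p, ic_o, stage=0)
loss2 = total_loss(pred, obs, res, ic_p, ic_o, stage=2)
```

The same residuals contribute much more to the Stage-2 loss than to the Stage-0 loss, which is the mechanism by which the curriculum defers physics enforcement.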

5. Optimization Techniques and Hyperparameters

PINN weights are initialized with Xavier initialization. Stage 0 employs Adam with learning rate $10^{-3}$; Stage 1 splits inherited and new parameters (learning rates in $[10^{-8},10^{-5}]$ and $10^{-3}$, respectively); Stage 2 trains all parameters at $10^{-4}$. For Min-system identification, reaction rates are optimized in log-space and inputs are rescaled by $c_0=10^5$ for numerical stability. Activation choices vary: $\sin(x)$ for most benchmarks, and a custom $\phi(x)=\max(0,\alpha\sin x)$ for Gray–Scott to resolve sharper pulses.
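Log-space optimization of a positive rate can be illustrated with a toy objective (the quadratic target and learning rate are assumptions; the point is only the $\kappa=e^\theta$ parameterization):

```python
import numpy as np

# Log-space optimization of a positive reaction rate: parameterize
# kappa = exp(theta) and descend on theta, so kappa can never go negative
# regardless of step size. Toy objective: (kappa - 2)^2 with optimum kappa*=2.
theta = np.log(0.5)  # start below the optimum
lr = 0.1

for _ in range(500):
    kappa = np.exp(theta)
    # d/dtheta (kappa - 2)^2 = 2*(kappa - 2) * kappa   (chain rule through exp)
    grad = 2.0 * (kappa - 2.0) * kappa
    theta -= lr * grad

kappa = np.exp(theta)  # approaches 2.0, staying strictly positive throughout
```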

6. Empirical Assessment and Benchmark Results

CLIP was evaluated on three canonical RD systems (λ–ω, Gray–Scott, Lotka–Volterra) and a four-variable Min-protein oscillator in bacterial geometry. Only one (or two) components are observed per system, with hidden variables fully unmeasured.

Training uses approximately 2% of the points of a high-resolution reference solution (downsampled), with noise levels ranging from 0% to 10%. Representative mean relative absolute errors (MRAE) for CLIP and the baselines are:

| System | CLIP MRAE | PINN MRAE | PSO / EnKF |
| --- | --- | --- | --- |
| λ–ω | 5.50% (clean) | 7.75% | 153% / 0.48% |
| Gray–Scott | 9.55% (clean); up to 25.3% (10% noise) | >100% | Failed |
| Lotka–Volterra RD | ≈9.23% (clean) | >2700% | Failed |
| Min-protein oscillator | 18.10% (clean); 23.34% (10% noise) | 72.15% (clean) | — |

For the Min system, CLIP reconstructs unobserved cytosolic fields and membrane-bound time-series matching the amplitude and frequency of ground-truth oscillations.

7. Mechanistic Analysis: Ablation and Loss Landscape

Ablation experiments show incremental improvements with curriculum components:

  • Baseline PINN (no curriculum): high MRAE, poor convergence.
  • +Reaction-only curriculum: moderate improvement.
  • +Anchored widening transfer (full CLIP): order-of-magnitude error reduction.

Visualization of the loss landscape, using PCA trajectories of parameter vectors, reveals that baseline PINNs yield a highly nonconvex terrain with spurious basins trapping the optimizer. The reaction-only stage produces a smoother pathway in parameter space, and the full CLIP scheme leads to a well-conditioned landscape and robust gradient descent toward the optimum.
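The projection step behind such trajectory plots is standard PCA over checkpointed parameter vectors. A sketch with a synthetic random-walk trajectory standing in for saved training checkpoints:

```python
import numpy as np

# Project a trajectory of flattened parameter vectors onto its first two
# principal components via SVD, as done for loss-landscape trajectory plots.
rng = np.random.default_rng(3)

n_checkpoints, n_params = 40, 200
trajectory = np.cumsum(rng.normal(size=(n_checkpoints, n_params)), axis=0)

centered = trajectory - trajectory.mean(axis=0)          # center each parameter
U, S, Vt = np.linalg.svd(centered, full_matrices=False)  # PCs are rows of Vt
coords_2d = centered @ Vt[:2].T                          # (n_checkpoints, 2)
```

Plotting `coords_2d` over a loss surface evaluated on the span of the first two PCs yields the kind of trajectory visualization described above.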


In sum, CLIP offers a physics-tailored three-stage curriculum—reaction-only initialization, anchored widening transfer, and adaptive fine-tuning—to jointly infer hidden states and unknown parameters in RD systems from sparse, noisy partial observations. This approach makes explicit use of physical modularity and produces significant gains in both trainability and accuracy over conventional PINNs, ensemble Kalman filters, and population-based optimizers (Zhou et al., 24 Jan 2026).
