Curriculum Learning Identification via PINNs (CLIP)
- The paper introduces a physics-guided curriculum that sequentially trains PINNs from reaction-dominated regimes to full spatiotemporal PDEs, improving inference accuracy.
- It employs an anchored widening transfer to preserve learned reaction dynamics while gradually incorporating diffusion effects for robust parameter and state recovery.
- Empirical assessments show significant MRAE reductions over baseline methods in canonical systems such as λ–ω, Gray–Scott, and Lotka–Volterra RD models.
Curriculum Learning Identification via PINNs (CLIP) is a physics-guided framework for parameter identification and state reconstruction in partially observed reaction–diffusion (RD) systems. CLIP leverages the physical separability inherent in RD models by structuring neural network training as a curriculum—progressing from reaction-dominated regimes to the full spatiotemporal PDE, and utilizing an anchored widening transfer strategy to enhance convergence and robustness. The method is implemented using physics-informed neural networks (PINNs) and achieves substantial accuracy improvements over baseline techniques in canonical and high-dimensional biological applications (Zhou et al., 24 Jan 2026).
1. Reaction–Diffusion System Identification: Formulation
For a system of $m$ reaction–diffusion components on a spatial domain $\Omega \subset \mathbb{R}^d$ over $t \in (0, T]$, the governing PDE is
$$\partial_t u = D \nabla^2 u + f(u; \theta),$$
where $u(x,t) \in \mathbb{R}^m$ denotes the state vector, $D = \mathrm{diag}(D_1, \dots, D_m)$ the unknown diffusion coefficients, and $f$ the nonlinear reaction term parameterized by rates $\theta$.
Identification is complicated by partial observation: only a subset $u_{\mathrm{obs}}$ of the state variables is measured at sensor points $(x_k, t_k)$,
$$y_k = u_{\mathrm{obs}}(x_k, t_k) + \varepsilon_k,$$
where $\varepsilon_k$ encodes additive noise. The joint recovery task is to infer the hidden fields $u_{\mathrm{hid}}$, as well as the unknown parameters $(D, \theta)$.
2. Physics-Informed Neural Network Architecture
CLIP employs a PINN to approximate the full state by a neural map
$$u_{\mathrm{NN}}(x, t; w) \approx u(x, t),$$
where $w$ are the trainable network parameters. The architecture is defined by:
- A shared trunk multilayer perceptron (MLP) of depth 3, using smooth activations (e.g., $\tanh$ or mixed sine–ReLU functions).
- Dedicated output branches (1–2 layers each), one per state component, mapping trunk features to each output variable.
- An auxiliary MLP surrogate for smoothing observed data to generate robust Laplacian estimates for curriculum masking; this surrogate is only used in the mask calculation, not in PINN losses.
PDE residuals for each component $i$ are computed via automatic differentiation,
$$r_i(x, t) = \partial_t u_{\mathrm{NN},i} - D_i \nabla^2 u_{\mathrm{NN},i} - f_i(u_{\mathrm{NN}}; \theta).$$
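In a PINN the residual is formed with automatic differentiation through the network. As a framework-free stand-in, the same quantity can be illustrated with finite differences on a 1-D space–time grid (the grid, $D$, and $f$ below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def pde_residual(u, dt, dx, D, f):
    """Finite-difference stand-in for the autodiff PDE residual
    r = u_t - D * u_xx - f(u) on a 1-D grid; u has shape (nt, nx).
    Only interior points are returned."""
    u_t = (u[2:, 1:-1] - u[:-2, 1:-1]) / (2 * dt)                     # central time derivative
    u_xx = (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dx**2   # 1-D Laplacian
    return u_t - D * u_xx - f(u[1:-1, 1:-1])

# Sanity check: u(x, t) = exp(-t) solves u_t = -u with D = 0 and f(u) = -u,
# so the residual should vanish up to discretization error.
t = np.linspace(0.0, 1.0, 101)
x = np.linspace(0.0, 1.0, 51)
T, X = np.meshgrid(t, x, indexing="ij")
u = np.exp(-T)
r = pde_residual(u, dt=t[1] - t[0], dx=x[1] - x[0], D=0.0, f=lambda v: -v)
print(np.abs(r).max())  # small discretization error
```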
3. Curriculum Learning: Multi-stage Training Workflow
CLIP divides training into three sequential stages:
3.1 Reaction-Dominated Initialization (Stage 0)
A reaction-dominated mask is constructed via Laplacian thresholding on the surrogate-smoothed observed fields. Only points with negligible diffusion (as measured by normalized Laplacian magnitude) are sampled so that the local dynamics are effectively ODE-driven. Stage 0 sets diffusion coefficients to zero (or applies a small PDE-weight) and restricts optimization to reaction kinetics and initialization of hidden state fields.
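A minimal numpy sketch of such a mask (the periodic discrete Laplacian, the max-normalization, and the threshold `tau` are illustrative assumptions):

```python
import numpy as np

def reaction_dominated_mask(u_smooth, dx, tau=0.1):
    """Keep grid points whose normalized Laplacian magnitude falls below
    tau, i.e. points where diffusion is negligible and the local
    dynamics are effectively ODE-driven (Stage 0 sampling)."""
    lap = (np.roll(u_smooth, 1) - 2 * u_smooth + np.roll(u_smooth, -1)) / dx**2
    norm = np.abs(lap) / (np.abs(lap).max() + 1e-12)  # normalize to [0, 1]
    return norm < tau

# On u = sin(x), |u_xx| = |sin(x)|, so only points near x = 0, pi, 2*pi
# (where curvature is flat) should survive the thresholding.
x = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)  # periodic grid
u = np.sin(x)
mask = reaction_dominated_mask(u, dx=x[1] - x[0], tau=0.1)
print(mask.sum(), "of", mask.size, "points are reaction-dominated")
```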
The loss in Stage 0 is
where indicates masking by .
3.2 Anchored Widening Transfer (Stage 1)
After reaction-only pre-training, diffusion terms are re-introduced. To preserve previously learned reaction dynamics, the network width is enlarged by adding new neurons (anchored widening), and two optimizers are employed: inherited parameters (from Stage 0) are updated with tiny learning rates, while new parameters and diffusion coefficients train with standard rates. This anchoring ensures that coupling dynamics are absorbed by the increased network capacity without destroying the reaction sub-solutions.
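The two-rate update can be sketched on a toy least-squares problem (the sizes, learning rates, and linear model are illustrative, not the paper's network):

```python
import numpy as np

# Anchored widening, in miniature: the "Stage 0" weights fit only the
# first 4 features; 2 new features are then added. Inherited weights get
# a tiny learning rate, new weights a standard one.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 6))
target = A @ np.array([1.0, -2.0, 0.5, 0.3, 2.0, -1.0])   # realizable target

w_old = np.linalg.lstsq(A[:, :4], target, rcond=None)[0]  # Stage 0 sub-solution
w_new = np.zeros(2)                                       # widened capacity
w_anchor = w_old.copy()

lr_old, lr_new = 1e-4, 5e-2
loss = lambda w: np.mean((A @ w - target) ** 2)
loss_before = loss(np.concatenate([w_old, w_new]))
for _ in range(1000):
    w = np.concatenate([w_old, w_new])
    grad = 2 * A.T @ (A @ w - target) / len(target)
    w_old -= lr_old * grad[:4]   # inherited parameters barely move (anchored)
    w_new -= lr_new * grad[4:]   # new parameters absorb the extra structure
loss_after = loss(np.concatenate([w_old, w_new]))
print(loss_before, "->", loss_after)
print("inherited drift:", np.linalg.norm(w_old - w_anchor))
```

The point of the split is visible in the printout: the loss drops sharply while the inherited weights stay close to their Stage 0 values.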
3.3 Global Fine-Tuning with Adaptive Sampling (Stage 2)
All network and parameter weights are then jointly fine-tuned at a moderate learning rate. To resolve sharp spatiotemporal features, residual-based adaptive distribution (RAD) sampling augments the training set: new collocation points are drawn with probability proportional to the normalized residual magnitude $|r(x,t)|$, improving effective coverage of interface and steep-gradient regions.
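RAD-style selection can be sketched as follows (the power `k` and floor `c` come from the general RAD formulation; the values and the synthetic residual field here are illustrative):

```python
import numpy as np

def rad_sample(residuals, n_new, k=1.0, c=0.0, rng=None):
    """Draw n_new collocation indices with probability proportional to
    |r|^k / mean(|r|^k) + c, so high-residual regions are oversampled."""
    rng = rng if rng is not None else np.random.default_rng()
    w = np.abs(residuals) ** k
    p = w / w.mean() + c
    p = p / p.sum()                       # normalize to a distribution
    return rng.choice(len(residuals), size=n_new, p=p)

# Synthetic residual field: a sharp interface occupies the first 100 of
# 1000 points; RAD should concentrate new samples there.
rng = np.random.default_rng(1)
r = np.where(np.arange(1000) < 100, 10.0, 0.1)
idx = rad_sample(r, n_new=200, rng=rng)
print((idx < 100).mean())  # most samples land in the high-residual region
```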
4. Training Objective and Loss Functions
CLIP utilizes a composite loss function incorporating data matching, PDE residuals, initial-condition enforcement, and (optionally) anchoring:
$$\mathcal{L} = \lambda_d \mathcal{L}_{\mathrm{data}} + \lambda_p \mathcal{L}_{\mathrm{PDE}} + \lambda_{ic} \mathcal{L}_{\mathrm{IC}} + \lambda_a \mathcal{L}_{\mathrm{anchor}}.$$
Definitions:
- Data mismatch: $\mathcal{L}_{\mathrm{data}} = \frac{1}{N_d} \sum_k \| u_{\mathrm{NN}}(x_k, t_k; w) - y_k \|^2$
- Physics residual: $\mathcal{L}_{\mathrm{PDE}} = \frac{1}{N_c} \sum_j \| r(x_j, t_j) \|^2$
- Initial condition: $\mathcal{L}_{\mathrm{IC}} = \frac{1}{N_0} \sum_l \| u_{\mathrm{NN}}(x_l, 0; w) - u_0(x_l) \|^2$
- (Optional) Anchor regularization: $\mathcal{L}_{\mathrm{anchor}} = \| w_{\mathrm{old}} - w_{\mathrm{old}}^{(0)} \|^2$, penalizing drift of inherited parameters from their Stage 0 values
Loss-weight scheduling ramps these weights across stages to adaptively balance physics and data terms.
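As a minimal sketch of such staged weighting (the linear schedule, ramp bounds, and weight values are illustrative assumptions, not values from the paper):

```python
import numpy as np

def ramp(step, start, end, lo=0.0, hi=1.0):
    """Linear weight ramp: returns lo before `start`, hi after `end`."""
    t = np.clip((step - start) / max(end - start, 1), 0.0, 1.0)
    return lo + t * (hi - lo)

def composite_loss(l_data, l_pde, l_ic, l_anchor, step):
    """Weighted sum of the four loss terms; the PDE weight is ramped in
    gradually so early training is dominated by data and reaction fit."""
    lam_pde = ramp(step, start=1000, end=5000)
    return l_data + lam_pde * l_pde + 1.0 * l_ic + 0.1 * l_anchor

print(composite_loss(1.0, 1.0, 0.0, 0.0, step=0))      # -> 1.0 (PDE term off early)
print(composite_loss(1.0, 1.0, 0.0, 0.0, step=10000))  # -> 2.0 (fully weighted later)
```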
5. Optimization Techniques and Hyperparameters
PINN weights are initialized with Xavier initialization. Stage 0 employs Adam; Stage 1 splits inherited and new parameters into a tiny and a standard learning rate, respectively; Stage 2 trains all parameters at a moderate rate. For Min-system identification, reaction rates are optimized in log-space and inputs are rescaled for numerical stability. Activation choices vary: $\tanh$ for most benchmarks, and a custom activation for Gray–Scott to resolve sharper pulses.
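Log-space optimization of a positive rate can be sketched on a toy decay-fitting problem (the objective, learning rate, and true rate are illustrative):

```python
import numpy as np

# Parameterize a positive reaction rate as k = exp(phi) and run gradient
# descent on phi: this keeps k > 0 by construction and equalizes step
# sizes across rates that span orders of magnitude.
t = np.linspace(0.0, 1.0, 50)
y = np.exp(-3.0 * t)            # data generated with true rate k = 3

phi = np.log(0.5)               # initial guess k = 0.5
lr = 0.5
for _ in range(2000):
    k = np.exp(phi)
    pred = np.exp(-k * t)
    # dL/dphi = dL/dk * dk/dphi, with dk/dphi = k (chain rule through exp)
    dL_dk = np.mean(2.0 * (pred - y) * (-t) * pred)
    phi -= lr * dL_dk * k
print(np.exp(phi))  # recovered rate, close to the true value 3
```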
6. Empirical Assessment and Benchmark Results
CLIP was evaluated on three canonical RD systems (λ–ω, Gray–Scott, Lotka–Volterra) and a four-variable Min-protein oscillator in bacterial geometry. Only one (or two) components are observed per system, with the remaining variables fully unmeasured.
Training uses approximately 2% downsampled points from a high-resolution solver, and noise levels range from 0% to 10%. Representative mean relative absolute errors (MRAE) for CLIP and baselines:
| System | CLIP MRAE | Baseline PINN MRAE | PSO / EnKF MRAE |
|---|---|---|---|
| λ–ω | 5.50% (clean) | 7.75% | 153% / 0.48% |
| Gray–Scott | 9.55% (clean); up to 25.3% (10% noise) | >100% | Failed |
| Lotka–Volterra RD | ≈9.23% (clean) | >2700% | Failed |
| Min-protein oscillator | 18.10% (clean); 23.34% (10% noise) | 72.15% (clean) | — |
For the Min system, CLIP reconstructs unobserved cytosolic fields and membrane-bound time-series matching the amplitude and frequency of ground-truth oscillations.
7. Mechanistic Analysis: Ablation and Loss Landscape
Ablation experiments show incremental improvements with curriculum components:
- Baseline PINN (no curriculum): high MRAE, poor convergence.
- +Reaction-only curriculum: moderate improvement.
- +Anchored widening transfer (full CLIP): order-of-magnitude error reduction.
Visualization of the loss landscape, using PCA trajectories of parameter vectors, reveals that baseline PINNs yield a highly nonconvex terrain with spurious basins trapping the optimizer. The reaction-only stage produces a smoother pathway in parameter space, and the full CLIP scheme leads to a well-conditioned landscape and robust gradient descent toward the optimum.
In sum, CLIP offers a physics-tailored three-stage curriculum—reaction-only initialization, anchored widening transfer, and adaptive fine-tuning—to jointly infer hidden states and unknown parameters in RD systems from sparse, noisy partial observations. This approach makes explicit use of physical modularity and produces significant gains in both trainability and accuracy over conventional PINNs, ensemble Kalman filters, and population-based optimizers (Zhou et al., 24 Jan 2026).