Physics-Informed LSTM Models
- Physics-informed LSTM models are recurrent neural networks that integrate governing equations and physical constraints into their architecture.
- They use composite loss functions to balance empirical data errors with penalties for deviating from physical laws.
- This approach is applied in diverse domains such as seismic response, fluid thermodynamics, and chaotic system forecasting, demonstrating enhanced robustness and efficiency.
Physics-Informed Long Short-Term Memory (LSTM) Model
Physics-informed Long Short-Term Memory (LSTM) models are a class of recurrent neural network (RNN) surrogates that integrate domain knowledge, including governing equations and physical constraints, directly into their architecture and/or training objective. Unlike purely data-driven LSTM models, physics-informed variants enforce conformity to physical laws—often in the form of partial differential equations (PDEs), ordinary differential equations (ODEs), conservation constraints, or system-specific constitutive relationships—by augmenting loss functions or injecting physics features. This fusion yields improved accuracy, robustness, generalizability, and interpretability in time-series prediction for complex dynamical systems across scientific and engineering domains.
1. Core Principles and Motivations
Physics-informed LSTM models exploit the strengths of the LSTM cell—gated memory mechanisms that preserve gradients and encode long-term temporal dependencies—while explicitly embedding knowledge of underlying physical processes. This approach addresses key limitations observed in classical deep learning surrogates for dynamical systems:
- Generalization across domains and regimes: Purely data-driven models often fail to extrapolate outside training distributions or under limited data, whereas embedding governing equations regularizes learning.
- Physical consistency and interpretability: Enforcing residuals of physics (e.g., mass conservation, energy balance, or specific ODE/PDE forms) ensures predictions respect domain constraints.
- Data efficiency: Physics-informed LSTM models require less labeled data; physical loss terms can be enforced at additional collocation points even in unobserved regimes (Zhang et al., 2020).
- Numerical stability and convergence: Incorporating physics can mitigate overfitting, oscillatory/chaotic behavior, and divergence in challenging regimes (Tao et al., 25 Dec 2025).
2. Architectural Variants and Mathematical Formulation
Multiple architectures instantiate physics-informed LSTM frameworks, often tailored to the domain or type of spatiotemporal data:
(a) Sequential LSTM Architectures with Physics Loss
Standard LSTM update equations are preserved, typically with stacked LSTM layers receiving time series inputs (e.g., state, control, or feature sequences) and producing output trajectories (e.g., displacement, velocity, temperature) (Lahariya et al., 2022, Biswas et al., 26 Nov 2025).
Predicted outputs are constrained by specialized physics-informed loss terms enforcing ODE/PDE residuals or discretized equations (see Section 3).
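The gated update underlying these architectures can be sketched concretely. The following NumPy single-cell step is a minimal illustration of the standard LSTM equations; the stacked weight layout and gate ordering are conventions chosen here for compactness, not taken from any cited paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM cell step.

    x: input at time t, shape (d_in,)
    h_prev, c_prev: previous hidden/cell states, shape (d_h,)
    W: input weights, shape (4*d_h, d_in); U: recurrent weights, (4*d_h, d_h)
    b: bias, shape (4*d_h,). Gate order here: input, forget, candidate, output.
    """
    d_h = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0 * d_h:1 * d_h])   # input gate
    f = sigmoid(z[1 * d_h:2 * d_h])   # forget gate
    g = np.tanh(z[2 * d_h:3 * d_h])   # candidate cell state
    o = sigmoid(z[3 * d_h:4 * d_h])   # output gate
    c = f * c_prev + i * g            # gated cell-state update preserves gradients
    h = o * np.tanh(c)                # new hidden state
    return h, c
```

In a stacked configuration, each layer's hidden-state sequence becomes the input sequence of the next layer, and the final layer's outputs feed the physics-informed loss terms.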
(b) Hybrid Architectures: U-Net/LSTM, ConvLSTM, Graph-LSTM
- Encoder-Decoder plus LSTM: Features are first extracted from input sequences (e.g., ground acceleration, image, or spatial snapshots) using 1D/2D CNN architectures such as causal U-Net or CAE, then propagated temporally by stacked LSTM layers (Biswas et al., 26 Nov 2025, Menicali et al., 16 May 2025).
- GraphSAGE-LSTM/GCN-LSTM: Node embeddings are computed via Graph Neural Networks (GNNs), encoding spatial topology or mesh structure, then evolved through time using LSTM gating (with elementwise operations or graph convolutions replacing the standard affine transforms) (Liu et al., 2024, Razavi et al., 18 Sep 2025).
- Multi-branch LSTM: Separate LSTMs handle state evolution, restoring forces, or hysteretic states; outputs and their time-differentiated forms feed into a composite loss (Zhang et al., 2020).
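To make the graph-LSTM variant concrete, the following NumPy sketch replaces the dense affine maps inside each gate with a GraphSAGE-style mean aggregation over node neighborhoods. The parameter layout (one self-weight/neighbor-weight pair per gate and input) is an illustrative assumption, not the formulation of any single cited paper:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sage_mean(X, adj, W_self, W_nbr):
    """GraphSAGE-style mean aggregation: combine each node's own features
    with the mean of its neighbors'. X: (n_nodes, d); adj: (n, n) in {0, 1}."""
    deg = adj.sum(axis=1, keepdims=True).clip(min=1)
    return X @ W_self.T + ((adj @ X) / deg) @ W_nbr.T

def graph_lstm_step(X, H_prev, C_prev, adj, params):
    """One graph-LSTM step over all nodes at once.

    params maps each gate name in ("i", "f", "g", "o") to a 5-tuple
    (Wx_self, Wx_nbr, Wh_self, Wh_nbr, b): graph aggregation replaces
    the affine transform of both the input and the hidden state.
    """
    def pre(gate):
        Wx_s, Wx_n, Wh_s, Wh_n, b = params[gate]
        return (sage_mean(X, adj, Wx_s, Wx_n)
                + sage_mean(H_prev, adj, Wh_s, Wh_n) + b)
    i, f, o = sigmoid(pre("i")), sigmoid(pre("f")), sigmoid(pre("o"))
    g = np.tanh(pre("g"))
    C = f * C_prev + i * g       # per-node gated cell-state update
    H = o * np.tanh(C)
    return H, C
```

The temporal gating is unchanged from the standard LSTM; only the spatial mixing inside each gate differs, which is what lets these models respect mesh or sensor-network topology.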
3. Loss Formulation and Physics Constraints
A defining feature is the composite loss function
$$\mathcal{L} = \mathcal{L}_{\text{data}} + \lambda\,\mathcal{L}_{\text{phys}},$$
where $\mathcal{L}_{\text{data}}$ is the empirical/data-driven error (e.g., MSE against observations) and $\mathcal{L}_{\text{phys}}$ penalizes violation of discretized or continuous governing equations (PDE/ODE residuals):
- State evolution constraint: Enforce the governing ODE $\dot{\mathbf{x}} = f(\mathbf{x}, t)$, or the corresponding finite-difference residuals, at collocation points (Halder et al., 2023, Biswas et al., 26 Nov 2025).
- Physics-informed regularizer: Penalize time/space derivatives of the LSTM outputs that deviate from physical models, e.g., energy conservation, constitutive or kinematic relations (Özalp et al., 2023, Lahariya et al., 2022, Zhang et al., 2020).
- Physical feature injection: Concatenate physical features, such as environmental covariates from weather or climate models, as node features in graph-based LSTMs (Liu et al., 2024).
- Boundary and symmetry constraints: Impose exact or regularized adherence to boundary conditions (e.g., no-slip, mass conservation) (Tao et al., 25 Dec 2025).
Loss weights are tuned to achieve a trade-off: the pure data loss minimizes empirical risk, but excessive down-weighting of the physics loss degrades physical fidelity.
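A minimal NumPy sketch of this composite loss, using a central finite-difference residual of a generic first-order ODE $\dot{x} = f(x)$ on a uniform time grid (the specific ODE $\dot{x} = -x$ and the weight $\lambda$ are illustrative choices, not from any cited paper):

```python
import numpy as np

def composite_loss(x_pred, x_obs, t, ode_rhs, lam=1.0):
    """L = L_data + lam * L_phys.

    L_data: MSE between predictions and observations.
    L_phys: mean squared central-difference residual of dx/dt = ode_rhs(x),
    evaluated at interior collocation points. Assumes a uniform grid t.
    """
    dt = t[1] - t[0]
    data_loss = np.mean((x_pred - x_obs) ** 2)
    dxdt = (x_pred[2:] - x_pred[:-2]) / (2 * dt)   # central difference
    residual = dxdt - ode_rhs(x_pred[1:-1])
    phys_loss = np.mean(residual ** 2)
    return data_loss + lam * phys_loss

# Example: dx/dt = -x with exact solution x(t) = exp(-t); the physics
# residual of the exact solution is near zero (only O(dt^2) error remains).
t = np.linspace(0.0, 2.0, 201)
x_exact = np.exp(-t)
loss = composite_loss(x_exact, x_exact, t, lambda x: -x)
```

The same pattern extends to PDE residuals by replacing the time stencil with spatio-temporal finite-difference (or autodiff) operators, and the residual can be evaluated at collocation points where no observations exist.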
4. Domain-Specific Applications and Quantitative Results
Physics-informed LSTM surrogates have been deployed in diverse domains:
| Domain (Task) | Model Variant | Key Physics Constraints | Reported Metrics / Results | Reference |
|---|---|---|---|---|
| Seismic response prediction of structures | U-Net-LSTM, finite-diff enforcement | ODE motion equations | Corr. up to 0.998; 2-3 orders speedup | (Biswas et al., 26 Nov 2025) |
| Nonlinear structural metamodeling | Multi-LSTM (PhyLSTM²/³) | EOM, kinematics, hysteresis law | Superior accuracy in >80% of runs | (Zhang et al., 2020) |
| Fluid thermodynamics (Rayleigh–Bénard convection) | ConvLSTM + CAE | PDE residuals (Navier–Stokes, etc.) | MSE = 0.02; 140× speedup | (Menicali et al., 16 May 2025) |
| Chaotic system forecasting | PI-LSTM | ODEs (Lorenz-96, etc.) | RMSE 0.3 for hidden states | (Özalp et al., 2023) |
| Energy systems / cooling towers | PhyLSTM | Conservation ODE | 2% RMSE, fast convergence | (Lahariya et al., 2022) |
| Electrohydrodynamics | LSTM-PINN | Steady PDEs with BCs | Stable up to 7e-3 LR, low final loss | (Tao et al., 25 Dec 2025) |
| Polar ice/cryosphere (graph) | GraphSAGE-LSTM with MAR features | Physics via node feature injection | 10% lower RMSE vs. non-physics GNN | (Liu et al., 2024) |
This table highlights the diversity of architecture and constraint choices, with consistent improvements over both baseline LSTM and traditional physics-agnostic ML or numerical baselines.
5. Training Paradigms and Implementation
Data regimes and hardware-bounded constraints motivate varied training procedures:
- Data splits: Cases with limited (10 records) versus abundant (50+) training examples; validation on held-out or cross-regime samples (Biswas et al., 26 Nov 2025).
- Optimization: The standard choice is the Adam optimizer with small, tuned learning rates, often followed by L-BFGS refinement in reduced configurations (Lahariya et al., 2022, Zhang et al., 2020).
- Minibatching/sequencing: Sequence length is a key design consideration; for spatial surrogates, batching over the graph or spatial dimensions is needed (Razavi et al., 18 Sep 2025, Liu et al., 2024).
- Automatic differentiation or explicit finite difference: Time/space derivatives for physics loss are often implemented via explicit finite-difference convolution (e.g., as in (Biswas et al., 26 Nov 2025)) or using PyTorch autograd on LSTM outputs (Lahariya et al., 2022, Menicali et al., 16 May 2025).
- Loss weighting and dynamic adjustment: Some models modulate data/physics loss influence using dynamic weight averaging or gradient norm rescaling for balance (e.g., (Menicali et al., 16 May 2025)).
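One common balancing scheme of the kind mentioned above, dynamic weight averaging, can be sketched in a few lines of NumPy; the temperature value and warm-up handling here are illustrative defaults, not taken from any cited paper:

```python
import numpy as np

def dwa_weights(loss_hist, T=2.0):
    """Dynamic weight averaging over K loss terms (data, physics, ...).

    loss_hist: list of per-epoch loss vectors, each of shape (K,); only the
    last two epochs are used. Terms whose loss is decreasing more slowly get
    larger weights, so no single term dominates training. T is a softmax
    temperature; the returned weights sum to K.
    """
    if len(loss_hist) < 2:
        return np.ones(len(loss_hist[-1]))     # warm-up: equal weights
    r = loss_hist[-1] / loss_hist[-2]          # relative descent rate per term
    e = np.exp(r / T)
    return len(r) * e / e.sum()                # softmax scaled to sum to K
```

At each epoch the training loop would recompute these weights and form the total loss as the weighted sum of the data and physics terms; gradient-norm rescaling is an alternative that balances per-term gradient magnitudes instead of loss ratios.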
6. Comparative Evaluation and Robustness
Quantitative studies consistently show:
- Accuracy & correlation: Physics-informed LSTMs achieve superior time-history correlation (up to $0.998$, versus roughly $0.7$ for physics-agnostic baselines) and lower relative or absolute prediction error, robustly across data regimes (Biswas et al., 26 Nov 2025, Özalp et al., 2023).
- Generalization: LSTM architectures regularized with physics losses extrapolate reliably to unseen regimes, with little degradation under data scarcity, outperforming competing black-box models (Zhang et al., 2020, Biswas et al., 26 Nov 2025).
- Stability: LSTM-PINN is observed to avoid divergence and boundary artifacts under aggressive learning rates and complex PDEs, in contrast to MLP-PINN approaches (Tao et al., 25 Dec 2025).
- Physical interpretability: Injection of domain constraints allows recovery of latent physical states (e.g., restoring force, hysteresis, or unmeasured variables) without direct supervision (Özalp et al., 2023, Zhang et al., 2020).
- Computational efficiency: Surrogate inference times are reduced by orders of magnitude compared to solver-based approaches (e.g., ~140× for turbulent RBC (Menicali et al., 16 May 2025), and 2–3 orders of magnitude for structural FEM (Biswas et al., 26 Nov 2025)).
7. Limitations, Extensions, and Outlook
Identified limitations and prospective extensions include:
- Expressivity and depth: Single-layer designs may be insufficient for capturing multiscale processes; deeper or hybrid LSTM-Transformer architectures, higher-resolution spatial encoders (e.g., vision transformers, U-Nets) are active areas of development (Menicali et al., 16 May 2025, Biswas et al., 26 Nov 2025).
- Physics loss formulation: Efficacy depends on precise formulation; choice between continuous (AD-based) vs. discretized (finite-diff or black-box) derivatives impacts training stability and computational burden (Halder et al., 2023).
- Constraint enforcement: Some architectures enforce physics only via auxiliary inputs, not direct loss regularization—a distinction critical in evaluating physical consistency (Liu et al., 2024).
- Scalability and complexity: Large graph sizes or high-resolution spatio-temporal domains present memory and training challenges; batching heuristics and 3D convs are practical remedies (Razavi et al., 18 Sep 2025, Menicali et al., 16 May 2025).
- Uncertainty quantification: Incorporation of conformal prediction or Bayesian layers is essential where deterministic long-horizon forecasts are ill-posed (Menicali et al., 16 May 2025).
- Application domains: Physics-informed LSTM surrogates continue to expand to new areas, e.g., hybrid flow/transport, geoscientific forecasting, thermal/electrohydrodynamic flows, and chaotic state reconstruction (Tao et al., 25 Dec 2025, Özalp et al., 2023).
Physics-informed LSTM models represent an intersection of deep sequence modeling and inductive scientific priors, enabling data-efficient, robust, and physically plausible surrogates across complex dynamical systems (Biswas et al., 26 Nov 2025, Lahariya et al., 2022, Özalp et al., 2023, Tao et al., 25 Dec 2025, Menicali et al., 16 May 2025, Zhang et al., 2020, Liu et al., 2024, Razavi et al., 18 Sep 2025, Halder et al., 2023).