
Dynamical Interpretability

Updated 24 December 2025
  • Dynamical interpretability is the ability to extract, understand, and communicate the evolution, sensitivities, and invariant structures of models governing complex dynamical systems.
  • It employs techniques such as effective receptive fields, sensitivity analysis, and dual-space architectures to link data-driven behaviors with physical laws.
  • Integrating physics-based losses and symmetry constraints enhances both local and global interpretability while addressing challenges in nonlinear or non-constant coefficient problems.


Dynamical interpretability refers to the capacity to extract, understand, and communicate the underlying mechanisms, sensitivities, and invariants of learned or engineered models governing dynamical systems, especially in high-dimensional or data-driven settings. Unlike static interpretability, which focuses on mappings from input to output at a single instant, dynamical interpretability concerns elucidating the evolution of states and the structure of flows, attractors, and invariant manifolds over time. Contemporary research on dynamical interpretability encompasses neural operators, machine-learned dynamical models, and principled frameworks integrating physics, symmetry, and explainable architectures (Gao et al., 3 Oct 2025, Matei et al., 2020, Kim et al., 22 Oct 2025, Sedler et al., 2022, Sagodi et al., 7 Jul 2025).

1. Classes of Dynamical Models and Operator Learning

Dynamical systems modeling splits naturally into analytic, physics-based models and data-driven models. Within learning-based operator approaches, two classes dominate:

  • Spatial-Domain Neural Operators: Act directly on discretized physical domains. These employ convolutional, attention-based (transformer) or integral-operator architectures (e.g., Convolutional Neural Operator, Galerkin/Attention-based Transformer NOs), enabling elementwise or local receptive field interpretation. The integral layer,

(K v)(x) = \int_\Omega \kappa_\theta(x, y)\, v(y)\, dy,

with translation-invariant \kappa_\theta(x, y) reducing to convolutional forms, supports both analytic and numerical sensitivity analysis at a physical location (Gao et al., 3 Oct 2025); a discretized sketch of this layer appears at the end of this section.

  • Functional-Domain Neural Operators: Learn mappings between sets of functions (e.g., as in DeepONet or T1), parameterized through basis expansions,

G(a)(x) = \sum_{i=1}^{p} b_i(a)\, t_i(x).

The modal coefficients b_i(a) and trunk evaluations t_i(x) enable global representation but do not admit straightforward physical sensitivity interpretation at the gridpoint level (Gao et al., 3 Oct 2025).
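As an illustration of this functional-domain class, a minimal branch/trunk-style readout in the spirit of DeepONet is sketched below; the network widths, sensor count, and mode count p are placeholder assumptions, not values from the cited work.

```python
import torch
import torch.nn as nn

class BranchTrunkOperator(nn.Module):
    """G(a)(x) = sum_i b_i(a) t_i(x): the branch net encodes the input function a
    (sampled at m sensor points), the trunk net encodes the query location x."""

    def __init__(self, m_sensors: int, p_modes: int = 32):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Linear(m_sensors, 64), nn.Tanh(), nn.Linear(64, p_modes))
        self.trunk = nn.Sequential(
            nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, p_modes))

    def forward(self, a_samples, x_query):
        b = self.branch(a_samples)   # (batch, p) modal coefficients b_i(a)
        t = self.trunk(x_query)      # (n_query, p) trunk evaluations t_i(x)
        return b @ t.T               # (batch, n_query) values of G(a)(x)

op = BranchTrunkOperator(m_sensors=100)
a = torch.randn(8, 100)                      # 8 input functions at 100 sensor points
x = torch.linspace(0, 1, 64).unsqueeze(-1)   # 64 query locations
u = op(a, x)                                 # (8, 64) predicted output functions
```

Because the output is assembled only from the p global modes, a perturbation at a single gridpoint of a has no isolated counterpart in this representation, which is the sense in which gridpoint-level sensitivity interpretation is lost.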

This classification shapes the interpretability workflow: local, pointwise, and spatial diagnostics are accessible in spatial-domain models, whereas only modal/global analysis applies to functional-domain models.
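For the spatial-domain class, the integral layer above admits a direct discretization. The following is a minimal sketch, with a hand-written Gaussian kernel standing in for a learned \kappa_\theta; in a trained neural operator this kernel would be parameterized by a small network.

```python
import numpy as np

def integral_layer(v, grid, kernel_fn):
    """Discretized kernel integral (K v)(x_i) = sum_j kernel_fn(x_i, y_j) v(y_j) dy.

    v         : (n,) values of the input function on the grid
    grid      : (n,) uniformly spaced quadrature points y_j
    kernel_fn : callable k(x, y); translation-invariant choices give a convolution
    """
    dy = grid[1] - grid[0]
    K = np.array([[kernel_fn(x, y) for y in grid] for x in grid])
    return K @ v * dy

# Example: translation-invariant Gaussian kernel (convolution-type smoothing)
kernel_fn = lambda x, y: np.exp(-((x - y) ** 2) / 0.02)
grid = np.linspace(0.0, 1.0, 128)
v = np.sin(2 * np.pi * grid)
u = integral_layer(v, grid, kernel_fn)   # smoothed output field (K v)(x)
```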

2. Mechanisms for Dynamical Interpretability

2.1 Effective Receptive Field (ERF) and Sensitivity Analysis

For spatial-domain neural operators, interpretability is achieved via effective receptive field (ERF) kernels:

\mathrm{erf}(x', x) = \frac{\partial u(x')}{\partial a(x)},

measuring how infinitesimal perturbations to the input at x propagate to outputs at x' through the learned dynamics. In linear or eigenmode-decomposable problems (e.g., the linear wave equation), the ERF can be computed analytically. For general nonlinear or data-driven models, automatic differentiation enables numerical ERF computation post hoc, revealing spatial structure and propagation patterns (e.g., wavefront arcs, advective envelopes, or localized responses) and enabling the identification of mechanistic analogs to analytic Green's functions (Gao et al., 3 Oct 2025).
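A post hoc numerical ERF can be obtained from any differentiable spatial-domain model via automatic differentiation. The sketch below uses PyTorch with a toy moving-average stand-in for `model`; in practice the trained operator would take its place, and this recipe is an assumed implementation rather than the cited authors' code.

```python
import torch
import torch.nn.functional as F

def effective_receptive_field(model, a, out_index):
    """erf(x', x) = d u(x') / d a(x) for a fixed output location x' = out_index.

    model : differentiable map from an input field a (n,) to an output field u (n,)
    a     : (n,) input field at which the sensitivity is evaluated
    Returns the (n,) vector of sensitivities over all input gridpoints x.
    """
    a = a.clone().detach().requires_grad_(True)
    u = model(a)
    grad, = torch.autograd.grad(u[out_index], a)
    return grad.detach()

# Toy "operator": a 5-point moving average, whose ERF is a 5-point box around x'
model = lambda a: F.avg_pool1d(a.view(1, 1, -1), kernel_size=5,
                               stride=1, padding=2).view(-1)
a = torch.randn(128)
erf_row = effective_receptive_field(model, a, out_index=64)
```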

2.2 Diagnostic and Model Comparison Tools

Interpretability is further strengthened by comparing ERF-derived sensitivity patterns or spectral error decompositions across different architectures and tasks. For example, in fluid dynamics benchmarks:

  • CNO and Galerkin/Transformer NOs can recover theoretically predicted sensitivity loci (wavefront arcs) with high fidelity.
  • Fourier-based operators (FNO) may exhibit global trends with added noise, which is suppressed by symmetry-encoded variants (e.g., Group-FNO).
  • Functional methods (DeepONet, T1) typically lack localization and cannot elucidate the flow of influence from input to output.
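Such qualitative comparisons can be summarized with a simple score, for instance the fraction of ERF magnitude falling inside an analytically predicted support region (e.g., the wavefront arc). This metric is an illustrative assumption, not one defined in the cited work.

```python
import numpy as np

def erf_localization(erf_row, predicted_support):
    """Fraction of total |ERF| mass inside a predicted support mask.

    erf_row           : (n,) sensitivities du(x')/da(x) over input gridpoints
    predicted_support : (n,) boolean mask, e.g. the analytic wavefront arc
    Values near 1 indicate a well-localized, physically consistent ERF;
    diffuse or noisy ERFs (as described for plain FNO) score lower.
    """
    mass = np.abs(erf_row)
    return float(mass[predicted_support].sum() / (mass.sum() + 1e-12))
```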

2.3 Dual-Space and Multi-Scale Architectures

Pure spectral models can capture global, low-frequency dynamics, but struggle with locality and high-frequency content. Dual-space (hybrid) architectures—combining spectral operators (such as FNO layers) with local spatial convolutions—systematically reduce both low- and high-frequency errors and yield multi-resolution interpretability:

u_{\ell+1}(x) = W u_\ell(x) + \mathcal{F}^{-1}\!\left[ P_\theta(\kappa) \cdot \mathcal{F}\{u_\ell\} \right](x) + \mathrm{Conv}_{3\times 3}[u_\ell](x).

These models simultaneously facilitate physical interpretability via ERF and spectral diagnostics, and set state-of-the-art benchmarks on tasks such as multi-scale Darcy, Helmholtz, and the Allen–Cahn equation (Gao et al., 3 Oct 2025).
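A minimal PyTorch sketch of one such dual-space layer is given below, combining a truncated spectral multiplication with a pointwise linear term and a local 3×3 convolution. The channel count and mode truncation are placeholder assumptions, and only the lowest positive frequencies are retained for brevity; this is a sketch of the idea, not the exact architecture of the cited models.

```python
import torch
import torch.nn as nn

class DualSpaceLayer(nn.Module):
    """u_{l+1} = W u_l + F^{-1}[ P_theta(kappa) . F{u_l} ] + Conv3x3[u_l]."""

    def __init__(self, channels: int = 8, modes: int = 12):
        super().__init__()
        self.modes = modes
        self.w = nn.Conv2d(channels, channels, kernel_size=1)                 # pointwise W
        self.local = nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # spatial branch
        # Learnable spectral multiplier P_theta(kappa) on the retained low modes
        self.p = nn.Parameter(
            0.02 * torch.randn(channels, channels, modes, modes, dtype=torch.cfloat))

    def forward(self, u):                                  # u: (batch, channels, H, W)
        u_hat = torch.fft.rfft2(u)                         # spectral branch
        out_hat = torch.zeros_like(u_hat)
        m = self.modes
        out_hat[:, :, :m, :m] = torch.einsum(
            "bixy,ioxy->boxy", u_hat[:, :, :m, :m], self.p)
        spectral = torch.fft.irfft2(out_hat, s=u.shape[-2:])
        return self.w(u) + spectral + self.local(u)

layer = DualSpaceLayer()
u = torch.randn(4, 8, 64, 64)
v = layer(u)          # same shape; spectrally filtered plus locally corrected
```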

3. Incorporation of Physical Principles and Inductive Bias

Interpretability is fundamentally improved when models are imbued with physics-motivated structure or loss functions:

  • Physics-Informed Loss: Incorporation of PDE-residual penalties

L_{\mathrm{total}} = L_{\mathrm{data}}(\mathcal{G}_\theta(a), u) + \lambda \left\| \mathcal{N}(a, \mathcal{G}_\theta(a)) - f \right\|_{L^2}^2,

where \mathcal{N}(a, u) = 0 is the governing equation, enforces physical admissibility of solutions and constrains the function space to be interpretable with respect to known invariants or conservation laws; a minimal sketch of such a penalty appears after this list.

  • Inductive Structural Bias:
    • Equivariance constraints enforce the neural operator's commutation with symmetry group actions, which is crucial for respecting symmetries and conservation laws (e.g., rotation or translation invariance, divergence-free or symplectic structure); a simple equivariance check is sketched at the end of this section.
    • Example: Group-FNO achieves rotational symmetry, eliminating non-physical artifacts from learned ERFs in wave problems (Gao et al., 3 Oct 2025).
    • Physics-inspired modules or constraints, such as divergence-free layers or enforcing mass/energy conservation, further restrict learned dynamics to interpretable manifolds (Matei et al., 2020).
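A minimal sketch of such a residual penalty, written here for the 1-D Poisson problem -u'' = f with a finite-difference residual; the governing equation, discretization, and weighting λ are illustrative assumptions, not those of the cited benchmarks.

```python
import torch

def physics_informed_loss(u_pred, u_true, f, dx, lam=0.1):
    """L_total = L_data + lam * || N(u_pred) - f ||^2 with N(u) = -u''.

    u_pred, u_true : (batch, n) predicted and reference solution fields
    f              : (batch, n) forcing term of the governing equation
    dx             : grid spacing; lam weights the PDE-residual penalty
    """
    data_loss = torch.mean((u_pred - u_true) ** 2)
    # central-difference second derivative on interior gridpoints
    u_xx = (u_pred[:, 2:] - 2.0 * u_pred[:, 1:-1] + u_pred[:, :-2]) / dx ** 2
    residual = -u_xx - f[:, 1:-1]
    return data_loss + lam * torch.mean(residual ** 2)
```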

These approaches yield neural operators that are not merely post hoc explainable, but inherently interpretable by design.
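As a complementary diagnostic for the equivariance constraints above, one can measure how far a trained operator is from commuting with a group action, here a 90° grid rotation. This is a generic check under the assumption of a (batch, channels, H, W) field layout, not the Group-FNO construction itself.

```python
import torch

def rotation_equivariance_error(model, a):
    """Relative norm of rot(model(a)) - model(rot(a)) for a 90-degree rotation.

    model : maps (batch, C, H, W) input fields to (batch, C, H, W) output fields
    a     : sample input fields; an exactly equivariant operator returns 0.
    """
    rot = lambda f: torch.rot90(f, k=1, dims=(-2, -1))
    u = model(a)
    err = torch.linalg.vector_norm(rot(u) - model(rot(a)))
    return (err / torch.linalg.vector_norm(u)).item()
```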

4. Quantitative and Qualitative Evaluation of Dynamical Interpretability

Empirical validation of dynamical interpretability requires both quantitative metrics and qualitative inspection:

| Task/Data (Tested Model) | ERF Structure | Quantitative Error | Special Insight |
| --- | --- | --- | --- |
| Wave equation (CNO, GT) | Recovers arc of front | CNO: matches PDE | Spatial ERF reveals wavefront |
| Navier–Stokes (CNO) | Expanding radius; localization with lower ν | — | ERF matches advection & eddy scales |
| Darcy, Helmholtz (FNO+Conv) | Improved across frequency bands | FNO₍3×3₎: 0.51% vs FNO: 0.69% (Darcy) | High-/low-frequency error reduction |
| Allen–Cahn | Dual-space outperforms all | FNO₍3×3₎: 0.19% | Captures both interfaces and bulk |

These results demonstrate that high dynamical interpretability is concomitant with predictive accuracy, and that hybrid models with built-in structure realize both objectives most effectively (Gao et al., 3 Oct 2025).

5. Limitations and Special Cases

The methodology faces critical limitations:

  • Analytical ERFs are intractable for nonlinear or nonconstant-coefficient PDEs, restricting closed-form interpretability to linear or well-characterized systems.
  • Numerically computed ERFs capture only spatial dependencies in grid-based or hybrid models, missing temporal coupling or conserved-invariant structures.
  • Functional (basis) models do not offer spatial ERFs, so only global or modal interpretability is possible—these cannot reveal localized, physically meaningful causal chains with respect to perturbation.
  • For tasks without spatial coupling (e.g., pure reaction kinetics), ERFs provide no insight.

A generalizable, model-agnostic explanation framework for learned operators remains an open challenge (Gao et al., 3 Oct 2025).

6. Comparative Perspectives and Extensions

Dynamical interpretability extends beyond neural operators:

  • Port-Hamiltonian (p-H) Networks directly encode energy storage, dissipation, and interconnection via interpretable constructs, ensuring global passivity and modular causal analysis (Matei et al., 2020).
  • Feature-based echo state networks (ESNs) link block-structured reservoirs to input-feature attribution, providing interpretable weights that quantify each feature's dynamical relevance (Goswami, 28 Mar 2024).
  • DEIM-based Diagnostics offer interpretable spatial modes and highlight on-the-fly model failures by tracking adaptively selected important locations (Kim et al., 22 Oct 2025).
  • Bayesian and Causal Models decompose the Markovian transition structure, explicitly representing uncertainties, regime switches, and learned causal graphs over time (Kaiser et al., 2019, Zhao et al., 2023).
  • Low-dimensional phase portrait and archetype analysis identify and align latent motifs (attractors, slow manifolds, limit cycles) with canonical computational forms, quantifying similarity via diffeomorphism-aware metrics and tracking compositional changes in model learning (Sagodi et al., 7 Jul 2025, Guilhot et al., 31 Oct 2024).

Each of these frameworks produces dynamical models whose internal structure is accessible to domain experts and whose components map onto physically or biologically meaningful primitives.

7. Open Problems and Future Directions

While progress has been made, several obstacles remain:

  • Generalizable post hoc interpretability methods for arbitrary data-driven dynamical models, particularly in high-dimensional, non-grid, or irregular domains.
  • Integration of physics-based interpretability principles (symmetry, conservation) in end-to-end differentiable settings.
  • Quantifying and benchmarking interpretable architectures against black boxes in large-scale, multi-modal problems.
  • Formalizing the relationship between a model’s dynamical interpretability and its out-of-domain generalization.

The field strongly advocates for the principled design of operator architectures, explanation modules (such as ERF/spectral sensitivity), and physics-informed inductive biases to render machine-learned dynamical systems transparent, robust, and physically meaningful (Gao et al., 3 Oct 2025).
