
Geometric Hamiltonian Neural Networks

Updated 11 December 2025
  • GeoHNNs are neural architectures that integrate symplectic, Riemannian, and Lie group constraints to accurately model Hamiltonian dynamics.
  • They employ parameterization strategies such as direct networks, SPD-manifold models, and graph-based methods to guarantee energy conservation and long-term predictive stability.
  • Training leverages geometric loss terms and symplectic integrators to enforce invariance and achieve scalability with zero-shot generalization.

Geometric Hamiltonian Neural Networks (GeoHNNs) are a class of neural architectures that rigorously encode physical symmetries and geometric structures of Hamiltonian systems within the neural modeling of dynamical phenomena. By incorporating symplectic and Riemannian geometric constraints—often at the level of architecture, parameterization, or training objective—GeoHNNs are designed to address the limitations of generic machine learning models when applied to physical systems, particularly issues of energy drift, instability, and failure to generalize due to ignorance of underlying conservation laws and symmetries. GeoHNNs have been instantiated across Euclidean, statistical, and manifold-structured phase spaces; in both low- and high-dimensional regimes; and for both continuous and graph-structured domains. They regularly outperform non-geometric baselines in long-term predictive stability, energy conservation, and scalability.

1. Foundational Principles: Hamiltonian Mechanics and Geometric Priors

GeoHNNs are grounded in the canonical formulation of Hamiltonian systems: a phase space $\mathcal{M} \cong \mathbb{R}^{2n}$ equipped with the canonical symplectic form

$$\omega = \sum_{i=1}^n dq^i \wedge dp_i,$$

and a Hamiltonian function $H(q, p)$ generating the dynamics

$$\dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}.$$
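As a minimal illustration (a sketch assuming a PyTorch-style autograd API; the function name `hamiltonian_vector_field` and the toy oscillator Hamiltonian are illustrative, not taken from the cited works), the vector field above can be obtained from any differentiable scalar $H$, including a learned network $H_\theta$, by automatic differentiation:

```python
import torch

def hamiltonian_vector_field(H, q, p):
    """Evaluate Hamilton's equations (dq/dt, dp/dt) for a differentiable
    scalar Hamiltonian H(q, p) via reverse-mode autodiff.
    q, p are leaf tensors of shape (batch, n)."""
    q = q.requires_grad_(True)
    p = p.requires_grad_(True)
    energy = H(q, p).sum()
    dH_dq, dH_dp = torch.autograd.grad(energy, (q, p), create_graph=True)
    return dH_dp, -dH_dq  # dq/dt = dH/dp, dp/dt = -dH/dq

# Toy example: unit-mass harmonic oscillator; a learned network H_theta(q, p)
# could be dropped in for this closed-form Hamiltonian.
H = lambda q, p: 0.5 * (p ** 2 + q ** 2).sum(dim=-1)
q0, p0 = torch.tensor([[1.0]]), torch.tensor([[0.0]])
q_dot, p_dot = hamiltonian_vector_field(H, q0, p0)  # -> (0.0, -1.0)
```

Because both components derive from a single scalar, the resulting field is Hamiltonian by construction, which is the property the architectures in Section 2 exploit.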

Geometric structure is typically incorporated as follows:

  • Symplectic geometry: Ensures that model flows preserve phase space volume and energy (exact or up to a controlled numerical error), either by direct differentiation of a scalar Hamiltonian or via symplectic integration schemes (David et al., 2021, Zhu et al., 2020, Tong et al., 2020).
  • Riemannian geometry: The configuration-dependent inertia matrix $M(q)$ is modeled as a point on the SPD manifold $S_{++}^n$ with the affine-invariant metric, guaranteeing positive-definiteness and geometric fidelity (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025).
  • Symmetry constraints: Automating the detection and enforcement of Lie group symmetries (e.g., translations, rotations) using learnable Lie algebra elements, or building equivariances directly into the parameterization (Dierkes et al., 2023, Rahma et al., 6 Jun 2025).

Parameterization choices such as Kolmogorov–Arnold decompositions (Wu et al., 26 Aug 2025), Taylor series expansions (Tong et al., 2020), or message-passing on graphs (Rahma et al., 6 Jun 2025, Kang et al., 2023) are selected to align with problem geometry.
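To make the Kolmogorov–Arnold route concrete, the following hedged sketch composes $H$ from sums of learned univariate functions; small MLPs stand in here for the B-spline modules of (Wu et al., 26 Aug 2025), and the class name `KAHamiltonian` is illustrative:

```python
import torch
import torch.nn as nn

class KAHamiltonian(nn.Module):
    """Kolmogorov-Arnold style Hamiltonian:
    H(x) = sum_q Phi_q( sum_p phi_{q,p}(x_p) ),
    where every phi_{q,p} and Phi_q is a learned univariate function
    and x concatenates (q, p)."""

    def __init__(self, dim, n_terms=None, hidden=32):
        super().__init__()
        self.dim = dim
        self.n_terms = n_terms or 2 * dim + 1  # classical Kolmogorov-Arnold count

        def univariate():
            return nn.Sequential(nn.Linear(1, hidden), nn.Tanh(), nn.Linear(hidden, 1))

        self.inner = nn.ModuleList(
            nn.ModuleList(univariate() for _ in range(dim)) for _ in range(self.n_terms)
        )
        self.outer = nn.ModuleList(univariate() for _ in range(self.n_terms))

    def forward(self, x):  # x: (batch, dim)
        H = 0.0
        for i in range(self.n_terms):
            s = sum(self.inner[i][j](x[:, j : j + 1]) for j in range(self.dim))
            H = H + self.outer[i](s)
        return H.squeeze(-1)  # one scalar energy per batch element

model = KAHamiltonian(dim=2)          # e.g., a 1-DoF system with x = (q, p)
energy = model(torch.randn(4, 2))     # -> shape (4,)
```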

2. Model Architectures and Parameterization Strategies

GeoHNN architectures reflect the geometric requirements of the target physical domain:

  • Direct Hamiltonian Networks: $H_\theta(q, p)$ is parameterized by a neural network (e.g., MLP, B-spline modules) and its gradient is used to define the vector field, ensuring intrinsic symplecticity (Zhu et al., 2020, Wu et al., 26 Aug 2025).
  • SPD-Manifold Inertia Modeling: Inertia is represented as $M(q)^{-1} = \operatorname{Exp}_{M_0}(N_M(q; \theta_M))$, where $\operatorname{Exp}_{M_0}$ is the manifold exponential mapping a predicted tangent vector to $S_{++}^n$, enforcing positive-definiteness by construction (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025); a minimal sketch appears after this list.
  • Symplectic Autoencoders and ROMs: High-dimensional systems are encoded into a latent space via a symplectic (biorthogonal) autoencoder. This enforces symplecticity on the reduced space via conditions such as $\Psi_l^T \Phi_l = I$ (where $\Phi_l, \Psi_l$ are encoder/decoder weights), with both reconstruction and dynamics enforced by loss design and optimization on the biorthogonal/SPD manifolds (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025).
  • Graph-Based Models: For N-body systems, each node/particle is encoded with translation- and rotation-invariant coordinates. Hamiltonian graph neural networks employ permutation-invariant message passing and produce a graph-parametrized Hamiltonian (Rahma et al., 6 Jun 2025, Kang et al., 2023).
  • Kolmogorov–Arnold Representation: A sum of univariate neural modules realizes the Hamiltonian in a functional decomposition, improving adaptation to high-frequency or multi-scale dynamics (Wu et al., 26 Aug 2025).
  • Statistical Manifold Models: Networks defined directly on a statistical manifold (e.g., lognormal) harness the manifold's geometry; affine transformations and activations emerge from the symplectic and Lie group structure of the underlying parameter space (Assandje et al., 30 Sep 2025).
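The SPD-manifold inertia model referenced above can be sketched as follows (an assumption-laden illustration: the anchor point is taken as $M_0 = I$, so the affine-invariant exponential map reduces to a matrix exponential of a symmetrized network output; the class name and layer sizes are placeholders, not the cited architecture):

```python
import torch
import torch.nn as nn

def sym(A):
    """Symmetrize a batch of square matrices."""
    return 0.5 * (A + A.transpose(-1, -2))

class SPDInertia(nn.Module):
    """Predict the inverse inertia M(q)^{-1} as a point on the SPD manifold.
    The network emits a tangent vector (a symmetric matrix); with the anchor
    M0 = I, the affine-invariant exponential map is just the matrix
    exponential, so the output is symmetric positive-definite by construction."""

    def __init__(self, q_dim, n, hidden=64):
        super().__init__()
        self.n = n
        self.net = nn.Sequential(nn.Linear(q_dim, hidden), nn.Tanh(), nn.Linear(hidden, n * n))

    def forward(self, q):                       # q: (batch, q_dim)
        V = self.net(q).view(-1, self.n, self.n)
        V = sym(V)                              # tangent vector at the identity
        return torch.matrix_exp(V)              # SPD for every input q

inv_inertia = SPDInertia(q_dim=3, n=3)
M_inv = inv_inertia(torch.randn(5, 3))          # batch of 3x3 SPD matrices
assert (torch.linalg.eigvalsh(M_inv) > 0).all() # strictly positive spectrum
```

Since the matrix exponential of a symmetric matrix is always symmetric positive-definite, no projection or penalty term is needed to maintain the constraint during training.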

3. Training Procedures and Geometric Enforcement

Training strategies for GeoHNNs are distinguished by explicit preservation of the symplectic structure and symmetry constraints:

  • Symplectic Loss and Integrators: Loss terms compare true and predicted dynamics using trajectories propagated with symplectic integrators (e.g., symplectic Euler, implicit midpoint, Forest–Ruth), ensuring that the discrete-time learning process matches the true or a modified Hamiltonian flow (David et al., 2021, Zhu et al., 2020, Tong et al., 2020); a minimal training-step sketch follows this list.
  • Geometric Constraints in Optimization: Optimization methodologies include Riemannian Adam for weights on SPD and biorthogonal manifolds; natural gradient descent on the statistical manifold with respect to the Fisher–Rao metric (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025, Assandje et al., 30 Sep 2025).
  • Random Feature Parameterization: In Hamiltonian graph networks, random-feature (ELM or SWIM) initialization with a convex linear readout allows training up to $600\times$ faster than gradient-based methods while maintaining symmetry and accuracy, and achieves zero-shot scalability to massive graphs (Rahma et al., 6 Jun 2025).
  • Symmetry Losses: Joint minimization over the Hamiltonian network and Lie algebra generators regularizes the model towards symmetry invariance; loss terms penalize deviation of the Hamiltonian from invariance under the candidate symmetry group actions (Dierkes et al., 2023).
  • Endpoint-Only Supervision: Some approaches (e.g., Taylor-nets) rely on sparse supervision using only initial and final trajectory data, while the symplectic structure is enforced architecturally and through the integrator (Tong et al., 2020).
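A hedged sketch of a single training step in this spirit (assuming a separable learned Hamiltonian and a semi-implicit/symplectic Euler step; the one-step supervision shown here is a simplification of the trajectory and endpoint losses used in the cited works):

```python
import torch

def symplectic_euler_step(H, q, p, dt):
    """One semi-implicit (symplectic) Euler step for a learned Hamiltonian H.
    For a separable H(q, p) = T(p) + V(q) the update below is explicit and
    exactly symplectic; q, p are leaf tensors of shape (batch, n)."""
    q = q.requires_grad_(True)
    p = p.requires_grad_(True)
    dH_dq = torch.autograd.grad(H(q, p).sum(), q, create_graph=True)[0]
    p_next = p - dt * dH_dq                                   # kick: momenta first
    dH_dp = torch.autograd.grad(H(q, p_next).sum(), p_next, create_graph=True)[0]
    q_next = q + dt * dH_dp                                   # drift: then positions
    return q_next, p_next

def one_step_loss(H, q0, p0, q1, p1, dt):
    """Match one observed transition (q0, p0) -> (q1, p1) under the learned flow."""
    q_pred, p_pred = symplectic_euler_step(H, q0, p0, dt)
    return ((q_pred - q1) ** 2 + (p_pred - p1) ** 2).mean()
```

Calling `one_step_loss(...).backward()` and stepping an optimizer on the parameters of $H_\theta$ (Riemannian Adam when weights are constrained to SPD or biorthogonal manifolds, as noted above) completes the update.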

4. Empirical Performance and Benchmarking

GeoHNNs demonstrate consistent improvements in accuracy, long-term stability, and conservation compared to non-geometric baselines:

| System | Baseline error/drift | HNN error/drift | GeoHNN error/drift | Reference |
| --- | --- | --- | --- | --- |
| Mass–spring (low-dim) | error $10^{-2} \to 10^{0}$ | $10^{-3}$–$10^{-2}$ | error $10^{-3}$, drift $10^{-2}$ | (Aboussalah et al., 21 Jul 2025) |
| 3 coupled oscillators | error $\to 10^{0}$ | $2$–$5 \times 10^{-2}$ | $10^{-2}$ | (Aboussalah et al., 21 Jul 2025) |
| Two-body problem | error $\to 10^{2}$, drift $10^{3}$ | error $10^{1}$, drift $10^{2}$ | error $1$, drift $10^{-1}$ | (Aboussalah et al., 21 Jul 2025) |
| High-dim cloth, 501 DoF | — | pos $1.82$, mom $6.26$ | pos $0.18$, mom $1.05$ | (Aboussalah et al., 21 Jul 2025) |
| N-body graph dynamics | N/A | N/A | MSE $9 \times 10^{-5}$, $<1\%$ error for $10^{4}$ nodes | (Rahma et al., 6 Jun 2025) |
| Stability (HDG, 64 layers) | $>40\%$ accuracy drop (GAT) | — | $<2\%$ drop (stable) | (Kang et al., 2023) |

Performance improvements are particularly marked for long-term rollouts (stability over $t \sim 50$ and beyond), high-dimensional reductions, and out-of-distribution generalization (e.g., zero-shot to thousands of nodes (Rahma et al., 6 Jun 2025)).

5. Symmetry, Invariance, and Adaptivity

Built-in enforcement and/or discovery of physical symmetries are defining features of GeoHNNs:

  • Permutation, rotation, translation invariance: Achieved via message-passing aggregation, invariant coordinate transforms, and node/edge embeddings in graph networks (Rahma et al., 6 Jun 2025).
  • Automatic symmetry detection: Joint learning of symmetry generators and energy via a Lie algebraic framework, without custom equivariant layers but enforced by the loss (Dierkes et al., 2023); a schematic loss sketch appears after this list.
  • Manifold adaptivity: GeoHNNs (notably on graphs) learn and interpolate arbitrary geometries (curvature, topology, etc.), handling mixtures (e.g., hyperbolic, Euclidean graphs) absent any prior parametric assumption (Kang et al., 2023).
  • Statistical manifolds: Embedding networks on spaces such as the lognormal manifold clarifies the geometric origin of the network weights, activations, and transformations, leading to intrinsically interpretable architectures (Assandje et al., 30 Sep 2025).
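Schematically, the symmetry penalty of (Dierkes et al., 2023) can be sketched as below (an illustration under stated assumptions: a learnable matrix generator acts linearly on both $q$ and $p$, and the loss penalizes the first-order change of $H$ along that action; the paper's exact parameterization and group action may differ):

```python
import torch
import torch.nn as nn

class SymmetryLoss(nn.Module):
    """Penalize non-invariance of a learned Hamiltonian H under a learnable
    linear (Lie algebra) generator A acting on configurations and momenta:
    L_sym = E[ (dH/dq . (A q) + dH/dp . (A p))^2 ].
    A vanishing penalty means H is invariant, to first order, under exp(eps*A)."""

    def __init__(self, dim):
        super().__init__()
        self.A = nn.Parameter(0.1 * torch.randn(dim, dim))  # candidate generator

    def forward(self, H, q, p):                  # q, p: leaf tensors, (batch, dim)
        q = q.requires_grad_(True)
        p = p.requires_grad_(True)
        dH_dq, dH_dp = torch.autograd.grad(H(q, p).sum(), (q, p), create_graph=True)
        directional = (dH_dq * (q @ self.A.T)).sum(-1) + (dH_dp * (p @ self.A.T)).sum(-1)
        return (directional ** 2).mean()
```

In practice the candidate generator is normalized or regularized (e.g., fixed Frobenius norm) so that the trivial solution $A = 0$ is excluded, and this penalty is minimized jointly with the dynamics loss.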

6. Extensions, Scalability, and Future Directions

Ongoing research and open directions for GeoHNNs include:

  • High-dimensional and structured systems: Symplectic ROMs/autoencoders enable scalable embedding/reduction for >500 DoF systems (e.g., deformable cloth, vortex lattices), maintaining symplectic structure and outperforming black-box ROMs (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025).
  • Random feature models: Show dramatic speedups and zero-shot scaling; future extensions may combine random features with attention mechanisms or handle time-varying topologies (Rahma et al., 6 Jun 2025).
  • Dissipation, control, stochasticity: Extensions to Rayleigh-damped, controlled, and stochastic settings are feasible using additional geometric constraints, port-Hamiltonian formalisms, or stochastic manifolds (Aboussalah et al., 21 Jul 2025, Friedl et al., 29 Sep 2025).
  • Function-approximation theory: Kolmogorov–Arnold decompositions unlock interpretable, localized function spaces, especially for stiff or multi-scale systems (Wu et al., 26 Aug 2025).
  • Computational cost: Symplectic and Riemannian constraints increase compute time (3–12× in some benchmarks), but yield orders-of-magnitude improvements in fidelity, stability, and generalization (Aboussalah et al., 21 Jul 2025).
  • Generalization: GeoHNNs routinely achieve zero-shot scaling and robust generalization from small or low-dimensional training to massive systems (Rahma et al., 6 Jun 2025, Tong et al., 2020).

7. Summary and Significance

GeoHNNs explicitly embed the geometry of physical law—symplectic and Riemannian constraints, symmetry invariances, and manifold reductions—at the core of neural dynamics modeling. Empirical studies demonstrate pronounced improvements over traditional neural and non-geometric physics-informed networks: reduced energy drift, improved trajectory accuracy, long-term stability, and enhanced generalization. By closely matching the mathematical and physical structures of underlying systems, GeoHNNs establish a paradigm shift towards geometric learning for complex dynamical phenomena (Aboussalah et al., 21 Jul 2025, Rahma et al., 6 Jun 2025, Friedl et al., 29 Sep 2025, David et al., 2021, Dierkes et al., 2023, Zhu et al., 2020, Tong et al., 2020, Kang et al., 2023, Wu et al., 26 Aug 2025, Assandje et al., 30 Sep 2025).
