Hamiltonian ODE Graph Networks

Updated 12 January 2026
  • Hamiltonian ODE Graph Networks are advanced neural architectures that integrate Hamiltonian mechanics with graph-based learning using ODE integration.
  • They parameterize node transitions with learnable energy functions and leverage symplectic structures to ensure energy conservation and stability.
  • HOGNs enable practical applications in node classification, link prediction, and physical simulation while outperforming traditional GNNs in robustness and expressivity.

Hamiltonian ODE Graph Networks (HOGN) are a class of neural architectures that integrate Hamiltonian mechanics into graph-based learning frameworks. By parameterizing node feature transitions as ODEs governed by learnable Hamiltonian energy functions, HOGN architectures capture both the relational structure of graphs and the geometric or physical invariants of dynamical systems. HOGNs serve as a general bridge between geometric deep learning on graphs, nonlinear ODE flows, and the inductive bias of energy-conserving dynamics, enabling enhanced adaptability, stability, and interpretability in applications ranging from node embedding to physical simulation (Kang et al., 2023, Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019).

1. Hamiltonian Formulation in Graph Neural Networks

HOGNs formalize node embedding or dynamical evolution on graphs as Hamiltonian flow on a phase space. Each node $n$ is embedded as a tuple $(q_n, p_n) \in \mathbb{R}^{2d}$ consisting of a position vector $q_n$ and an associated momentum $p_n$. The momentum may be computed from the position via a trainable map $Q_\phi(q_n)$, typically implemented as an MLP. The system is endowed with a learnable, smooth scalar Hamiltonian function $H_\theta: \mathbb{R}^{2d} \to \mathbb{R}$ (per node or globally), which generalizes the total-energy concept from classical mechanics. Canonical Hamiltonian forms include metric-based quadratics (e.g., $H(q,p) = \frac{1}{2} p^\top g_\psi(q) p$ with $g_\psi$ a learnable diagonal matrix), unconstrained MLP parameterizations, and convex or relaxed Hamiltonians allowing for non-conservative phenomena (Kang et al., 2023).
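The following is a minimal PyTorch sketch of this phase-space setup, not taken from the cited papers: a trainable momentum map $Q_\phi$ paired with an unconstrained MLP Hamiltonian $H_\theta$. The class name, hidden width, and activation are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NodeHamiltonian(nn.Module):
    """Illustrative phase-space setup: momentum map Q_phi and scalar Hamiltonian H_theta."""

    def __init__(self, dim, hidden=64):
        super().__init__()
        # Q_phi: maps a node position q_n to its momentum p_n.
        self.q_to_p = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        # H_theta: unconstrained MLP Hamiltonian on the concatenated state (q_n, p_n).
        self.energy = nn.Sequential(nn.Linear(2 * dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def init_state(self, q):
        # z_n = (q_n, p_n) in R^{2d}, with p_n = Q_phi(q_n).
        return torch.cat([q, self.q_to_p(q)], dim=-1)

    def forward(self, z):
        # Total energy: sum of per-node scalar Hamiltonians.
        return self.energy(z).sum()
```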

A fixed symplectic structure, encoded by the Poisson matrix $J = \begin{pmatrix} 0 & I_d \\ -I_d & 0 \end{pmatrix}$, defines the canonical bracket and underpins energy conservation. Some extensions learn $J$ or more general skew-symmetric matrices, leveraging the Darboux theorem to guarantee local equivalence to canonical form (Kang et al., 2023, Liu et al., 2023).

2. Continuous-Time Node Updates via ODE Integration

The time evolution of node states is governed by the Hamiltonian ODE:

\frac{dz_n}{dt} = J \nabla_{z_n} H_\theta(z_n)

with component-wise updates $\dot q_n^i = \frac{\partial H_\theta}{\partial p_{n,i}}$ and $\dot p_{n,i} = -\frac{\partial H_\theta}{\partial q_n^i}$. For multi-node graphs, the states are stacked and integrated simultaneously. Notably, in some HOGN instantiations, the ODE update for each node is independent; graph topology only enters through post-ODE aggregation (Kang et al., 2023), while in physics-motivated systems, edge interactions are embedded in the Hamiltonian, ensuring message passing during ODE evolution (Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019).
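A compact autograd sketch of this vector field under the canonical $J$, assuming a scalar-valued Hamiltonian module such as the one above; the function name and tensor shapes are illustrative.

```python
def hamiltonian_vector_field(H, z, dim):
    """dz/dt = J grad_z H(z) for stacked node states z of shape (N, 2*dim), canonical J."""
    if not z.requires_grad:
        z = z.requires_grad_(True)
    grad = torch.autograd.grad(H(z), z, create_graph=True)[0]
    dH_dq, dH_dp = grad[..., :dim], grad[..., dim:]
    # Canonical symplectic structure: dq/dt = +dH/dp, dp/dt = -dH/dq.
    return torch.cat([dH_dp, -dH_dq], dim=-1)
```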

ODE integration is performed per layer, using methods such as explicit Euler, Runge–Kutta, or symplectic leapfrog/velocity-Verlet. While Euler integration is not strictly symplectic, HOGNs empirically retain sufficient conservation for stability advantages (Kang et al., 2023).
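As an illustration of the symplectic option, here is a sketch of one leapfrog (velocity-Verlet) step. It is only guaranteed symplectic for a separable Hamiltonian $H(q,p) = T(p) + V(q)$, and the gradient helper detaches the state, so this version targets inference rollouts rather than end-to-end training.

```python
def leapfrog_step(H, q, p, dt):
    """One leapfrog / velocity-Verlet step for a (separable) Hamiltonian H(q, p)."""
    def grads(q, p):
        # Gradients of H with respect to q and p (detached: rollout use only).
        q = q.detach().requires_grad_(True)
        p = p.detach().requires_grad_(True)
        dH_dq, dH_dp = torch.autograd.grad(H(torch.cat([q, p], dim=-1)), (q, p))
        return dH_dq, dH_dp

    dH_dq, _ = grads(q, p)
    p_half = p - 0.5 * dt * dH_dq        # half kick on momenta
    _, dH_dp = grads(q, p_half)
    q_next = q + dt * dH_dp              # full drift on positions
    dH_dq, _ = grads(q_next, p_half)
    p_next = p_half - 0.5 * dt * dH_dq   # second half kick
    return q_next, p_next
```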

3. Hamiltonian Parameterization and Symplectic Structure

HOGN architectures offer substantial flexibility in parameterizing $H_\theta$:

  • Metric-based (geometric) Hamiltonian: $H(q,p) = \frac{1}{2} p^\top g_\psi(q) p$, where $g_\psi(q)$ is a learnable diagonal matrix (positive or indefinite, via signature hyperparameters), guaranteeing geodesic-like or cogeodesic flows; see the sketch after this list.
  • Unconstrained/FC Hamiltonian: $H(q,p) = \operatorname{MLP}_\theta([q \| p])$, omitting explicit geometric priors.
  • Convex Hamiltonian: Ensures convexity in $(q, p)$ via non-negative weights and monotone activations, facilitating a dual Lagrangian interpretation.
  • Relaxed Hamiltonian: Augments the momentum equation, e.g., $\dot p = -\partial_q H + f_\phi(q)$, to accommodate energy dissipation or external forcing.
  • Learnable symplectic form: Replaces the fixed $J$ with a data-dependent skew-symmetric matrix, though, by Darboux's theorem, this yields results similar to the canonical $J$ (Kang et al., 2023).
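A minimal sketch of the metric-based option above, continuing the PyTorch imports from the earlier sketch; the hidden width, softplus positivity, and optional signature buffer are illustrative assumptions rather than the published parameterization.

```python
class MetricHamiltonian(nn.Module):
    """H(q, p) = 0.5 * p^T g_psi(q) p with a learnable diagonal metric g_psi (illustrative)."""

    def __init__(self, dim, hidden=64, signature=None):
        super().__init__()
        self.g = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        # A fixed +/-1 signature per coordinate lets the metric be indefinite.
        self.register_buffer("signature", torch.ones(dim) if signature is None else signature)

    def forward(self, z):
        dim = z.shape[-1] // 2
        q, p = z[..., :dim], z[..., dim:]
        diag = torch.nn.functional.softplus(self.g(q)) * self.signature  # positive diagonal times signature
        return 0.5 * (p * diag * p).sum()
```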

SAH-GNN further advances this direction by optimizing a layer-specific, learnable symplectic matrix $S$ on the symplectic Stiefel manifold $\mathrm{St}^J$, maintaining $S^\top J S = J$ through Riemannian optimization, thus allowing the symplectic geometry to adapt to the data (Liu et al., 2023).
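To make the constraint concrete, here is a small hypothetical helper (not the SAH-GNN implementation) that builds the canonical $J$ and checks whether a candidate matrix satisfies $S^\top J S = J$.

```python
def canonical_J(d):
    """Canonical Poisson matrix J = [[0, I_d], [-I_d, 0]] of size 2d x 2d."""
    I, Z = torch.eye(d), torch.zeros(d, d)
    return torch.cat([torch.cat([Z, I], dim=1), torch.cat([-I, Z], dim=1)], dim=0)

def is_symplectic(S, atol=1e-5):
    """Checks the symplectic constraint S^T J S = J maintained by SAH-GNN's optimization."""
    J = canonical_J(S.shape[0] // 2)
    return torch.allclose(S.T @ J @ S, J, atol=atol)
```

For instance, `is_symplectic(canonical_J(3))` returns True, since $J$ itself is a symplectic matrix.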

4. Incorporation into Graph Neural Networks

HOGN is implemented as a stack of Hamiltonian ODE layers, each followed by neighborhood aggregation. After ODE integration, node positions $\tilde q_n$ are updated via standard GNN schemes, such as one-hop averaging:

q_n^{(\ell+1)} = \tilde q_n + \frac{1}{|\mathcal{N}(n)|} \sum_{m \in \mathcal{N}(n)} \tilde q_m

This produces the next-layer input for multi-layer architectures. For message-passing physical simulators, the Hamiltonian is decomposed into nodewise (kinetic) and edgewise (potential) contributions using GNN blocks, ensuring permutation invariance and relational expressivity (Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019).
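A sketch of the post-ODE one-hop averaging update above, using a standard edge_index representation; the helper name and shape conventions are assumptions, not the papers' code.

```python
def one_hop_average_update(q_tilde, edge_index):
    """q_n^(l+1) = q~_n + mean of q~_m over neighbors m, given directed edges (src, dst)."""
    src, dst = edge_index                                       # each of shape (E,)
    agg = torch.zeros_like(q_tilde).index_add_(0, dst, q_tilde[src])
    deg = torch.zeros(q_tilde.size(0), dtype=q_tilde.dtype, device=q_tilde.device)
    deg = deg.index_add_(0, dst, torch.ones(dst.size(0), dtype=q_tilde.dtype, device=q_tilde.device))
    # Assumes every node has at least one in-neighbor; the clamp avoids division by zero otherwise.
    return q_tilde + agg / deg.clamp(min=1.0).unsqueeze(-1)
```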

Training is end-to-end, typically using mean-squared error for physical simulation (predicting $(q,p)$ at the next timestep), cross-entropy for node classification, or binary cross-entropy (with negative sampling) for link prediction (Kang et al., 2023, Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019). Gradients are computed through the ODE solver via the continuous-adjoint method, and optimization employs Adam with weight decay.
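A minimal end-to-end training sketch for the next-state prediction objective. For brevity it backpropagates through a fixed-step Euler unroll rather than the continuous-adjoint solver used in the cited works, and `model`, `loader`, and `dim` are hypothetical names reusing the vector-field sketch above.

```python
# Hypothetical setup: `model` is a scalar Hamiltonian module, `loader` yields (z_t, z_{t+dt}) pairs.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)

def rollout(z, steps=10, dt=0.1):
    # Explicit Euler unroll of dz/dt = J grad H; gradients flow through every step.
    for _ in range(steps):
        z = z + dt * hamiltonian_vector_field(model, z, dim)
    return z

for z0, z_target in loader:
    loss = torch.nn.functional.mse_loss(rollout(z0), z_target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```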

5. Geometric Adaptivity, Stability, and Expressivity

Energy conservation in HOGN follows directly from the Hamiltonian structure: for $\dot z = J \nabla H$, $\frac{d}{dt} H(z(t)) = 0$ holds exactly in continuous time. This endows HOGNs with intrinsic stability: the learned oscillator property prevents feature explosion or collapse and mitigates the oversmoothing endemic to deep GNNs. Empirically, HOGNs maintain classification performance at depths around $L \sim 20$, while standard (Euclidean or hyperbolic) GNNs degrade after four to five layers (Kang et al., 2023).
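The conservation identity can be written out in one line: by the chain rule and the skew-symmetry of the Poisson matrix ($J^\top = -J$),

\frac{d}{dt} H(z(t)) = \nabla H(z)^\top \dot z = \nabla H(z)^\top J \, \nabla H(z) = 0,

since the quadratic form $x^\top J x$ vanishes for every vector $x$ when $J$ is skew-symmetric.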

Learning $g_\psi(q)$ or $H_\theta$ confers geometric adaptivity: the model tailors the embedding space locally as Euclidean, hyperbolic, pseudo-Riemannian, or mixed, accommodating heterogeneous graph curvatures. This feature is essential for node embedding tasks on graphs with diverse intrinsic geometries, including those assembled from disjoint datasets (Kang et al., 2023). The Darboux theorem guarantees that learned symplectic forms can always be locally reduced to canonical form, explaining the empirical similarity between fixed-$J$ and learnable symplectic variants (Kang et al., 2023).

6. Key Experimentation and Empirical Validation

HOGN and its variants have been evaluated on a spectrum of benchmark datasets:

  • Node classification: On low-hyperbolicity graphs (Disease, Airport), HOGNs achieved 91% accuracy versus 89% for the best HGCN. For high-hyperbolicity (Cora), performance matched state-of-the-art (≈82%). On mixed-geometry graphs, HOGN attained ≈95% versus 90% for the nearest baseline (Kang et al., 2023).
  • Link prediction: HOGN delivered 99.9% ROC-AUC on Airport and Citeseer, and 98.2% on Cora, outstripping all baselines including mixed-space and hyperbolic models (Kang et al., 2023).
  • Layerwise robustness: HOGN exhibited stable accuracy at large depth (up to 20+ layers), whereas GCN/HGCN collapsed (e.g., from 80% at 3 layers to 24% at 20 layers on Cora) (Kang et al., 2023).
  • Zero-shot generalization and physics: In physical simulation, HOGN generalizes to larger or hybrid graphs (e.g., training on 5-pendulum systems and testing on 10-pendulum or hybrid spring-pendulum graphs) without explicit retraining. Energy drift and momentum error remain orders of magnitude lower than those of baselines (Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019).
  • Symbolic discovery: The learned Hamiltonian decomposes into interpretable kinetic and potential terms. Symbolic regression on the learned potentials or kinetic energies recovers exact or nearly exact closed-form laws (e.g., Hooke’s law, Lennard-Jones interactions) (Bishnoi et al., 2023).

7. Extensions, Limitations, and Comparative Landscape

Extensions include replacing the fixed Poisson matrix with a dataset-optimized symplectic form (as in SAH-GNN, which optimizes $S \in Sp(2n)$ via Riemannian gradient descent), supporting greater flexibility in data-driven geometry. Limitations cited include the handling of only two-body interactions in certain models; learning non-separable Hamiltonians may partially limit full symplectic integration; and non-conservative system extensions require further work (Bishnoi et al., 2023, Sanchez-Gonzalez et al., 2019). In terms of computational cost, the added ODE integration and symplectic matrix optimization scale as $O(T \cdot k^2)$ and $O(k^3)$ per layer, respectively (Liu et al., 2023).

Comparative analyses indicate that HOGN and SAH-GNN outperform Euclidean, hyperbolic, and mixed-manifold GNNs on a wide suite of tasks. Exact energy preservation and geometric expressivity are consistently cited as explaining the empirical superiority and robustness of HOGN variants (Kang et al., 2023, Liu et al., 2023).
