
Hamiltonian Generative Networks (HGN)

Updated 16 March 2026
  • Hamiltonian Generative Networks are deep generative models that leverage Hamiltonian mechanics to impose symplectic, reversible latent dynamics for stable long-term predictions.
  • They employ a neural parameterization of a Hamiltonian function and use symplectic integrators, such as leapfrog schemes, to ensure volume preservation and exact likelihood evaluation.
  • HGNs have been demonstrated to outperform traditional methods by achieving lower mean-squared error and superior conservation of energy invariants in physical system modeling.

Hamiltonian Generative Networks (HGNs) are a class of deep generative models that leverage canonical Hamiltonian mechanics to impose symplectic and reversible dynamics in a latent space. HGNs and the closely related Symplectic Generative Networks (SGNs) operate by parameterizing a scalar Hamiltonian function with a neural network and integrating latent trajectories using symplectic numerical schemes. This ensures that the generative mapping is invertible, volume-preserving, and exhibits stable long-term dynamics. HGNs have been proposed as a physically-grounded alternative to recurrent models, variational autoencoders, and normalizing flows for both sequence modeling and invertible density estimation from high-dimensional observations such as images (Toth et al., 2019, Aich et al., 28 May 2025).

1. Hamiltonian Latent Dynamics and Symplectic Architecture

HGNs endow the latent space with a canonical symplectic structure via a pairing of variables $z = (q, p) \in \mathbb{R}^{2n}$, where $q, p \in \mathbb{R}^n$ denote “position” and “momentum” components, respectively. The symplectic form is defined by

$$\omega = \sum_{i=1}^n dq_i \wedge dp_i,$$

or in matrix notation, using

$$J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$$

such that $\omega(u, v) = u^\top J v$ for tangent vectors $u, v$.

A neural Hamiltonian $H_\psi: \mathbb{R}^{2n} \rightarrow \mathbb{R}$ parameterizes the energy landscape. Latent dynamics follow Hamilton’s equations:

$$\dot{z}(t) = J \nabla H_\psi(z(t)), \qquad z(0) = z_0.$$

Numerical integration uses a symplectic scheme (e.g., leapfrog/Verlet) to advance $(q_t, p_t) \rightarrow (q_{t+1}, p_{t+1})$, guaranteeing approximate conservation of Hamiltonian invariants and symplectic volume (Toth et al., 2019, Aich et al., 28 May 2025).
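As a concrete illustration, the sketch below implements such a leapfrog update over a learned Hamiltonian in PyTorch. It assumes a separable energy $H_\psi(q, p) = T_\psi(p) + V_\psi(q)$ so that the update is explicit and exactly symplectic; the published HGN learns a generic $H_\psi$, and the module names, layer widths, and step size here are illustrative rather than taken from the paper.

```python
import torch
import torch.nn as nn

class NeuralHamiltonian(nn.Module):
    """Separable learned energy H(q, p) = T(p) + V(q); each term is a small MLP.
    (Hypothetical architecture; separability keeps the leapfrog update explicit.)"""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.T = nn.Sequential(nn.Linear(dim, hidden), nn.Softplus(), nn.Linear(hidden, 1))
        self.V = nn.Sequential(nn.Linear(dim, hidden), nn.Softplus(), nn.Linear(hidden, 1))

    def forward(self, q, p):
        # Scalar output so autograd can supply dH/dq and dH/dp.
        return (self.T(p) + self.V(q)).sum()

def leapfrog_step(H, q, p, dt):
    """One leapfrog (Stormer-Verlet) update (q_t, p_t) -> (q_{t+1}, p_{t+1}).
    q and p must require grad; create_graph=True keeps the step differentiable
    so the integrator can sit inside a training objective."""
    dHdq, = torch.autograd.grad(H(q, p), q, create_graph=True)
    p_half = p - 0.5 * dt * dHdq                    # half kick on momentum
    dHdp, = torch.autograd.grad(H(q, p_half), p_half, create_graph=True)
    q_next = q + dt * dHdp                          # full drift on position
    dHdq, = torch.autograd.grad(H(q_next, p_half), q_next, create_graph=True)
    p_next = p_half - 0.5 * dt * dHdq               # half kick on momentum
    return q_next, p_next

# Roll a latent state forward for 30 steps.
H = NeuralHamiltonian(dim=2)
q = torch.randn(1, 2, requires_grad=True)
p = torch.randn(1, 2, requires_grad=True)
for _ in range(30):
    q, p = leapfrog_step(H, q, p, dt=0.1)
```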

2. Architecture and Training Objectives

The prototypical HGN employs three modules:

  • Encoder: A convolutional network (often ResNet-based) encodes an input sequence $(x_0, \ldots, x_T)$ to a posterior Gaussian over low-dimensional latent initial conditions $S_0 = (q_0, p_0)$. The mean and diagonal covariance are learned functions of the input sequence, with reparameterization employed for stochasticity.
  • Hamiltonian Dynamics: The latent pair $(q_0, p_0)$ is evolved forward using the neural Hamiltonian and symplectic integrator for $T$ steps, producing a trajectory $S_t = (q_t, p_t)$.
  • Decoder: A deconvolutional network maps $q_t$ (not $p_t$) back to observations $x_t$, reflecting the assumption that instantaneous images depend on position but not momentum.

The objective is a VAE-style evidence lower bound (ELBO), combining reconstruction and KL divergence:

$$\mathcal{L} = \frac{1}{T+1}\sum_{t=0}^T \mathbb{E}_{q_\phi(z_0 \mid x_{0:T})} \left[ \ln p_\theta(x_t \mid q_t) \right] - \mathrm{KL}\left[ q_\phi(z_0 \mid x_{0:T}) \,\|\, \mathcal{N}(0, I) \right],$$

where the $q_t$ are obtained by integrating from $z_0$ (Toth et al., 2019, Higgins et al., 2021).
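Assembled end to end, training maximizes this ELBO over rollouts. The skeleton below reuses leapfrog_step from the sketch in Section 1, treats the encoder and decoder as opaque modules, and folds the Gaussian log-likelihood into a mean-squared-error term up to additive constants; all shapes, names, and the fixed step size are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HGN(nn.Module):
    """Skeleton of the three-module HGN pipeline (illustrative, not the paper's code)."""
    def __init__(self, encoder, decoder, hamiltonian, dt=0.1):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.H, self.dt = hamiltonian, dt

    def elbo(self, x_seq):
        """x_seq: (T+1, batch, C, H, W). Returns the ELBO (to be maximized)."""
        mu, logvar = self.encoder(x_seq)                        # posterior over z0 = (q0, p0)
        z0 = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
        q, p = z0.chunk(2, dim=-1)                              # require grad via the encoder
        recon = 0.0
        for t in range(x_seq.shape[0]):
            if t > 0:
                q, p = leapfrog_step(self.H, q, p, self.dt)     # advance latent state
            # ln p(x_t | q_t): Gaussian likelihood over pixels, up to additive constants.
            recon = recon - F.mse_loss(self.decoder(q), x_seq[t])
        recon = recon / x_seq.shape[0]
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return recon - kl
```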

3. Invertibility, Volume Preservation, and Likelihood Evaluation

Hamiltonian flows are symplectic diffeomorphisms, preserving the Liouville volume form:

$$|\det D\Phi^T_H(z)| = 1 \qquad \forall z.$$

Consequently, HGNs enable exact likelihood evaluation without the need for Jacobian determinant terms, unlike standard normalizing flows. The log-likelihood under a deterministic invertible flow simplifies to

$$\log p_X(x) = \log p_Z(z_0),$$

where $z_0 = (\Phi_H^T)^{-1}(x)$. For stochastic decoders or latent-variable models, the change-of-variables formula reduces to an integral with no extra volume correction. This property enables efficient density estimation (Aich et al., 28 May 2025).
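In code, exact density evaluation therefore needs only the base log-density of the recovered $z_0$. The sketch below reuses leapfrog_step from Section 1 and exploits the time symmetry of leapfrog, whose inverse is the same update with a negated step size; the function name and the standard-normal base prior are assumptions.

```python
import torch

def log_likelihood(H, x, n_steps, dt):
    """log p_X(x) for a phase-space point x = (q, p) under the flow x = Phi_H^T(z0):
    invert the flow by running leapfrog with -dt, then score z0 under the base
    density. No log-det-Jacobian term is needed because |det DPhi| = 1."""
    qp = x.detach().clone().requires_grad_(True)
    q, p = qp.chunk(2, dim=-1)
    for _ in range(n_steps):
        q, p = leapfrog_step(H, q, p, -dt)     # run the flow backwards
    z0 = torch.cat([q, p], dim=-1)
    base = torch.distributions.Normal(0.0, 1.0)
    return base.log_prob(z0).sum(dim=-1)       # log p_Z(z0) = log p_X(x)
```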

4. Empirical Performance, Metrics, and Evaluation

HGNs have demonstrated strong performance in modeling physical systems directly from image sequences over long horizons. When compared to alternatives such as the Hamiltonian Neural Network (HNN) applied to pixels (PixelHNN), HGNs achieve significantly lower reconstruction mean-squared error (MSE) and superior conservation of Hamiltonian invariants. For example, average test MSE (×10⁻⁴) over 30-step rollouts:

  • HNN: 6–120
  • PixelHNN: 37–120
  • HGN (leapfrog): 0.6–11

Energy drift (the variance of the learned Hamiltonian along a rollout) is near zero for HGN, indicating successful conservation, while HNN typically collapses to constant-energy solutions (Toth et al., 2019). Enhancements in (Higgins et al., 2021) introduce the binary Symplecticity Metric (SyMetric), which combines regression accuracy ($R^2$) with a measure of (near-)symplecticity (Sym), revealing that only models with high SyMetric attain stable, interpretable, and physically meaningful latent dynamics over thousands of time steps.
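Energy drift of this kind is straightforward to measure: record the learned energy along a rollout and take its variance. A minimal sketch of the diagnostic (not the paper's evaluation code), reusing the Section 1 components:

```python
import torch

def energy_drift(H, q, p, n_steps, dt):
    """Variance of the learned Hamiltonian H along a leapfrog rollout;
    values near zero indicate the integrator conserves the learned energy."""
    energies = []
    for _ in range(n_steps):
        energies.append(H(q, p).item())
        q, p = leapfrog_step(H, q, p, dt)
    return torch.tensor(energies).var().item()
```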

Empirical Comparison Table: HGN vs HGN++

| Dataset | Rec MSE | Ext MSE | VPT | Sym (MLP/PR) | R² (MLP/PR) | Model |
|---|---|---|---|---|---|---|
| Mass-spring | 0.05 | 0.18 | 937 | 0.00*/0.00* | 0.99*/1.00* | HGN++ |
| Mass-spring | 25.07 | 197.67 | 5.75 | 0.68/0.00 | 0.88/0.86 | HGN |
| Mass-spring + c | 1.63 | 2.77 | 567.5 | 0.03*/0.04* | 0.96*/0.95* | HGN++ |
| Mass-spring + c | 25.49 | 126.34 | 8.5 | 0.78/0.00 | 0.52/0.50 | HGN |

Asterisks (*) denote SyMetric = 1 (R² > 0.9, Sym < 0.05) (Higgins et al., 2021).

5. Model Variants, Extensions, and Evaluation Metrics

The original HGN has been improved upon in several respects:

  • HGN++: replaces 2D latent tensors with flat vectors, eliminates superfluous heads for $(q, p)$, adopts a deeper MLP Hamiltonian, switches from GECO to a $\beta$-VAE objective, and supervises both forward and backward rollouts, yielding dramatic increases in stable rollout length across 13 benchmark datasets (Higgins et al., 2021).
  • Neural Hamiltonian Flow (NHF): treats each symplectic integration step as a layer in a flow, adapting HGN for density modelling. NHF achieves expressivity comparable to RealNVP with fewer layers and superior computational efficiency, since the Jacobian contributions are all unity up to $\mathcal{O}(dt^2)$ corrections (Toth et al., 2019).

For quantitative assessment, pixel-MSE is a misleading metric: it fails to discriminate between genuine dynamical learning and degenerate or time-encoding solutions. SyMetric offers a more principled approach by measuring both the informativeness and the symplecticity of the mapping from learnt latents to the true phase space (Higgins et al., 2021); a toy version of such a symplecticity check is sketched below.
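The following is a toy numerical check in the spirit of the Sym component (not the SyMetric pipeline itself): a map is symplectic exactly when its Jacobian $M$ satisfies $M^\top J M = J$, so the Frobenius norm of the residual measures the deviation. The helper name and the wrapped flow, built from the Section 1 sketch, are illustrative.

```python
import torch
from torch.autograd.functional import jacobian

def symplecticity_residual(flow, z, dim):
    """||M^T J M - J||_F, where M is the Jacobian of `flow` at z;
    zero (up to numerics) iff the map is symplectic at z."""
    J = torch.zeros(2 * dim, 2 * dim)
    J[:dim, dim:] = torch.eye(dim)
    J[dim:, :dim] = -torch.eye(dim)
    M = jacobian(flow, z)                   # (2*dim, 2*dim) Jacobian matrix
    return (M.T @ J @ M - J).norm().item()

# Ten leapfrog steps of a learned flow, using the Section 1 components.
dim = 2
H = NeuralHamiltonian(dim)

def flow(z):
    q, p = z[:dim], z[dim:]
    for _ in range(10):
        q, p = leapfrog_step(H, q, p, dt=0.1)
    return torch.cat([q, p])

print(symplecticity_residual(flow, torch.randn(2 * dim), dim))  # ~ 0
```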

6. Theoretical Guarantees and Computational Properties

The SGN/HGN framework delivers several formal guarantees (Aich et al., 28 May 2025):

  • Invertibility and Volume Preservation: The flow $\Phi_H^T$ is a smooth diffeomorphism with unit Jacobian determinant, ensuring both reversibility and tractable likelihoods (see the reversibility check after this list).
  • Universal Approximation: For any $C^1$ volume-preserving diffeomorphism on a compact set, there exists a neural Hamiltonian and sufficiently fine integration that approximates it to arbitrary precision, with error bound $\varepsilon_{\mathrm{SGN}} = \mathcal{O}(T^{-1} n^{-1/(2d)})$.
  • Complexity: Compared to VAEs ($\mathcal{O}(1)$ per sample) and general normalizing flows ($\mathcal{O}(K\,C_J(d))$), SGNs/HGNs operate at $\mathcal{O}(T d)$, with $T$ integration steps and without explicit Jacobian computations.
  • Information Geometry: For exponential-family latent priors, the Hamiltonian flow follows geodesics of the Fisher–Rao metric, unifying statistical and symplectic structures.
  • Stability: Symplectic integrators preserve modified Hamiltonians over exponentially long times; adaptive step size schemes control global error while maintaining structure.
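The reversibility guarantee can be exercised directly: integrate forward, then integrate backward with a negated step size, and the initial state is recovered up to floating-point round-off. A minimal sketch, again reusing the Section 1 components:

```python
import torch

# Forward-then-backward rollout; leapfrog is time-symmetric, so running it
# with -dt exactly inverts the forward flow (up to round-off error).
dim, n_steps, dt = 2, 100, 0.1
H = NeuralHamiltonian(dim)
q0 = torch.randn(1, dim, requires_grad=True)
p0 = torch.randn(1, dim, requires_grad=True)

q, p = q0, p0
for _ in range(n_steps):
    q, p = leapfrog_step(H, q, p, dt)
for _ in range(n_steps):
    q, p = leapfrog_step(H, q, p, -dt)

print((q - q0).abs().max().item())  # ~ 0: the flow is invertible
```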

7. Limitations and Ongoing Directions

HGNs, while successful on controlled physical domains, have several limitations and potential avenues for development:

  • Reliance on simulated or labelled pixel trajectories; extension to real-world or very high-dimensional partial observations may demand more powerful inference (e.g., attention, graph neural networks).
  • Integration of control signals for model-based RL and handling complex, non-separable or gauge-field Hamiltonians.
  • Extension of explicit symplecticity metrics (SyMetric) to Lagrangian-based models, and unification of metric-based and structure-based model selection.
  • A plausible implication is that further architectural advances guided by symplecticity criteria could yield models with interpretable, physically meaningful latents and robust extrapolative generalization (Higgins et al., 2021, Toth et al., 2019).
