Hamiltonian Monte Carlo (HMC) Method

Updated 16 August 2025
  • Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo method that augments the parameter space with momentum variables to enable efficient sampling from high-dimensional distributions.
  • It uses a symplectic geometric framework to construct reversible, volume-preserving transitions that enable coherent global exploration and reduced autocorrelation.
  • By incorporating local metric adaptation and energy conservation, HMC achieves robust Bayesian inference and practical efficiency gains in challenging sampling scenarios.

Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo (MCMC) method that leverages Hamiltonian dynamics derived from symplectic geometry to construct efficient, reversible, and measure-preserving Markov transition kernels for sampling from complex probability distributions. HMC is characterized by its augmentation of the parameter space with auxiliary momentum variables, enabling deterministic and volume-preserving exploration of the extended phase space via Hamilton’s equations. This geometric construction allows HMC to propose long-range, coherent moves aligned with the structure of the target distribution, resulting in rapid mixing and reduced autocorrelation compared to random-walk-based MCMC methods. The mathematical framework primes HMC for rigorous Bayesian inference in high-dimensional, highly correlated, or multimodal settings, providing both theoretical guarantees and practical efficiency gains.

1. Symplectic Geometry and Hamiltonian Dynamics

At the foundational level, HMC operates on the cotangent bundle of the parameter manifold, introducing position coordinates q^i and conjugate momenta p_i. The symplectic structure is defined by the symplectic form

\omega = d\theta = \sum_{i=1}^n dq^i \wedge dp_i

where \theta = -\sum_{i=1}^n p_i\,dq^i is the tautological one-form. Given a smooth Hamiltonian function H(q, p), Hamilton’s equations of motion arise from the identification of a vector field X_H such that

\omega(X_H, \cdot) = dH(\cdot)

resulting in the familiar system

\dot{q}^i = +\frac{\partial H}{\partial p_i}, \quad \dot{p}_i = -\frac{\partial H}{\partial q^i}

Liouville’s theorem ensures the flow conserves the phase-space volume form \Omega = \omega^n: the Hamiltonian vector field is divergence-free, since \partial \dot{q}^i / \partial q^i + \partial \dot{p}_i / \partial p_i = \partial^2 H / \partial q^i \partial p_i - \partial^2 H / \partial p_i \partial q^i = 0. This volume preservation is vital for the invariance of probability measures under the dynamics.
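These properties can be illustrated numerically with the leapfrog (Störmer–Verlet) scheme, the standard symplectic discretization of Hamilton’s equations used in HMC. The sketch below is illustrative and not from the source; the harmonic-oscillator potential V(q) = q²/2 and unit mass are assumptions chosen so the behavior is easy to check.

```python
import numpy as np

def leapfrog(q, p, grad_V, step, n_steps):
    """Symplectic leapfrog integration of Hamilton's equations for
    H(q, p) = V(q) + p^2 / 2 (unit mass): half momentum kick,
    alternating full position/momentum steps, final half kick."""
    q, p = np.copy(q), np.copy(p)
    p -= 0.5 * step * grad_V(q)
    for _ in range(n_steps - 1):
        q += step * p
        p -= step * grad_V(q)
    q += step * p
    p -= 0.5 * step * grad_V(q)
    return q, p

# Illustrative Hamiltonian: harmonic oscillator, V(q) = q^2 / 2.
grad_V = lambda q: q
H = lambda q, p: 0.5 * q @ q + 0.5 * p @ p

q0, p0 = np.array([1.0]), np.array([0.0])
q1, p1 = leapfrog(q0, p0, grad_V, step=0.01, n_steps=1000)

# Energy is conserved up to O(step^2) discretization error.
energy_error = abs(H(q1, p1) - H(q0, p0))

# Reversibility: flipping the momentum and integrating again
# returns (almost exactly) to the starting point.
q2, p2 = leapfrog(q1, -p1, grad_V, step=0.01, n_steps=1000)
```

The momentum flip in the reversibility check is exactly the operation the Markov-kernel construction below relies on.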

2. Construction of the Markov Transition Kernel

HMC constructs a joint target measure

\pi(q, p) \propto \exp[-H(q, p)]\,\Omega

The exact Hamiltonian flow preserves both H and \Omega, delivering a deterministic, reversible evolution. Reversibility and detailed balance are enforced via a momentum flip at the trajectory endpoint, typically paired with a Metropolis–Hastings acceptance step that corrects for the discretization error of the numerical integrator. This mechanism ensures that transitions are measure-preserving and avoid the diffusive exploration associated with random-walk proposals, achieving efficient global movement through the target distribution while maintaining the necessary stationarity and ergodicity properties.
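A minimal sketch of this transition kernel, assuming the common special case of an identity mass matrix (so T(q, p) = p·p/2) and a leapfrog-discretized flow; the target here, a 2-d standard normal, is an illustrative choice, not from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

def hmc_step(q, log_pi, grad_log_pi, step, n_steps):
    """One HMC transition: momentum refresh, leapfrog flow,
    momentum flip, Metropolis-Hastings correction.
    Assumes H(q, p) = -log pi(q) + p.p / 2 (identity mass matrix)."""
    p = rng.standard_normal(q.shape)            # refresh p ~ N(0, I)
    H0 = -log_pi(q) + 0.5 * p @ p               # initial energy
    q_new, p_new = np.copy(q), np.copy(p)
    p_new += 0.5 * step * grad_log_pi(q_new)    # leapfrog trajectory
    for _ in range(n_steps - 1):
        q_new += step * p_new
        p_new += step * grad_log_pi(q_new)
    q_new += step * p_new
    p_new += 0.5 * step * grad_log_pi(q_new)
    p_new = -p_new                              # flip: makes proposal reversible
    H1 = -log_pi(q_new) + 0.5 * p_new @ p_new
    # Accept with prob min(1, exp(H0 - H1)): corrects discretization error.
    if rng.random() < np.exp(H0 - H1):
        return q_new
    return q

# Usage: sample a 2-d standard normal target.
log_pi = lambda q: -0.5 * q @ q
grad_log_pi = lambda q: -q
q = np.zeros(2)
samples = []
for _ in range(2000):
    q = hmc_step(q, log_pi, grad_log_pi, step=0.2, n_steps=10)
    samples.append(q)
samples = np.asarray(samples)
```

Because exact Hamiltonian flow would conserve H, the acceptance probability is near 1 for a well-tuned step size; the accept/reject step only removes the bias introduced by discretization.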

3. General Form and Role of the Hamiltonian

The admissible class of Hamiltonians is determined by symplectic invariance and Markov chain requirements. The general form is

H(q, p) = -\log \pi(q, p) + C

For practically relevant cases where the extended target factorizes into a marginal for positions and a conditional for momenta,

\pi(q, p) = \pi(q)\,\pi(p \mid q)

the Hamiltonian decomposes as

H(q, p) = T(q, p) + V(q) + C

with V(q) = -\log \pi(q) and T(q, p) = -\log \pi(p \mid q). Most implementations use a Gaussian conditional for momenta, in which case T is quadratic in p, up to a log-determinant normalization that accounts for the covariance’s dependence on q and its transformation under reparametrization:

T(q, p) = \frac{1}{2} \sum_{i,j} p_i p_j \Lambda^{ij}(q) - \frac{1}{2} \log |\Lambda(q)|

where \pi(p \mid q) = \mathcal{N}(p \mid 0, \Lambda^{-1}(q)). Key conditions such as T(q, -p) = T(q, p) guarantee reversibility, while proper normalization secures valid Markov kernel behavior.
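A sketch of this Gaussian kinetic energy, with Λ passed in already evaluated at the current q (the particular precision matrix below is illustrative, not from the source). Momenta with covariance Λ⁻¹ are drawn through the Cholesky factor of Λ, avoiding an explicit matrix inverse:

```python
import numpy as np

def kinetic_energy(p, Lam):
    """T(q, p) = 1/2 p^T Lam p - 1/2 log|Lam| for p ~ N(0, Lam^{-1}).
    Lam may depend on q; here it is assumed already evaluated."""
    sign, logdet = np.linalg.slogdet(Lam)
    return 0.5 * p @ Lam @ p - 0.5 * logdet

def sample_momentum(Lam, rng):
    """Draw p ~ N(0, Lam^{-1}): if Lam = L L^T (Cholesky), then
    p = L^{-T} z with z ~ N(0, I) has covariance (L L^T)^{-1} = Lam^{-1}."""
    L = np.linalg.cholesky(Lam)
    z = rng.standard_normal(Lam.shape[0])
    return np.linalg.solve(L.T, z)

rng = np.random.default_rng(1)
Lam = np.array([[2.0, 0.5],        # illustrative precision matrix
                [0.5, 1.0]])
p = sample_momentum(Lam, rng)

# Reversibility condition: T is even in p.
symmetric = np.isclose(kinetic_energy(p, Lam), kinetic_energy(-p, Lam))

# Empirical covariance of many draws approaches Lam^{-1}.
P = np.stack([sample_momentum(Lam, rng) for _ in range(20000)])
emp_cov = np.cov(P.T)
```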

4. Bayesian Inference and Practical Efficiency

In Bayesian computation, the role of HMC is to sample from posterior distributions \pi(q) that may exhibit challenging features—strong correlations, high dimensionality, or multiple modes. The geometric perspective yields several advantages:

  • Geometry-aware Dynamics: Hamiltonian flows align with isocontours of the (negative log) target density, enabling rapid, non-diffusive exploration across high-probability regions.
  • Local Metric Adaptation: The kinetic energy's covariance structure \Lambda(q), interpretable as a Riemannian metric, can be adapted to the local curvature of the posterior, facilitating proposals that are resilient to parameter scaling and non-Euclidean geometry.
  • Volume and Energy Conservation: By conserving the phase volume and (approximately) the Hamiltonian, the kernel achieves detailed balance and robust stationary convergence. Any numerical violations are corrected by the accept/reject step.

These properties produce Markov chains with substantially reduced autocorrelation and increased efficiency in high-dimensional sampling relative to basic MCMC. The systematic, gradient-driven trajectories allow HMC to scale more gracefully with dimensionality and to tackle classically intractable Bayesian models.

5. Summary and Theoretical Guarantees

The symplectic geometric underpinnings of Hamiltonian dynamics provide a rigorous constructive basis for HMC. The resulting Markov transition kernel simultaneously:

  • Preserves probability measure (Liouville’s theorem)
  • Ensures reversibility and detailed balance (Hamilton’s equations, momentum reflections)
  • Supports a general Hamiltonian formulation H(q, p) = T(q, p) + V(q) + \text{const}, commonly with

T(q, p) = \frac{1}{2} \sum_{i,j} p_i p_j \Lambda^{ij}(q) - \frac{1}{2} \log |\Lambda(q)|, \quad V(q) = -\log \pi(q)

When instantiated for Bayesian inference, this design allows efficient sampling by following “geodesic” paths in the parameter space, and the freedom to adapt or choose \Lambda(q) offers direct control over sampler efficiency and accuracy.
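The effect of choosing \Lambda can be seen in the simplest tractable case, which is an assumption for illustration, not from the source: a Gaussian target \pi(q) = \mathcal{N}(0, \Sigma) with a constant kinetic precision \Lambda. Hamilton's equations then give \ddot{q} = -\Lambda \Sigma^{-1} q, so the dynamics oscillate with frequencies \sqrt{\mathrm{eig}(\Lambda \Sigma^{-1})}; choosing \Lambda = \Sigma equalizes all frequencies, so a single step size suits every direction:

```python
import numpy as np

def oscillation_frequencies(Lam, Sigma):
    """For V(q) = q^T inv(Sigma) q / 2 and T = p^T Lam p / 2,
    q_ddot = -Lam @ inv(Sigma) @ q, so the normal-mode frequencies
    are the square roots of the eigenvalues of Lam @ inv(Sigma)."""
    w = np.linalg.eigvals(Lam @ np.linalg.inv(Sigma))
    return np.sqrt(w.real)

# Illustrative, strongly correlated target covariance.
Sigma = np.array([[4.0, 1.9],
                  [1.9, 1.0]])

# Identity Lam: frequencies spread widely, forcing the step size
# to track the fastest mode while the slowest mixes poorly.
freqs_identity = oscillation_frequencies(np.eye(2), Sigma)

# Lam = Sigma: Lam @ inv(Sigma) = I, so every frequency equals 1.
freqs_tuned = oscillation_frequencies(Sigma, Sigma)
```

This is the constant-metric ("Euclidean") special case; the position-dependent \Lambda(q) of the general formulation extends the same idea to local curvature.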

In conclusion, the symplectic geometric formulation of HMC encodes the necessary structure for accurate and efficient MCMC in complex inference scenarios, combining the invariance, reversibility, and volume-preserving properties critical for high-fidelity, scalable sampling (Betancourt et al., 2011).
