Hamiltonian Monte Carlo (HMC) Method

Updated 16 August 2025
  • Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo method that augments the parameter space with momentum variables to enable efficient sampling from high-dimensional distributions.
  • It uses a symplectic geometric framework to construct reversible, volume-preserving transitions that enable coherent global exploration and reduced autocorrelation.
  • By incorporating local metric adaptation and energy conservation, HMC achieves robust Bayesian inference and practical efficiency gains in challenging sampling scenarios.

Hamiltonian Monte Carlo (HMC) is a Markov Chain Monte Carlo (MCMC) method that leverages Hamiltonian dynamics derived from symplectic geometry to construct efficient, reversible, and measure-preserving Markov transition kernels for sampling from complex probability distributions. HMC is characterized by its augmentation of the parameter space with auxiliary momentum variables, enabling deterministic and volume-preserving exploration of the extended phase space via Hamilton’s equations. This geometric construction allows HMC to propose long-range, coherent moves aligned with the structure of the target distribution, resulting in rapid mixing and reduced autocorrelation compared to random-walk-based MCMC methods. The mathematical framework primes HMC for rigorous Bayesian inference in high-dimensional, highly correlated, or multimodal settings, providing both theoretical guarantees and practical efficiency gains.

1. Symplectic Geometry and Hamiltonian Dynamics

At the foundational level, HMC operates on the cotangent bundle of the parameter manifold, introducing position coordinates q^i and conjugate momenta p_i. The symplectic structure is defined by the symplectic form

\omega = d\theta = \sum_{i=1}^n dq^i \wedge dp_i

where \theta = -\sum_{i=1}^n p_i\,dq^i is the tautological one-form. Given a smooth Hamiltonian function H(q, p), Hamilton’s equations of motion arise from the identification of a vector field X_H such that

\omega(X_H, \cdot) = dH(\cdot)

resulting in the familiar system

\dot{q}^i = +\frac{\partial H}{\partial p_i}, \quad \dot{p}_i = -\frac{\partial H}{\partial q^i}

Liouville’s theorem ensures the flow conserves the phase-space volume form \Omega = \omega^n: the Hamiltonian vector field is divergence-free, since \partial \dot{q}^i / \partial q^i + \partial \dot{p}_i / \partial p_i = \partial^2 H / \partial q^i \partial p_i - \partial^2 H / \partial p_i \partial q^i = 0. This volume preservation is vital for the invariance of probability measures under the dynamics.
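These properties can be illustrated numerically with the leapfrog (Störmer–Verlet) scheme, the standard symplectic discretization of Hamilton’s equations used in HMC. The sketch below is illustrative and not from the source; the harmonic-oscillator potential V(q) = q²/2 and unit mass are assumptions chosen so the behavior is easy to check.

```python
import numpy as np

def leapfrog(q, p, grad_V, step, n_steps):
    """Symplectic leapfrog integration of Hamilton's equations for
    H(q, p) = V(q) + p^2 / 2 (unit mass): half momentum kick,
    alternating full position/momentum steps, final half kick."""
    q, p = np.copy(q), np.copy(p)
    p -= 0.5 * step * grad_V(q)
    for _ in range(n_steps - 1):
        q += step * p
        p -= step * grad_V(q)
    q += step * p
    p -= 0.5 * step * grad_V(q)
    return q, p

# Illustrative Hamiltonian: harmonic oscillator, V(q) = q^2 / 2.
grad_V = lambda q: q
H = lambda q, p: 0.5 * q @ q + 0.5 * p @ p

q0, p0 = np.array([1.0]), np.array([0.0])
q1, p1 = leapfrog(q0, p0, grad_V, step=0.01, n_steps=1000)

# Energy is conserved up to O(step^2) discretization error.
energy_error = abs(H(q1, p1) - H(q0, p0))

# Reversibility: flipping the momentum and integrating again
# returns (almost exactly) to the starting point.
q2, p2 = leapfrog(q1, -p1, grad_V, step=0.01, n_steps=1000)
```

The momentum flip in the reversibility check is exactly the operation the Markov-kernel construction below relies on.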

2. Construction of the Markov Transition Kernel

HMC constructs a joint target measure

\pi(q, p) \propto \exp[-H(q, p)]\,\Omega

The exact Hamiltonian flow preserves both H and \Omega, delivering a deterministic, reversible evolution. Reversibility and detailed balance are enforced via a momentum flip at the trajectory endpoint, typically paired with a Metropolis–Hastings acceptance step that corrects for the discretization error of the numerical integrator. This mechanism ensures that transitions are measure-preserving and avoid the diffusive exploration associated with random-walk proposals, achieving efficient global movement through the target distribution while maintaining the necessary stationarity and ergodicity properties.
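A minimal sketch of this transition kernel, assuming the common special case of an identity mass matrix (so T(q, p) = p·p/2) and a leapfrog-discretized flow; the target here, a 2-d standard normal, is an illustrative choice, not from the source:

```python
import numpy as np

rng = np.random.default_rng(0)

def hmc_step(q, log_pi, grad_log_pi, step, n_steps):
    """One HMC transition: momentum refresh, leapfrog flow,
    momentum flip, Metropolis-Hastings correction.
    Assumes H(q, p) = -log pi(q) + p.p / 2 (identity mass matrix)."""
    p = rng.standard_normal(q.shape)            # refresh p ~ N(0, I)
    H0 = -log_pi(q) + 0.5 * p @ p               # initial energy
    q_new, p_new = np.copy(q), np.copy(p)
    p_new += 0.5 * step * grad_log_pi(q_new)    # leapfrog trajectory
    for _ in range(n_steps - 1):
        q_new += step * p_new
        p_new += step * grad_log_pi(q_new)
    q_new += step * p_new
    p_new += 0.5 * step * grad_log_pi(q_new)
    p_new = -p_new                              # flip: makes proposal reversible
    H1 = -log_pi(q_new) + 0.5 * p_new @ p_new
    # Accept with prob min(1, exp(H0 - H1)): corrects discretization error.
    if rng.random() < np.exp(H0 - H1):
        return q_new
    return q

# Usage: sample a 2-d standard normal target.
log_pi = lambda q: -0.5 * q @ q
grad_log_pi = lambda q: -q
q = np.zeros(2)
samples = []
for _ in range(2000):
    q = hmc_step(q, log_pi, grad_log_pi, step=0.2, n_steps=10)
    samples.append(q)
samples = np.asarray(samples)
```

Because exact Hamiltonian flow would conserve H, the acceptance probability is near 1 for a well-tuned step size; the accept/reject step only removes the bias introduced by discretization.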

3. General Form and Role of the Hamiltonian

The admissible class of Hamiltonians is determined by symplectic invariance and Markov chain requirements. The general form is

H(q, p) = -\log \pi(q, p) + C

For practically relevant cases where the extended target factorizes into a marginal for positions and a conditional for momenta,

\pi(q, p) = \pi(q)\,\pi(p \mid q)

the Hamiltonian decomposes as

H(q, p) = T(q, p) + V(q) + C

with V(q) = -\log \pi(q) and T(q, p) = -\log \pi(p \mid q). Most implementations use a Gaussian conditional for momenta, in which case T is quadratic in p, up to a log-determinant normalization that accounts for the covariance’s dependence on q and its transformation under reparametrization:

T(q, p) = \frac{1}{2} \sum_{i,j} p_i p_j \Lambda^{ij}(q) - \frac{1}{2} \log |\Lambda(q)|

where \pi(p \mid q) = \mathcal{N}(p \mid 0, \Lambda^{-1}(q)). Key conditions such as T(q, -p) = T(q, p) guarantee reversibility, while proper normalization secures valid Markov kernel behavior.
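A sketch of this Gaussian kinetic energy, with Λ passed in already evaluated at the current q (the particular precision matrix below is illustrative, not from the source). Momenta with covariance Λ⁻¹ are drawn through the Cholesky factor of Λ, avoiding an explicit matrix inverse:

```python
import numpy as np

def kinetic_energy(p, Lam):
    """T(q, p) = 1/2 p^T Lam p - 1/2 log|Lam| for p ~ N(0, Lam^{-1}).
    Lam may depend on q; here it is assumed already evaluated."""
    sign, logdet = np.linalg.slogdet(Lam)
    return 0.5 * p @ Lam @ p - 0.5 * logdet

def sample_momentum(Lam, rng):
    """Draw p ~ N(0, Lam^{-1}): if Lam = L L^T (Cholesky), then
    p = L^{-T} z with z ~ N(0, I) has covariance (L L^T)^{-1} = Lam^{-1}."""
    L = np.linalg.cholesky(Lam)
    z = rng.standard_normal(Lam.shape[0])
    return np.linalg.solve(L.T, z)

rng = np.random.default_rng(1)
Lam = np.array([[2.0, 0.5],        # illustrative precision matrix
                [0.5, 1.0]])
p = sample_momentum(Lam, rng)

# Reversibility condition: T is even in p.
symmetric = np.isclose(kinetic_energy(p, Lam), kinetic_energy(-p, Lam))

# Empirical covariance of many draws approaches Lam^{-1}.
P = np.stack([sample_momentum(Lam, rng) for _ in range(20000)])
emp_cov = np.cov(P.T)
```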

4. Bayesian Inference and Practical Efficiency

In Bayesian computation, the role of HMC is to sample from posterior distributions \pi(q) that may exhibit challenging features—strong correlations, high dimensionality, or multiple modes. The geometric perspective yields several advantages:

  • Geometry-aware Dynamics: Hamiltonian flows align with isocontours of the (negative log) target density, enabling rapid, non-diffusive exploration across high-probability regions.
  • Local Metric Adaptation: The kinetic energy's covariance structure \Lambda(q), interpretable as a Riemannian metric, can be adapted to the local curvature of the posterior, facilitating proposals that are resilient to parameter scaling and non-Euclidean geometry.
  • Volume and Energy Conservation: By conserving the phase volume and (approximately) the Hamiltonian, the kernel achieves detailed balance and robust stationary convergence. Any numerical violations are corrected by the accept/reject step.

These properties produce Markov chains with substantially reduced autocorrelation and increased efficiency in high-dimensional sampling relative to basic MCMC. The systematic, gradient-driven trajectories allow HMC to scale more gracefully with dimensionality and to tackle classically intractable Bayesian models.

5. Summary and Theoretical Guarantees

The symplectic geometric underpinnings of Hamiltonian dynamics provide a rigorous constructive basis for HMC. The resulting Markov transition kernel simultaneously:

  • Preserves probability measure (Liouville’s theorem)
  • Ensures reversibility and detailed balance (Hamilton’s equations, momentum reflections)
  • Supports a general Hamiltonian formulation H(q, p) = T(q, p) + V(q) + \text{const}, commonly with

T(q, p) = \frac{1}{2} \sum_{i,j} p_i p_j \Lambda^{ij}(q) - \frac{1}{2} \log |\Lambda(q)|, \quad V(q) = -\log \pi(q)

When instantiated for Bayesian inference, this design allows efficient sampling by following “geodesic” paths in the parameter space, and the freedom to adapt or choose \Lambda(q) offers direct control over sampler efficiency and accuracy.
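The effect of choosing \Lambda can be seen in the simplest tractable case, which is an assumption for illustration, not from the source: a Gaussian target \pi(q) = \mathcal{N}(0, \Sigma) with a constant kinetic precision \Lambda. Hamilton's equations then give \ddot{q} = -\Lambda \Sigma^{-1} q, so the dynamics oscillate with frequencies \sqrt{\mathrm{eig}(\Lambda \Sigma^{-1})}; choosing \Lambda = \Sigma equalizes all frequencies, so a single step size suits every direction:

```python
import numpy as np

def oscillation_frequencies(Lam, Sigma):
    """For V(q) = q^T inv(Sigma) q / 2 and T = p^T Lam p / 2,
    q_ddot = -Lam @ inv(Sigma) @ q, so the normal-mode frequencies
    are the square roots of the eigenvalues of Lam @ inv(Sigma)."""
    w = np.linalg.eigvals(Lam @ np.linalg.inv(Sigma))
    return np.sqrt(w.real)

# Illustrative, strongly correlated target covariance.
Sigma = np.array([[4.0, 1.9],
                  [1.9, 1.0]])

# Identity Lam: frequencies spread widely, forcing the step size
# to track the fastest mode while the slowest mixes poorly.
freqs_identity = oscillation_frequencies(np.eye(2), Sigma)

# Lam = Sigma: Lam @ inv(Sigma) = I, so every frequency equals 1.
freqs_tuned = oscillation_frequencies(Sigma, Sigma)
```

This is the constant-metric ("Euclidean") special case; the position-dependent \Lambda(q) of the general formulation extends the same idea to local curvature.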

In conclusion, the symplectic geometric formulation of HMC encodes the necessary structure for accurate and efficient MCMC in complex inference scenarios, combining the invariance, reversibility, and volume-preserving properties critical for high-fidelity, scalable sampling (Betancourt et al., 2011).
