
Hamiltonian Monte Carlo Sampling

Updated 27 November 2025
  • Hamiltonian Monte Carlo Sampling is a method that employs Hamiltonian dynamics to generate samples from complex probability distributions in high-dimensional spaces.
  • It discretizes the continuous dynamics with a reversible, volume-preserving leapfrog integrator; a Metropolis acceptance step then ensures detailed balance.
  • Advanced variants integrate Riemannian geometry, neural network approximations, and subsampling techniques to enhance efficiency and handle challenging posterior landscapes.

Hamiltonian Monte Carlo Sampling is a Markov chain Monte Carlo (MCMC) framework that generates samples from a target probability distribution by simulating Hamiltonian dynamics in an augmented phase space. This technique exploits the physical analogy between statistical sampling and the motion of a particle in a conservative system, allowing for coherent traversal of complex, high-dimensional parameter spaces. Hamiltonian Monte Carlo (HMC) algorithms have become foundational in Bayesian inference and machine learning, particularly for problems with intricate posterior geometry, strong correlations, or severe multimodality.

1. Standard Hamiltonian Monte Carlo: Formulation and Properties

In canonical HMC, the target distribution over parameters $\theta \in \mathbb{R}^d$ is written as $\pi(\theta) \propto \exp(-U(\theta))$, with $U(\theta)$ the potential energy, typically the negative log-posterior or unnormalized log-density. An auxiliary momentum variable $p \in \mathbb{R}^d$ is introduced, leading to the joint density $\pi(\theta, p) \propto \exp(-H(\theta, p))$ with Hamiltonian $H(\theta, p) = U(\theta) + K(p)$; the kinetic energy $K(p) = \frac{1}{2}p^\top M^{-1}p$ uses a user-specified mass matrix $M \succ 0$.

Hamilton's equations,

$$\frac{d\theta}{dt} = M^{-1}p, \qquad \frac{dp}{dt} = -\nabla_\theta U(\theta),$$

describe deterministic motion in phase space. The leapfrog (Störmer–Verlet) integrator discretizes these dynamics:

$$\begin{aligned} p_{n+\frac12} &= p_n - \frac{\epsilon}{2}\nabla U(\theta_n), \\ \theta_{n+1} &= \theta_n + \epsilon M^{-1}p_{n+\frac12}, \\ p_{n+1} &= p_{n+\frac12} - \frac{\epsilon}{2}\nabla U(\theta_{n+1}), \end{aligned}$$

where $\epsilon$ is the step size. The proposal $(\theta', p')$ at the end of each trajectory is accepted with probability $\min\big(1, \exp[H(\theta, p) - H(\theta', p')]\big)$. This mechanism ensures detailed balance and preservation of the target density (Vishnoi, 2021).
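
For concreteness, a minimal sketch of one HMC transition is shown below (Python/NumPy, identity mass matrix $M = I$ for brevity; `U` and `grad_U` are assumed to be user-supplied callables, not part of any particular library):

```python
import numpy as np

def leapfrog(theta, p, grad_U, eps, L):
    """Integrate Hamilton's equations for L leapfrog steps of size eps (M = I)."""
    p = p - 0.5 * eps * grad_U(theta)      # initial momentum half-step
    for _ in range(L - 1):
        theta = theta + eps * p            # full position step (M^{-1} p = p)
        p = p - eps * grad_U(theta)        # full momentum step
    theta = theta + eps * p                # final position step
    p = p - 0.5 * eps * grad_U(theta)      # final momentum half-step
    return theta, p

def hmc_step(theta, U, grad_U, eps, L, rng):
    p0 = rng.standard_normal(theta.shape)          # resample p ~ N(0, I)
    theta_prop, p_prop = leapfrog(theta, p0, grad_U, eps, L)
    H0 = U(theta) + 0.5 * p0 @ p0                  # current total energy
    H1 = U(theta_prop) + 0.5 * p_prop @ p_prop     # proposed total energy
    if rng.random() < np.exp(H0 - H1):             # min(1, e^{H0 - H1}) test
        return theta_prop                          # accept proposal
    return theta                                   # reject: keep current state
```

For a standard Gaussian target, for example, `U = lambda th: 0.5 * th @ th` and `grad_U = lambda th: th` suffice to run the sampler.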

Symplectic integrators are pivotal: they preserve phase-space volume (Liouville's theorem) and are reversible up to a momentum flip. Combined with the Metropolis correction, these properties guarantee that the chain targets the exact distribution, with no bias from finite step size (Lelièvre et al., 2023).

2. Advanced Hamiltonian Flows: Riemannian and Non-Separable Formulations

For targets with non-Euclidean geometry, the kinetic energy can be generalized to use a position-dependent metric tensor $G(\theta)$, leading to Riemannian Manifold HMC (RMHMC). The Hamiltonian becomes

$$H(\theta, p) = U(\theta) + \frac{1}{2}\log\det G(\theta) + \frac{1}{2}p^\top G^{-1}(\theta)\,p,$$

and the equations of motion adapt to manifold curvature. Implicit or semi-explicit generalized leapfrog integrators (requiring fixed-point iterations in the momentum update) are employed (0907.1100). For generic non-separable Hamiltonians,

$$H(q, p) = V(q) + \frac{1}{2}\, p^\top D(q)\, p,$$

symplectic implicit integrators such as the implicit midpoint or generalized Störmer–Verlet rules are necessary. However, solver branches may break reversibility, necessitating explicit reversibility checks at each step to maintain unbiasedness (Lelièvre et al., 2023).
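
A hedged sketch of one such construction is given below: an implicit midpoint step solved by fixed-point iteration, followed by an explicit reversibility check. Function names, tolerances, and iteration counts are illustrative assumptions, not the exact scheme of Lelièvre et al.:

```python
import numpy as np

def implicit_midpoint(q, p, grad_H, eps, n_fixed=50, tol=1e-12):
    """One implicit midpoint step. grad_H(q, p) returns (dH/dq, dH/dp).
    Solved by fixed-point iteration; returns (q', p', converged flag)."""
    q_new, p_new = q.copy(), p.copy()
    for _ in range(n_fixed):
        dHq, dHp = grad_H(0.5 * (q + q_new), 0.5 * (p + p_new))
        q_next = q + eps * dHp                 # dq/dt =  dH/dp at the midpoint
        p_next = p - eps * dHq                 # dp/dt = -dH/dq at the midpoint
        delta = max(np.max(np.abs(q_next - q_new)),
                    np.max(np.abs(p_next - p_new)))
        q_new, p_new = q_next, p_next
        if delta < tol:
            return q_new, p_new, True
    return q_new, p_new, False                 # solver failed to converge

def reversible_step(q, p, grad_H, eps):
    """Step forward, then verify the momentum-flipped step returns to the
    start (within tolerance); proposals failing the check are rejected."""
    q1, p1, ok = implicit_midpoint(q, p, grad_H, eps)
    if not ok:
        return q, p, False
    q2, p2, ok = implicit_midpoint(q1, -p1, grad_H, eps)
    reversible = ok and np.allclose(q2, q) and np.allclose(p2, -p)
    return q1, p1, reversible
```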

3. Computational Optimizations for Large-Scale and Expensive Gradients

Standard HMC is computationally intensive for large datasets, as each gradient $\nabla U(\theta)$ often requires summing terms over all $N$ datapoints. Two principal strategies have emerged:

Neural Network Gradient HMC (NNgHMC): The gradient $\nabla U(\theta)$ is approximated by a shallow feed-forward neural network trained on a finite set of exact gradients collected during burn-in. Subsequent trajectories use the fast neural approximation, coupled with the Metropolis correction for exactness. For $N \gg 10^4$ and $d \ge 20$, speed-ups of $3$–$40\times$ over baseline HMC are typical; the limiting factors are network capacity and gradient error, which affect acceptance rates (Li et al., 2017).
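
The idea can be sketched as follows, with scikit-learn's `MLPRegressor` as a stand-in for the paper's shallow network; the architecture and hyperparameters are illustrative assumptions, not those of Li et al.:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_gradient_surrogate(thetas, grads, hidden=64):
    """thetas: (n, d) burn-in samples; grads: (n, d) exact gradients.
    Fits a shallow feed-forward map theta -> grad U(theta)."""
    net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000)
    net.fit(thetas, grads)                     # multi-output regression
    return lambda theta: net.predict(theta[None, :])[0]

# Usage: run exact-gradient HMC during burn-in, record (theta, grad) pairs,
# then pass the cheap surrogate to the leapfrog integrator while keeping the
# exact U(theta) in the Metropolis test, so the chain still targets pi exactly.
```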

Energy-Conserving Subsampling HMC (HMC-ECS): An unbiased estimator of the log-likelihood and its gradient is computed on a random subsample $u$ of size $m \ll N$, with control variates for variance reduction. Each leapfrog trajectory is computed using the same subsample, preserving invariance of a modified Hamiltonian and enabling high acceptance rates even for $N \sim 10^7$ and $m/N \leq 0.1\%$. Empirically, HMC-ECS achieves acceptance rates $> 95\%$ and matching posterior estimates at $10\times$ or greater efficiency over stochastic-gradient alternatives (Dang et al., 2017).
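
A minimal sketch of such an estimator, using a difference-style control variate around a fixed reference point $\theta^\star$ (the specific control variates in HMC-ECS may differ), might look like:

```python
import numpy as np

def subsampled_grad(theta, grad_i, N, m, theta_star, grad_star_sum,
                    rng=np.random.default_rng()):
    """Unbiased gradient estimator. grad_i(theta, i) is the i-th datapoint's
    gradient term; grad_star_sum = sum_i grad_i(theta_star, i), precomputed
    once at the reference point theta_star."""
    u = rng.integers(0, N, size=m)             # random subsample of size m
    # full-data gradient at theta_star plus an unbiased correction term
    # estimated on the subsample; variance shrinks as theta -> theta_star
    corr = sum(grad_i(theta, i) - grad_i(theta_star, i) for i in u)
    return grad_star_sum + (N / m) * corr
```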

4. Extensions to Non-Smooth, Multi-Modal, and Structured Spaces

Non-Smooth Energy Models: Sampling from distributions with non-differentiable potentials (e.g., $\ell_1$ regularization) is achieved via a proximal leapfrog update; gradients in the momentum update are replaced by proximity-operator evaluations. This approach maintains reversibility, volume preservation, and robust sampling for, e.g., sparse recovery in imaging (Chaari et al., 2014).
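
As an illustration, the non-smooth $\ell_1$ term can be handled through its proximity operator (soft-thresholding). The sketch below uses the gradient of the Moreau envelope as the smoothed direction, which is one standard choice and not necessarily the exact update of Chaari et al.:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximity operator of t * ||.||_1 (componentwise soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def moreau_grad_l1(theta, lam, mu):
    """Gradient of the Moreau envelope of lam * ||.||_1 with parameter mu:
    (theta - prox_{mu * lam * ||.||_1}(theta)) / mu."""
    return (theta - soft_threshold(theta, mu * lam)) / mu

def momentum_half_step(theta, p, grad_f, lam, eps, mu=1e-2):
    """Half momentum step for U = f + lam * ||.||_1: the smooth part uses its
    gradient, the l1 term its Moreau-envelope gradient."""
    return p - 0.5 * eps * (grad_f(theta) + moreau_grad_l1(theta, lam, mu))
```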

Sampling in Discrete/Combinatorial Spaces: Probabilistic Path HMC (PPHMC) extends HMC to spaces such as orthant complexes (collections of glued Euclidean orthants) arising in phylogenetics and categorical models. A stochastic leapfrog ("leap-prog") procedure traverses boundaries between orthants, permitting reflection and random re-selection among adjacent combinatorial states. Surrogate smoothing functions are employed to handle discontinuities and improve mixing (Dinh et al., 2017).

Tempered/Multimodal Targets: Continuously-tempered HMC introduces a temperature variable $\beta \in [0, 1]$ with its own momentum $v$, bridging a base potential and the target potential. Extended Hamiltonian dynamics allow smooth interpolation between the two, boosting mixing across isolated modes and enabling unbiased normalizing-constant estimation (Graham et al., 2017). Variational and quantum-inspired extensions further facilitate barrier crossing in multimodal distributions via additional stochasticity in mass parameters or variational jumps (Liu et al., 2019, Gu et al., 2019).
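
The bridging construction can be illustrated with a simple interpolated potential; only the interpolation is shown here, whereas the extended dynamics of Graham et al. also evolve $\beta$ and its momentum $v$:

```python
def tempered_potential(theta, beta, U0, U1):
    """Bridge potential U_beta = (1 - beta) * U0 + beta * U1, recovering the
    tractable base distribution at beta = 0 and the target at beta = 1."""
    return (1.0 - beta) * U0(theta) + beta * U1(theta)

def tempered_grad(theta, beta, grad_U0, grad_U1):
    """Position gradient of the bridge potential at fixed beta."""
    return (1.0 - beta) * grad_U0(theta) + beta * grad_U1(theta)
```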

5. Irreversible and Entropy-Adaptive Dynamics

Irreversible Algorithms: Modified HMC variants, such as Mix & Match HMC (MMHMC), supplant the standard Hamiltonian in the Metropolis test with higher-order "shadow" Hamiltonians and utilize partial momentum refreshment, leading to provably irreversible chains. Empirically, MMHMC attains $2$–$29\times$ speed-ups in time-normalized ESS over HMC, Generalized HMC, and RMHMC, especially in high dimensions (Radivojević et al., 2017). Hamiltonian Assisted Metropolis Sampling (HAMS) constructs generalized Metropolis updates with a momentum-augmented proposal, retaining rejection-free properties for Gaussian targets and yielding $10$–$50\times$ better ESS per gradient than HMC for a wide variety of statistical models (Song et al., 2020).
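
Partial momentum refreshment, one ingredient of these schemes, admits a compact sketch; the mixing parameterization below is a common convention and not necessarily MMHMC's exact update:

```python
import numpy as np

def partial_refresh(p, phi, rng=np.random.default_rng()):
    """Mix the current momentum with fresh Gaussian noise. Leaves N(0, I)
    invariant for any phi in [0, 1]: phi = 1 gives full refreshment (plain
    HMC), small phi preserves direction between successive trajectories."""
    xi = rng.standard_normal(p.shape)
    return np.sqrt(1.0 - phi**2) * p + phi * xi
```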

Entropy-Based Adaptive HMC: Proposal entropy, approximated via log-determinant Jacobian measurements of the leapfrog map, is maximized to adapt the mass matrix in high dimensions, outperforming ESJD-based heuristics. This approach ensures joint exploration and superior mixing in poorly conditioned or anisotropic targets, matching or surpassing the mixing efficiency of RMHMC (Hirt et al., 2021).

6. Specialized Implementations and Empirical Performance

Field Sampling and Hierarchical Models: The HMCF framework implements HMC for high-dimensional random fields, integrating adaptive step-size, mass-matrix estimation, higher-order symplectic integrators, and direct joint sampling of hierarchical hyperparameters. Empirical performance scales robustly to spatially correlated inference tasks (Lienhard et al., 2018).

Challenges in State-Space and Latent Variable Models: Particle HMC (PHMC) integrates Sequential Monte Carlo (SMC) methods into each leapfrog step to estimate marginal likelihoods and log-posterior gradients for state-space models, obviating costly gradient calculations with respect to latent paths. PHMC outperforms particle Metropolis-Hastings with random-walk proposals in high-dimensional systems with intractable likelihoods (Amri et al., 14 Apr 2025).

Quantum-Inspired Techniques: Hamiltonian systems with random mass matrices (QHMC) encourage exploration of posteriors with sharp ridges or multi-modality, motivated by the energy-time uncertainty relation. QSGNHT further adapts this framework to stochastic gradient settings with thermostat-type variables, achieving superior mixing stability and rapid exploration in spiky or ill-conditioned models (Liu et al., 2019).
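
The random-mass mechanism can be sketched as follows, with a diagonal log-normal mass distribution as one illustrative choice from the QHMC line of work; the parameters here are assumptions:

```python
import numpy as np

def sample_mass(d, mu=0.0, sigma=1.0, rng=np.random.default_rng()):
    """Draw a diagonal mass matrix with i.i.d. log-normal entries; heavy
    masses probe sharp ridges, light masses enable long-range moves."""
    return np.exp(mu + sigma * rng.standard_normal(d))

# Per trajectory: m = sample_mass(d); p = rng.standard_normal(d) * np.sqrt(m)
# draws p ~ N(0, diag(m)); the leapfrog position update then uses
# M^{-1} p = p / m in place of the fixed-mass update.
```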

7. Practical Guidelines and Trade-offs

Selecting HMC or its advanced variations requires careful consideration of model structure, dimensionality, posterior curvature, computational hardware constraints, and expected mixing behavior. Leapfrog step size and trajectory length must be tuned for acceptance rates (typically $\sim 0.65$–$0.95$). For large $N$ or $d$, techniques such as NNgHMC and HMC-ECS produce substantial efficiency gains once training or variance-calibration costs are amortized. Extensions such as RMHMC, MMHMC, entropy-adaptive HMC, and PHMC are recommended where local geometry, irreducibility, or latent-variable complexity dominate performance bottlenecks (Li et al., 2017, Dang et al., 2017, 0907.1100, Amri et al., 14 Apr 2025, Hirt et al., 2021, Radivojević et al., 2017, Song et al., 2020, Liu et al., 2019).
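
As a simple illustration of step-size tuning, a Robbins–Monro-style warmup rule that drives the chain toward a target acceptance rate might look like the following; production samplers such as Stan use dual averaging, which is more robust, and the constants here are assumptions:

```python
import numpy as np

def adapt_step_size(eps, accept_prob, target=0.8, kappa=0.05):
    """Nudge eps up when accepting more often than the target rate and down
    when accepting less often; apply only during warmup, then freeze eps."""
    return eps * np.exp(kappa * (accept_prob - target))
```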

