
Riemannian-Manifold HMC

Updated 11 November 2025
  • RMHMC is a Markov Chain Monte Carlo method that utilizes a position-dependent metric, derived from the Fisher information or Hessian, to adapt to local curvature.
  • It integrates geometric insights into Hamiltonian dynamics by adjusting proposal trajectories, thereby reducing random-walk behavior in complex parameter spaces.
  • The approach employs a semi-implicit symplectic integrator to accurately simulate nonseparable Hamiltonians, offering superior effective sample sizes despite higher computational costs.

Riemannian-Manifold Hamiltonian Monte Carlo (RMHMC) is a Markov Chain Monte Carlo (MCMC) methodology that generalizes the Hybrid/Hamiltonian Monte Carlo (HMC) paradigm to exploit the local Riemannian geometry of the target distribution’s parameter space. RMHMC equips the parameter manifold with a smoothly varying metric tensor, typically derived from the Fisher information or the negative Hessian of the log-posterior, enabling proposal trajectories that are automatically adapted to local curvature. This approach enhances sampling efficiency, particularly in high-dimensional and highly correlated posteriors, by circumventing the need for costly pilot runs to calibrate proposal scales and by reducing random-walk behavior.

1. Mathematical Formulation on a Riemannian Manifold

Let $\theta \in \mathbb{R}^D$ denote the parameter vector of interest, with target posterior density $\pi(\theta) \propto L(\theta)\,\pi_0(\theta)$, where $L$ is the likelihood and $\pi_0$ the prior. The parameter space is endowed with a position-dependent, symmetric positive-definite metric tensor $G(\theta) \in \mathbb{R}^{D\times D}$. A momentum variable $p \in \mathbb{R}^D$, conditionally distributed as $p \,|\, \theta \sim \mathcal{N}(0, G(\theta))$, augments the parameter space, yielding an extended Hamiltonian

$$H(\theta, p) = U(\theta) + K(\theta, p)$$

with

$$U(\theta) = -\log\pi(\theta) + \frac{1}{2}\log\det G(\theta), \qquad K(\theta, p) = \frac{1}{2}\, p^\top G(\theta)^{-1} p.$$

This Hamiltonian structure ensures that the marginal distribution over $\theta$ remains $\pi(\theta)$ under the dynamics, due to the volume correction $+\frac{1}{2}\log\det G$.
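To make the role of the correction explicit, marginalizing the joint density $\propto e^{-H(\theta,p)}$ over the momentum recovers the target exactly:

$$\int e^{-H(\theta,p)}\,dp = \pi(\theta)\,\det G(\theta)^{-1/2} \int \exp\!\left(-\tfrac{1}{2}\, p^\top G(\theta)^{-1} p\right) dp = (2\pi)^{D/2}\,\pi(\theta),$$

since the Gaussian integral contributes $(2\pi)^{D/2} \det G(\theta)^{1/2}$, cancelling the $\det G(\theta)^{-1/2}$ factor produced by the $\frac{1}{2}\log\det G$ term in $U(\theta)$.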

2. Riemannian Manifold Hamiltonian Dynamics

The dynamics evolve according to the Riemannian generalization of Hamilton's equations:

$$\dot\theta = \frac{\partial H}{\partial p} = G(\theta)^{-1} p,$$

$$\dot p = -\frac{\partial H}{\partial\theta} = \nabla_\theta \log\pi(\theta) - \frac{1}{2}\nabla_\theta \log\det G(\theta) - \frac{1}{2}\, p^\top \nabla_\theta\!\left[G(\theta)^{-1}\right] p,$$

where the last term is read componentwise and, via $\partial_{\theta_i} G^{-1} = -G^{-1} (\partial_{\theta_i} G)\, G^{-1}$, equals $+\frac{1}{2}\, p^\top G^{-1} (\partial_{\theta_i} G)\, G^{-1} p$.

The first two terms ($\nabla_\theta\log\pi(\theta)$ and $-\frac{1}{2}\nabla_\theta\log\det G(\theta)$) correspond to the natural gradient and the volume correction, while the final term accounts for the metric's local variation and is a contraction that, for Fisher- or Hessian-based metrics, involves third-order derivatives of the log-density. The quadratic (kinetic) part of the Hamiltonian alone would generate geodesic flow under the metric $G(\theta)$; the potential term bends these trajectories toward high-posterior regions.
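For concreteness, here is a minimal NumPy sketch of the momentum drift $-\partial H/\partial\theta$ implied by these equations. The callables grad_log_pi, G_fn, and dG_fn are hypothetical model-supplied functions (e.g. obtained via automatic differentiation), not part of any fixed library API:

```python
import numpy as np

def neg_dH_dtheta(theta, p, grad_log_pi, G_fn, dG_fn):
    """Momentum drift -dH/dtheta for the RMHMC Hamiltonian (sketch).

    Assumed callables:
      grad_log_pi(theta) -> (D,)      gradient of log pi(theta)
      G_fn(theta)        -> (D, D)    metric tensor
      dG_fn(theta)       -> (D, D, D) with dG_fn(theta)[i] = dG/dtheta_i
    """
    G = G_fn(theta)
    dG = dG_fn(theta)
    Ginv = np.linalg.inv(G)
    Ginv_p = Ginv @ p
    drift = np.array(grad_log_pi(theta), dtype=float)
    for i in range(theta.size):
        drift[i] -= 0.5 * np.trace(Ginv @ dG[i])   # -1/2 d(log det G)/dtheta_i
        drift[i] += 0.5 * Ginv_p @ dG[i] @ Ginv_p  # kinetic-term contribution
    return drift
```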

3. Metric Tensor Choices and Geometric Adaptivity

Typical metric choices:

  • Expected Fisher information: $G(\theta) = -\mathbb{E}_{y}\!\left[\nabla^2_\theta \log L(y\,|\,\theta)\right]$.
  • Observed information (regularized): $G(\theta) = -\nabla^2_\theta \log L(\theta) + \epsilon I$, with small $\epsilon > 0$ to ensure positive definiteness.

The metric tensor $G(\theta)$ rescales the local geometry: directions with large eigenvalues (high curvature) receive smaller effective steps, enhancing stability and efficiency in regions of strong anisotropy, while flat directions admit larger steps that promote rapid exploration. The approach removes the need for the global scaling heuristics and pilot adaptation runs of traditional HMC.
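A minimal sketch of the regularized observed-information choice, assuming a model-supplied Hessian callable hess_log_L (the name is illustrative); the eigenvalue floor is an extra safeguard beyond the plain $\epsilon I$ jitter:

```python
import numpy as np

def observed_info_metric(hess_log_L, theta, eps=1e-6):
    """Regularized observed-information metric: G = -Hessian(log L) + eps*I,
    symmetrized and eigenvalue-floored so that G stays positive definite
    even where the Hessian is indefinite."""
    H = hess_log_L(theta)
    G = -0.5 * (H + H.T) + eps * np.eye(len(theta))
    w, V = np.linalg.eigh(G)           # G is symmetric, so eigh applies
    w = np.maximum(w, eps)             # clip any remaining negative eigenvalues
    return (V * w) @ V.T
```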

4. Symplectic Integrator for Nonseparable Hamiltonians

The Hamiltonian $H(\theta, p)$ is nonseparable due to the position dependence of $G(\theta)$, precluding direct use of the explicit leapfrog integrator. The method instead employs a semi-implicit, time-reversible, second-order symplectic integrator (the generalized leapfrog). For each integration step of size $\epsilon$:

  1. Momentum half-step (implicit, since $\partial H/\partial\theta$ depends on $p$ through the kinetic term; solved by fixed-point iteration):

$$p \gets p - \frac{\epsilon}{2}\,\frac{\partial H}{\partial\theta}(\theta, p)$$

  2. Position full step (implicit in the new position $\theta'$, averaging the velocities at the old and new positions):

$$\theta' = \theta + \frac{\epsilon}{2}\left[G(\theta)^{-1} + G(\theta')^{-1}\right] p$$

This again requires a fixed-point or Newton solve due to the nonlinearity introduced by $G(\theta')$.

  3. Second momentum half-step (explicit, with the metric and its derivatives re-evaluated at $\theta'$):

$$p \gets p - \frac{\epsilon}{2}\,\frac{\partial H}{\partial\theta}(\theta', p)$$

One such step is repeated $L$ times per trajectory. All metric derivatives $\nabla_\theta G$ and $\nabla_\theta \log\det G$ must be available in closed form or computed via automatic differentiation.
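A sketch of one generalized-leapfrog step, using simple fixed-point iteration for the two implicit sub-steps. It assumes a callable neg_dH_dtheta(theta, p) (for instance, a partial application of the earlier drift sketch) and the metric function G_fn; the iteration count n_fixed is a tuning choice, not a prescribed value:

```python
import numpy as np

def generalized_leapfrog(theta, p, eps, neg_dH_dtheta, G_fn, n_fixed=6):
    """One step of the semi-implicit generalized leapfrog (sketch)."""
    # 1) Implicit momentum half-step: p1 = p + (eps/2) * (-dH/dtheta)(theta, p1).
    p1 = p.copy()
    for _ in range(n_fixed):
        p1 = p + 0.5 * eps * neg_dH_dtheta(theta, p1)
    # 2) Implicit position step averaging old/new velocities G^{-1} p.
    v_old = np.linalg.solve(G_fn(theta), p1)
    theta1 = theta.copy()
    for _ in range(n_fixed):
        theta1 = theta + 0.5 * eps * (v_old + np.linalg.solve(G_fn(theta1), p1))
    # 3) Explicit momentum half-step at the new position.
    p1 = p1 + 0.5 * eps * neg_dH_dtheta(theta1, p1)
    return theta1, p1
```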

5. Full RMHMC Algorithm and Practical Implementation

The complete iteration is as follows:

  1. Compute $G(\theta^{(t-1)})$ and $\nabla_\theta U(\theta^{(t-1)})$.
  2. Sample $p \sim \mathcal{N}(0, G(\theta^{(t-1)}))$.
  3. Integrate $(\theta, p)$ via $L$ generalized leapfrog steps as above.
  4. Compute the Hamiltonian difference $\Delta H = H(\theta^*, p^*) - H(\theta^{(t-1)}, p_{\mathrm{initial}})$, where $(\theta^*, p^*)$ is the endpoint of the trajectory.
  5. Accept the proposal with probability $\min\{1, \exp(-\Delta H)\}$; otherwise retain the previous state. A code sketch of the full transition follows this list.
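Wiring these pieces together, a hedged sketch of one full transition; log_pi, G_fn, and neg_dH_dtheta are the model-supplied callables assumed above, and rng is a numpy.random.Generator:

```python
import numpy as np

def rmhmc_step(theta, eps, L, log_pi, G_fn, neg_dH_dtheta, rng):
    """One RMHMC transition with Metropolis-Hastings correction (sketch)."""
    def hamiltonian(th, mom):
        G = G_fn(th)
        _, logdet = np.linalg.slogdet(G)
        return -log_pi(th) + 0.5 * logdet + 0.5 * mom @ np.linalg.solve(G, mom)

    # Sample momentum p ~ N(0, G(theta)) via a Cholesky factor.
    p = np.linalg.cholesky(G_fn(theta)) @ rng.standard_normal(theta.size)
    H0 = hamiltonian(theta, p)
    th_new, p_new = theta.copy(), p
    for _ in range(L):   # L generalized leapfrog steps
        th_new, p_new = generalized_leapfrog(th_new, p_new, eps,
                                             neg_dH_dtheta, G_fn)
    dH = hamiltonian(th_new, p_new) - H0
    if np.log(rng.uniform()) < -dH:      # accept with prob min(1, exp(-dH))
        return th_new, True
    return theta, False
```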

Key computational aspects:

  • Computing $G(\theta)$ and $G(\theta)^{-1}$ at each step costs $O(D^2)$ and $O(D^3)$, respectively.
  • Calculating the metric derivatives $\nabla_\theta G$ is $O(D^3)$.
  • The position update in the generalized leapfrog typically requires efficient linear algebra, including Cholesky factorization or parallelization, especially in moderate ($D \sim 10^2$) dimensions.
  • For very large $D$, use sparse, low-rank, or block-diagonal approximations to $G(\theta)$. Cholesky caching, partial metric updates, and structure-exploiting linear algebra are advised; a caching sketch follows below.
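As an illustration of the caching advice (an implementation strategy sketched here under assumptions, not a prescription from the source), a single Cholesky factorization per position can serve momentum sampling, linear solves, and the log-determinant at once:

```python
import numpy as np
from scipy.linalg import solve_triangular

def cholesky_cache(G):
    """Factor G once (O(D^3)) and reuse the factor everywhere it is needed."""
    L = np.linalg.cholesky(G)                      # G = L @ L.T
    def sample_p(rng):                             # p ~ N(0, G)
        return L @ rng.standard_normal(L.shape[0])
    def solve(v):                                  # G^{-1} v via two O(D^2) triangular solves
        return solve_triangular(L.T, solve_triangular(L, v, lower=True),
                                lower=False)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))      # log det G
    return sample_p, solve, logdet
```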

6. Practical Performance and Empirical Results

Empirical studies by Girolami & Calderhead demonstrate RMHMC’s superiority on a range of models:

  • Logistic regression: faster mixing in moderate dimensions.
  • Log-Gaussian Cox process: efficient high-dimensional GP latent-variable inference ($D \sim 100$).
  • Stochastic volatility models: improved performance on latent time-series models.
  • Bayesian inference on ODE parameters: rapid convergence and exploration.

Reported time-normalized Effective Sample Size (ESS) improvements over standard HMC and Random Walk Metropolis range from a factor of 2 to 10, particularly in strongly anisotropic or highly correlated posteriors.

7. Advantages, Limitations, and Applicability

Advantages:

  • Automatic adaptation to local curvature and anisotropy.
  • Suppression of random-walk behavior in challenging geometries.
  • Greater statistical efficiency (higher ESS per unit time) for targets with rapidly varying or strongly correlated scales.

Limitations:

  • High per-step computational expense: each step is dominated by $O(D^3)$ metric computation, inversion, and metric-derivative evaluation.
  • Complex implementation: implicit integrator steps and metric derivatives make coding nontrivial.
  • Scalability is restricted in very high dimensions unless conditional independence, sparsity, or low-rank structure in $G(\theta)$ is exploited.

Applicability: RMHMC is most effective for moderate- to high-dimensional (tens to a few hundreds of dimensions) hierarchical Bayesian models exhibiting severe posterior anisotropy, strong dependence, or curved geometries that challenge conventional HMC or Metropolis approaches.


For algorithmic details, see the MATLAB code provided by the original authors, which is organized to replicate all main results and serves as a reference for efficient RMHMC implementations (arXiv:0907.1100).
