PDHAMS: Efficient Discrete MCMC

Updated 2 August 2025
  • PDHAMS is a second-order MCMC algorithm that uses quadratic expansion and preconditioning to capture pairwise correlations in high-dimensional, correlated discrete targets.
  • It couples a Gaussian auxiliary variable trick with Hamiltonian momentum updates to enable efficient, rejection-free sampling for exactly quadratic potentials.
  • Careful tuning of parameters such as the preconditioning matrix, diagonal stabilization, and momentum scaling yields superior mixing and lower total variation distances compared to first-order methods.

Preconditioned Discrete-HAMS (PDHAMS) is a second-order, irreversible Markov chain Monte Carlo (MCMC) algorithm for structured discrete distributions. PDHAMS introduces a quadratic preconditioning step and a Hamiltonian momentum augmentation into the family of gradient-based discrete samplers, offering a significant advance in efficiency for high-dimensional or correlated discrete target distributions. The method combines (i) a quadratic expansion of the log-density, (ii) an auxiliary-variable construction based on the Gaussian integral trick to manage complex dependencies, and (iii) a Hamiltonian-based update with generalized detailed balance, yielding a rejection-free sampler for targets with exactly quadratic potentials (Zhou et al., 29 Jul 2025).

1. Quadratic Preconditioning and Auxiliary Variable Mechanism

In contrast with first-order discrete samplers such as Norm Constrained Gradient (NCG) and Auxiliary Variable Gradient (AVG), which use only the gradient (a linear expansion) of the log-density $f(s)$, PDHAMS employs a second-order Taylor expansion:

$$f(s) \approx f(s_t) + \nabla f(s_t)^\top (s - s_t) + \frac{1}{2} (s - s_t)^\top W (s - s_t),$$

where $W$ is a global, positive definite preconditioning matrix representing curvature. This captures pairwise correlations missed by first-order approximations.

Direct sampling from the resulting discrete proposal is computationally intractable for generic $W$ because of the induced pairwise interactions. PDHAMS resolves this by introducing a continuous auxiliary variable $z$ via the Gaussian integral trick:

$$\pi(s, z) \propto \exp\{f(s)\}\, \exp\!\left( -\frac{1}{2}(z - s)^\top (W + D)(z - s) \right),$$

where $D$ is diagonal, $W + D$ is positive definite, and $L$ is its Cholesky factor. Conditioning on $z$ yields factorized discrete proposals, enabling efficient coordinate-wise sampling.

Alternative but equivalent auxiliary-variable schemes ("mean", "variance", or "momentum") produce identical state transitions. The construction allows the state proposal $s^*$ to be sampled efficiently as

$$Q(s \mid z_t, s_t) \propto \prod_{i=1}^d \operatorname{Softmax}\!\left[ -\frac{1}{2} d_i s_i^2 + \big(\nabla f(s_t)_i - (W s_t)_i + ((W + D) z_t)_i \big)\, s_i \right].$$
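
To illustrate the coordinate-wise structure, here is a minimal NumPy sketch of sampling from this factorized proposal (hypothetical helper names, not the authors' reference implementation; it assumes every coordinate shares a common finite support, e.g. $\{-1, +1\}$, and that $D$ is stored by its diagonal):

```python
import numpy as np

def sample_proposal(grad_f_st, s_t, z_t, W, D_diag, support, rng):
    """Sample s* coordinate-by-coordinate from the factorized
    PDHAMS proposal Q(s | z_t, s_t).

    support: 1-D array of the values each coordinate s_i may take.
    D_diag:  diagonal entries d_i of the stabilization matrix D.
    """
    # Linear coefficients: grad f(s_t) - W s_t + (W + D) z_t
    c = grad_f_st - W @ s_t + (W + np.diag(D_diag)) @ z_t
    s_star = np.empty_like(s_t, dtype=float)
    for i in range(len(s_t)):
        # Per-coordinate softmax over the finite support
        logits = -0.5 * D_diag[i] * support**2 + c[i] * support
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        s_star[i] = rng.choice(support, p=probs)
    return s_star

# Usage (toy example): rng = np.random.default_rng(0)
# s_star = sample_proposal(grad, s_t, z_t, W, d, np.array([-1.0, 1.0]), rng)
```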

2. Hamiltonian Dynamics, Momentum Augmentation, and Irreversibility

PDHAMS augments the discrete state variable $s$ with a Gaussian momentum $u$ (or a scaled version $v$), constructing a joint Hamiltonian target,

$$\pi(s, u) \propto \exp\{f(s) - \tfrac{1}{2} \|u\|^2\}.$$

The dynamics proceed as follows:

  1. Auto-regressive momentum update:

$$v_{t+1/2} = \epsilon v_t + \sqrt{1 - \epsilon^2}\, L^{-1} Z, \quad Z \sim \mathcal{N}(0, I),$$

for $\epsilon \in [0,1)$.

  2. Auxiliary variable:

$$z_t = s_t + L^{-1} Z.$$

  3. Discrete proposal: $s^* \sim Q(\cdot \mid z_t, s_t)$, as described above.
  4. Irreversible momentum update with negation and gradient correction:

$$v^* = -v_{t+1/2} + s_t - s^* + \phi \left[ \nabla f(s^*) - \nabla f(s_t) + W(s_t - s^*) \right],$$

where $\phi \ge 0$ controls the strength of the correction.

  5. A generalized Metropolis–Hastings accept–reject step is performed:

$$\alpha = \min\left\{ 1,\ \frac{\pi(s^*, -v^*)\, Q(s_t, -v_{t+1/2} \mid s^*, -v^*)}{\pi(s_t, v_{t+1/2})\, Q(s^*, v^* \mid s_t, v_{t+1/2})} \right\},$$

and transitions satisfy the generalized detailed balance:

$$\pi(s_t, v_{t+1/2})\, K_\phi(s_{t+1}, v_{t+1} \mid s_t, v_{t+1/2}) = \pi(s_{t+1}, -v_{t+1})\, K_\phi(s_t, -v_{t+1/2} \mid s_{t+1}, -v_{t+1}).$$

This symmetry mirrors the irreversible dynamics of Hamiltonian Monte Carlo and ensures invariance of the target distribution while enabling rapid exploration of the state space.
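
The update can be summarized in a schematic sketch that strings the five steps together. The proposal sampler and the generalized acceptance ratio are left as placeholder callables (hypothetical names, not the authors' implementation), and whether $z_t$ reuses the momentum noise depends on which of the equivalent auxiliary-variable schemes is adopted:

```python
import numpy as np

def pdhams_step(s_t, v_t, grad_f, sample_proposal, log_accept_ratio,
                L, W, eps, phi, rng):
    """One schematic PDHAMS update following steps 1-5 above.

    sample_proposal(grad_f_st, s_t, z_t) -> s*   (see Section 1 sketch)
    log_accept_ratio(...) -> log of the generalized MH ratio in step 5
    """
    d = len(s_t)
    # 1. Auto-regressive momentum refreshment
    v_half = eps * v_t + np.sqrt(1.0 - eps**2) * np.linalg.solve(
        L, rng.standard_normal(d))
    # 2. Gaussian auxiliary variable (fresh noise assumed here)
    z_t = s_t + np.linalg.solve(L, rng.standard_normal(d))
    # 3. Factorized discrete proposal
    s_star = sample_proposal(grad_f(s_t), s_t, z_t)
    # 4. Momentum negation with gradient correction
    v_star = -v_half + (s_t - s_star) + phi * (
        grad_f(s_star) - grad_f(s_t) + W @ (s_t - s_star))
    # 5. Generalized Metropolis-Hastings accept-reject
    if np.log(rng.uniform()) < log_accept_ratio(s_t, v_half, s_star, v_star, z_t):
        return s_star, v_star
    return s_t, -v_half  # negate momentum on rejection (assumed convention)
```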

3. Rejection-Free Sampling and Preconditioning Effects

A key property is that when $f(s)$ is exactly quadratic (so the target $\pi(s)$ is a discrete Gaussian), the PDHAMS proposal kernel exactly matches the target, giving an acceptance probability of one, i.e., rejection-free sampling. In such cases, all proposals are accepted, and the algorithm samples independently given the auxiliary-variable structure.
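
This can be checked directly in the document's notation: for $f(s) = b^\top s + \tfrac{1}{2} s^\top W s$ with symmetric $W$, the second-order expansion of Section 1 is an identity rather than an approximation,

$$f(s) = f(s_t) + \nabla f(s_t)^\top (s - s_t) + \tfrac{1}{2}(s - s_t)^\top W (s - s_t) \quad \text{for all } s, s_t,$$

so the factorized proposal reproduces the exact conditional of $s$ given the auxiliary variable, and the step-5 acceptance probability is identically one.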

For general $f(s)$, the quadratic expansion and preconditioning via $W$ yield highly "informed" proposals that reflect local curvature, and numerical results indicate high acceptance rates and strong mixing even far from the quadratic ideal.

The matrix $W$ serves both as a global surrogate for the Hessian and as a preconditioner. Proper calibration of $W$, along with the selection of $D$, $\epsilon$, and $\phi$, is essential to balance efficient traversal of the state space with numerically stable, easily invertible auxiliary-variable sampling.

4. Performance Comparisons in Numerical Experiments

Several experiments in (Zhou et al., 29 Jul 2025) demonstrate the empirical advantages of PDHAMS over prior state-of-the-art discrete MCMC methods:

| Method | Approximation Order | Auxiliary Variable | Momentum | Irreversible | Rejection-Free | TV Distance | ESS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| NCG | 1st | None | None | No | No | Higher | Low |
| AVG | 1st | Yes | None | No | No | Higher | Low |
| DHAMS | 1st | Yes | Yes | Yes | Yes (linear only) | Moderate | Moderate |
| PDHAMS | 2nd (quadratic) | Yes (Gaussian) | Yes | Yes | Yes (quadratic) | Lowest | High |

In all tested cases—including discrete Gaussian, quadratic mixture, and clock Potts models—PDHAMS exhibits significantly lower total variation distance (TV) from the target and higher effective sample size (ESS) compared to NCG, AVG, and DHAMS. Autocorrelation in Markov chain trajectories is suppressed more rapidly, and estimated moments (means, variances) converge faster and with reduced bias.
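
As an aside on the metrics (illustrative only, not the paper's evaluation code): on a finite state space, the TV distance is typically estimated from the chain's visit frequencies. A minimal sketch, assuming integer-coded states:

```python
import numpy as np

def empirical_tv(samples, target_probs):
    """Estimate TV distance between a discrete chain's empirical
    distribution and a known target on {0, ..., K-1}."""
    counts = np.bincount(samples, minlength=len(target_probs))
    empirical = counts / counts.sum()
    return 0.5 * np.abs(empirical - target_probs).sum()
```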

5. Comparison with First-Order Discrete Samplers

NCG and AVG restrict their proposals to first-order information, neglecting important state-space dependencies. DHAMS improves on these by introducing auxiliary momentum and irreversible transitions but remains limited to first-order (linear) approximations, yielding rejection-free behavior only for targets with linear $f(s)$. PDHAMS, by preconditioning with a global $W$ and using a quadratic expansion encapsulated in the auxiliary-variable framework, overcomes both limitations and achieves rejection-free behavior for quadratic potentials. For general $f(s)$, the adaptive proposals retain higher fidelity to the local geometry of the target than those of NCG, AVG, or DHAMS, resulting in superior mixing.

6. Implementation Considerations and Parameter Tuning

PDHAMS requires selection of several matrices and parameters:

  • $W$: the global curvature approximation; common choices include the true or an approximate Hessian of $f(s)$.
  • $D$: diagonal stabilization ensuring that $W + D$ is positive definite, as required for the Cholesky factorization.
  • $\epsilon$: auto-regression parameter for the momentum process; interpolates between independent and persistent momentum.
  • $\phi$: magnitude of the gradient correction in the momentum update.
  • $L$: the lower Cholesky factor of $(W + D)$, needed for efficient auxiliary-variable generation.
  • $\beta$ (for over-relaxed PDHAMS variants): controls the degree of negative correlation in state updates.

While parameter calibration can introduce implementation effort, the paper observes that modest tuning suffices to achieve strong performance across diverse discrete sampling scenarios.
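
A minimal setup sketch under assumed heuristics (the paper does not prescribe these exact rules): symmetrize a Hessian approximation to obtain $W$, shift the diagonal just enough that $W + D$ is positive definite, and precompute the Cholesky factor $L$:

```python
import numpy as np

def setup_pdhams(hessian_approx, jitter=1e-6):
    """Build (W, D_diag, L) from a Hessian approximation of f."""
    W = 0.5 * (hessian_approx + hessian_approx.T)   # symmetrize
    lam_min = np.linalg.eigvalsh(W).min()
    shift = max(0.0, -lam_min) + jitter             # smallest PD-restoring shift
    D_diag = np.full(W.shape[0], shift)
    L = np.linalg.cholesky(W + np.diag(D_diag))     # lower Cholesky of W + D
    return W, D_diag, L
```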

7. Outlook and Significance

PDHAMS unifies and generalizes techniques from the discrete and continuous MCMC literature (notably Hamiltonian Monte Carlo and Gaussian auxiliary-variable tricks) in a framework that is robust to high dimensionality, correlation, and complex potentials. It achieves rejection-free sampling for discrete quadratic targets and extends these gains to broader target classes in practice. This makes PDHAMS a foundational methodology for the future development of efficient discrete MCMC algorithms, particularly those requiring effective sampling from discrete graphical models, probabilistic combinatorial structures, or high-dimensional Bayesian posteriors with correlated latent variables.

The performance advantages over NCG, AVG, and DHAMS are consistently demonstrated through lower TV distances to target, increased ESS, reduced bias in moment estimation, and suppressed autocorrelation across varied discrete sampling tasks (Zhou et al., 29 Jul 2025).

References

  1. Zhou et al., 29 Jul 2025.