PDHAMS: Efficient Discrete MCMC
- PDHAMS is a second-order MCMC algorithm that uses quadratic expansion and preconditioning to capture pairwise correlations in high-dimensional, correlated discrete targets.
- It couples a Gaussian auxiliary variable trick with Hamiltonian momentum updates to enable efficient, rejection-free sampling for exactly quadratic potentials.
- Careful tuning of parameters such as the preconditioning matrix, diagonal stabilization, and momentum scaling yields superior mixing and lower total variation distances compared to first-order methods.
Preconditioned Discrete-HAMS (PDHAMS) is a second-order, irreversible Markov chain Monte Carlo (MCMC) algorithm for discrete structured distributions. PDHAMS introduces a quadratic preconditioning step and a Hamiltonian momentum augmentation into the family of gradient-based discrete samplers, offering a significant advance in efficiency for high-dimensional or correlated discrete target distributions. The method combines (i) a quadratic expansion of the log-density, (ii) an auxiliary-variable construction based on the Gaussian integral trick to manage complex dependencies, and (iii) a Hamiltonian-based update with generalized detailed balance, yielding a rejection-free sampler for targets with exact quadratic potentials (Zhou et al., 29 Jul 2025).
1. Quadratic Preconditioning and Auxiliary Variable Mechanism
In contrast with first-order discrete samplers such as the Norm Constrained Gradient (NCG) and Auxiliary Variable Gradient (AVG) methods, which use only the gradient (linear expansion) of the log-density $f(x)$, PDHAMS employs a second-order Taylor expansion around the current state $x$:

$$Q(y; x) = f(x) + \nabla f(x)^\top (y - x) - \tfrac{1}{2}\,(y - x)^\top W\, (y - x),$$

where $W$ is a global, positive definite preconditioning matrix representing curvature. This captures pairwise correlations missed by first-order approximations.
Direct sampling of the resulting discrete proposal is computationally intractable for generic $W$ due to the induced pairwise interactions. PDHAMS resolves this by introducing a continuous auxiliary variable $u$ via the Gaussian integral trick. Writing $-\tfrac{1}{2} y^\top W y = \tfrac{1}{2} y^\top (D - W)\, y - \tfrac{1}{2} y^\top D y$ with $D$ diagonal chosen so that $D - W = L L^\top$ is positive definite ($L$ its Cholesky factor), the coupled term is linearized by the Gaussian identity

$$\exp\!\Big(\tfrac{1}{2}\, y^\top (D - W)\, y\Big) \propto \int \exp\!\Big(-\tfrac{1}{2}\|u\|^2 + u^\top L^\top y\Big)\, du.$$

Conditioning on $u$ leaves only terms that are linear or diagonal in $y$, yielding factorized discrete proposals and enabling efficient coordinate-wise sampling.
Alternative but equivalent auxiliary-variable schemes ("mean", "variance", or "momentum") produce identical state transitions. The construction allows the state proposal, given $u$, to be sampled efficiently as a product of independent per-coordinate discrete distributions.
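To see why conditioning on the auxiliary variable decouples the coordinates, the sketch below checks by brute-force enumeration that, given a Gaussian auxiliary draw, a pairwise-coupled distribution over $\{-1,+1\}^d$ factorizes coordinate-wise. The symbols and parameterization here are illustrative, not taken verbatim from the paper:

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
d = 3

# A positive definite coupling matrix M = L L^T and a linear term b.
A = rng.normal(size=(d, d))
M = A @ A.T + d * np.eye(d)
L = np.linalg.cholesky(M)
b = rng.normal(size=d)

# Unnormalized coupled target over y in {-1,+1}^d:
#   pi(y) ∝ exp(b^T y + 0.5 y^T M y)
states = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))

# Gaussian-integral joint: p(y, u) ∝ exp(b^T y) exp(-0.5||u||^2 + u^T L^T y)
u = rng.normal(size=d)  # any fixed auxiliary draw

# (a) conditional p(y | u) by normalizing the joint over all 2^d states
logw = states @ b + states @ (L @ u)
p_joint = np.exp(logw - logw.max())
p_joint /= p_joint.sum()

# (b) the same conditional as a product of independent per-coordinate factors:
# each coordinate sees an effective field h_i, so P(y_i = +1 | u) is logistic.
h = b + L @ u
p1 = 1.0 / (1.0 + np.exp(-2.0 * h))
p_factored = np.prod(np.where(states > 0, p1, 1.0 - p1), axis=1)

assert np.allclose(p_joint, p_factored)
```

The agreement of (a) and (b) is exactly what makes the discrete proposal cheap: given $u$, each coordinate can be sampled independently.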
2. Hamiltonian Dynamics, Momentum Augmentation, and Irreversibility
PDHAMS augments the discrete state variable $x$ with a Gaussian momentum $p$ (or a scaled variant), constructing a joint Hamiltonian target

$$\bar\pi(x, p) \propto \exp\!\Big(f(x) - \tfrac{1}{2}\,\|p\|^2\Big).$$
The dynamics proceed as follows:
- Auto-regression momentum update: $p' = \alpha\, p + \sqrt{1 - \alpha^2}\;\xi$ with $\xi \sim \mathcal{N}(0, I)$, for $\alpha \in [0, 1)$.
- Auxiliary variable: draw $u$ given the current state via the Gaussian integral construction, $u \mid x \sim \mathcal{N}(L^\top x,\, I)$.
- Discrete proposal as described above.
- Irreversible momentum update with negation and gradient correction: the outgoing momentum is negated and shifted by a term built from the (preconditioned) gradients at the current and proposed states, where a scalar parameter $\phi$ controls the strength of the correction.
- A generalized Metropolis–Hastings accept-reject step is performed, with acceptance probability formed from the ratio of the joint density and proposal kernel in the forward and momentum-negated reverse directions, and transitions satisfy the generalized detailed balance

$$\bar\pi(x, p)\, K\big((x, p) \to (y, p^*)\big) = \bar\pi(y, -p^*)\, K\big((y, -p^*) \to (x, -p)\big).$$
This symmetry mirrors the irreversible dynamics of Hamiltonian Monte Carlo and ensures correct invariance of the target distribution while enabling rapid state space exploration.
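The autoregressive refresh in the first step must leave the standard Gaussian momentum distribution invariant; the snippet below verifies this empirically for an illustrative value of the autoregression parameter (here called `alpha`):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.7          # autoregression parameter in [0, 1); illustrative value
n, d = 200_000, 2

# One autoregressive refresh p' = alpha*p + sqrt(1 - alpha^2)*xi applied to
# N(0, I) draws: the output should again have mean 0 and unit variance,
# since alpha^2 * 1 + (1 - alpha^2) * 1 = 1.
p = rng.standard_normal((n, d))
xi = rng.standard_normal((n, d))
p_new = alpha * p + np.sqrt(1.0 - alpha**2) * xi

assert np.allclose(p_new.mean(axis=0), 0.0, atol=0.02)
assert np.allclose(p_new.var(axis=0), 1.0, atol=0.02)
```

Values of `alpha` near 1 give persistent momentum (long, directed excursions), while `alpha = 0` reduces to independent momentum resampling.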
3. Rejection-Free Sampling and Preconditioning Effects
A key property is that when the potential $f$ is exactly quadratic (so the target is a discrete Gaussian), the PDHAMS proposal kernel exactly matches the target, giving an acceptance probability of one: rejection-free sampling. In such cases, all proposals are accepted and the algorithm samples independently given the auxiliary-variable structure.
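The rejection-free property rests on the fact that a second-order expansion has zero Taylor remainder for a quadratic potential. The sketch below (with an illustrative quadratic $f$, curvature $W$, and sign-vector state space) confirms numerically that the expansion reproduces the log-density exactly, so the informed proposal is proportional to the target:

```python
import itertools
import numpy as np

rng = np.random.default_rng(3)
d = 3
B = rng.normal(size=(d, d))
W = B @ B.T + np.eye(d)       # positive definite curvature
b = rng.normal(size=d)

def f(y):                      # exactly quadratic log-density
    return b @ y - 0.5 * y @ W @ y

def grad_f(y):
    return b - W @ y

x = np.array([1.0, -1.0, 1.0])          # current state
states = np.array(list(itertools.product([-1.0, 1.0], repeat=d)))

# Second-order expansion around x with the exact curvature W:
#   Q(y; x) = f(x) + grad_f(x)^T (y - x) - 0.5 (y - x)^T W (y - x)
Q = np.array([f(x) + grad_f(x) @ (y - x) - 0.5 * (y - x) @ W @ (y - x)
              for y in states])
F = np.array([f(y) for y in states])

# Zero Taylor remainder: the proposal matches the target, so every
# Metropolis-Hastings acceptance ratio equals one.
assert np.allclose(Q, F)
```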
For general $f$, the quadratic expansion and preconditioning via $W$ yield highly "informed" proposals that reflect local curvature, and numerical results indicate high acceptance rates and strong mixing even far from the quadratic ideal.
The matrix $W$ serves both as a global surrogate for the (negative) Hessian and as a preconditioner. Proper calibration of $W$, along with the selection of $D$, $\alpha$, and $\phi$, is essential to balance efficient traversal of the state space with numerically stable, easily invertible auxiliary-variable sampling.
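A minimal sketch of one such calibration step, assuming the stabilized coupling matrix takes the form $D - W$ with the illustrative recipe $D = (\lambda_{\max}(W) + \delta)\, I$ (this particular choice is an assumption, not prescribed by the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

# A symmetric curvature surrogate W (e.g. a Hessian estimate).
B = rng.normal(size=(d, d))
W = 0.5 * (B + B.T)

# Diagonal stabilization: make D - W positive definite by shifting past
# the largest eigenvalue of W, with a small margin delta > 0.
delta = 1e-2
D = (np.linalg.eigvalsh(W).max() + delta) * np.eye(d)

# Cholesky factor L with L @ L.T = D - W, used for auxiliary sampling.
L = np.linalg.cholesky(D - W)

assert np.allclose(L @ L.T, D - W)
```

Larger margins make the factorization more robust numerically but weaken the coupling carried through the auxiliary variable, so `delta` trades stability against proposal informativeness.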
4. Performance Comparisons in Numerical Experiments
Several experiments in (Zhou et al., 29 Jul 2025) demonstrate the empirical advantages of PDHAMS over prior state-of-the-art discrete MCMC methods:
Method | Approximation Order | Auxiliary Variable | Momentum | Irreversible | Rejection-Free (Quadratic) | TV Distance | ESS |
---|---|---|---|---|---|---|---|
NCG | 1st | None | None | No | No | Higher | Low |
AVG | 1st | Yes | None | No | No | Higher | Low |
DHAMS | 1st | Yes | Yes | Yes | Yes (Linear only) | Moderate | Moderate |
PDHAMS | 2nd (Quadratic) | Yes (Gaussian) | Yes | Yes | Yes (Quadratic) | Lowest | High |
In all tested cases—including discrete Gaussian, quadratic mixture, and clock Potts models—PDHAMS exhibits significantly lower total variation distance (TV) from the target and higher effective sample size (ESS) compared to NCG, AVG, and DHAMS. Autocorrelation in Markov chain trajectories is suppressed more rapidly, and estimated moments (means, variances) converge faster and with reduced bias.
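For reference, the total variation distance used in these comparisons is, on a finite state space, half the $\ell_1$ distance between probability mass functions; this is the generic definition rather than code from the paper:

```python
import numpy as np

def tv_distance(p, q):
    """Total variation distance between two pmfs on a finite state space:
    TV(p, q) = 0.5 * sum_i |p_i - q_i|, in [0, 1]."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])
assert np.isclose(tv_distance(p, q), 0.1)
```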
5. Comparison to Related Gradient-Based Discrete Samplers
NCG and AVG restrict their proposals to first-order information, neglecting important state-space dependencies. DHAMS improves on these by introducing auxiliary momentum and irreversible transitions but remains limited to first-order (linear) approximations, yielding rejection-free behavior only for targets with linear $f$. PDHAMS, by preconditioning with a global $W$ and using a quadratic expansion encapsulated in the auxiliary-variable framework, overcomes both limitations and achieves rejection-free behavior for quadratic potentials. For general $f$, the adaptive proposals retain higher fidelity to the local geometry of the target than NCG/AVG/DHAMS, resulting in superior mixing.
6. Implementation Considerations and Parameter Tuning
PDHAMS requires selection of several matrices and parameters:
- $W$: the global curvature approximation; common choices include the true or approximate (negative) Hessian of $f$.
- $D$: diagonal stabilization ensuring that $D - W$ is positive definite; required for the Cholesky factorization.
- $\alpha$: auto-regression parameter for the momentum process; interpolates between independent refreshment ($\alpha = 0$) and fully persistent momentum ($\alpha \to 1$).
- $\phi$: magnitude of the gradient correction in the momentum update.
- $L$: the lower Cholesky factor of $D - W$, needed for efficient auxiliary-variable generation.
- (For over-relaxed PDHAMS variants) $\gamma$: controls the degree of negative correlation in successive state updates.
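For intuition on the over-relaxation parameter, its Gaussian analogue (Adler-style over-relaxation, used here purely as an illustration with a hypothetical parameter value) shows how a negative $\gamma$ induces anti-correlated successive states while preserving the stationary distribution:

```python
import numpy as np

rng = np.random.default_rng(4)
gamma = -0.5        # negative value induces anti-correlated successive states
mu, sigma = 2.0, 1.5
n = 200_000

# Adler-style over-relaxation x' = mu + gamma*(x - mu) + sqrt(1-gamma^2)*sigma*z
# leaves N(mu, sigma^2) invariant while giving corr(x, x') = gamma.
x = mu + sigma * rng.standard_normal(n)
z = rng.standard_normal(n)
x_new = mu + gamma * (x - mu) + np.sqrt(1.0 - gamma**2) * sigma * z

assert np.allclose(x_new.mean(), mu, atol=0.02)
assert np.allclose(x_new.std(), sigma, atol=0.02)
assert abs(np.corrcoef(x, x_new)[0, 1] - gamma) < 0.02
```

Anti-correlated moves of this kind suppress the random-walk backtracking that slows reversible samplers.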
While parameter calibration can introduce implementation effort, the paper observes that modest tuning suffices to achieve strong performance across diverse discrete sampling scenarios.
7. Outlook and Significance
PDHAMS unifies and generalizes techniques from the discrete and continuous MCMC literature (notably mirroring ideas from Hamiltonian Monte Carlo and Gaussian auxiliary-variable tricks) in a framework that is robust to high dimensionality, correlation, scale, and complex potentials. It achieves rejection-free sampling for discrete quadratic targets and extends these efficiency gains to broader target classes in practice. This makes PDHAMS a foundational methodology for future development of efficient discrete MCMC algorithms, particularly those requiring effective sampling from discrete graphical models, probabilistic combinatorial structures, or high-dimensional Bayesian posteriors with correlated latent variables.
The performance advantages over NCG, AVG, and DHAMS are consistently demonstrated through lower TV distances to target, increased ESS, reduced bias in moment estimation, and suppressed autocorrelation across varied discrete sampling tasks (Zhou et al., 29 Jul 2025).