
Pseudo-Marginal Metropolis–Hastings (PMMH)

Updated 3 February 2026
  • PMMH is a Markov chain Monte Carlo method that uses unbiased estimators to perform exact Bayesian inference on models with intractable likelihoods.
  • It employs techniques such as signed, block–Poisson, and correlated estimators to efficiently handle doubly intractable problems and reduce variance.
  • Practical implementations of PMMH require careful tuning of block sizes and estimator parameters, as demonstrated in applications like Ising and Kent models.

A pseudo-marginal Metropolis–Hastings (PMMH) algorithm is a Markov chain Monte Carlo (MCMC) method for exact inference in models with intractable likelihoods, where only unbiased non-negative estimators of the unnormalized target density are available. PMMH encompasses a class of algorithms, including standard, correlated, and signed estimators, for settings such as doubly intractable distributions. These algorithms have been studied extensively, both theoretically and in high-dimensional applications, notably latent variable models, doubly intractable posteriors, and @@@@1@@@@ (Yang et al., 2022).

1. General Structure and Exactness

Let $y$ denote the observed data, $\theta$ the parameter of interest, and $\pi(\theta)$ its prior. If $L(\theta) = p(y \mid \theta)$ is intractable but one can compute an unbiased estimator $\widehat L(\theta; u)$ using an auxiliary random vector $u \sim p(u)$ with $\mathbb{E}_{u}[\widehat L(\theta; u)] = L(\theta)$, define the extended target as

$$\widehat{\Pi}(\theta, u) \propto \widehat{L}(\theta; u)\, \pi(\theta)\, p(u)$$

The PMMH transition $(\theta, u) \to (\theta', u')$ accepts with probability

$$\alpha = \min\left\{1,\; \frac{\widehat{L}(\theta'; u')\,\pi(\theta')\, q(\theta \mid \theta')}{\widehat{L}(\theta; u)\,\pi(\theta)\, q(\theta' \mid \theta)}\right\}$$

The marginal chain in $\theta$ leaves the exact posterior $\pi(\theta \mid y) \propto L(\theta)\,\pi(\theta)$ invariant, by unbiasedness and standard MH theory (Yang et al., 2022).
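As a concrete illustration, the transition above can be sketched for a toy conjugate model in which the exact likelihood is known but the sampler is only given an artificially noisy, unbiased estimate of it. This is a minimal sketch: the model (a single observation $y \sim \mathcal{N}(\theta, 1)$ with a $\mathcal{N}(0,1)$ prior), the lognormal noise, and all numbers are invented for illustration, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

Y = 1.5  # one observed datum (made up); exact posterior is N(0.75, 0.5)

def log_like(theta):
    """Exact log-likelihood of the toy model y ~ N(theta, 1)."""
    return -0.5 * (Y - theta) ** 2

def log_like_hat(theta, u, sigma=0.5):
    """Unbiased noisy estimate: L_hat = L * exp(sigma*u - sigma^2/2).
    Since E[exp(sigma*u - sigma^2/2)] = 1 for u ~ N(0,1), E_u[L_hat] = L."""
    return log_like(theta) + sigma * u - 0.5 * sigma**2

def log_prior(theta):
    return -0.5 * theta**2  # standard normal prior

def pmmh(n_iter=5000, step=1.0):
    theta, u = 0.0, rng.standard_normal()
    log_num = log_like_hat(theta, u) + log_prior(theta)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        theta_p = theta + step * rng.standard_normal()  # symmetric proposal
        u_p = rng.standard_normal()                     # fresh auxiliary draw
        log_num_p = log_like_hat(theta_p, u_p) + log_prior(theta_p)
        # Key point: the *stored* noisy estimate log_num is reused on
        # rejection, never recomputed -- this is what makes PMMH exact.
        if np.log(rng.uniform()) < log_num_p - log_num:
            theta, log_num = theta_p, log_num_p
        chain[i] = theta
    return chain

chain = pmmh()
print(chain.mean())  # close to the exact posterior mean 0.75
```

Reusing the stored estimate on rejection is exactly the extended-target construction: the chain moves on $(\theta, u)$ jointly, and marginally targets $\pi(\theta \mid y)$ despite the noise.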

2. Signed and Block–Poisson Estimators for Doubly Intractable Models

For posteriors whose likelihood contains an intractable normalizing constant $Z(\theta)$, the doubly intractable posterior is

$$\pi(\theta \mid y) \propto \frac{f(y \mid \theta)\, \pi(\theta)}{Z(\theta)}$$
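To make the intractability concrete: for the Ising model (one of the paper's applications), $Z(\theta)$ is a sum over all spin configurations. The brute-force sketch below uses a hypothetical $2 \times 2$ lattice (not from the source) just to exhibit the $2^n$ cost that rules out direct evaluation at realistic sizes.

```python
import itertools
import numpy as np

def ising_log_unnorm(spins, theta):
    """Unnormalized log-density of a 2x2 Ising model with coupling theta
    (nearest-neighbour interactions, no wrap-around)."""
    s = np.array(spins).reshape(2, 2)
    interactions = (s[0, 0]*s[0, 1] + s[1, 0]*s[1, 1]   # horizontal pairs
                    + s[0, 0]*s[1, 0] + s[0, 1]*s[1, 1])  # vertical pairs
    return theta * interactions

def ising_Z(theta, n_spins=4):
    """Exact Z(theta) by enumerating all 2^n configurations -- feasible
    only for toy lattices, which is the source of double intractability."""
    return sum(np.exp(ising_log_unnorm(cfg, theta))
               for cfg in itertools.product([-1, 1], repeat=n_spins))

print(ising_Z(0.0))  # theta = 0: all 16 configurations have weight 1, so Z = 16
```

For a modest $100 \times 100$ lattice the same sum has $2^{10000}$ terms, so PMMH must work with unbiased estimates of quantities involving $Z(\theta)$ instead.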

A signed unbiased estimator for expressions such as $\exp(-\nu Z(\theta))$ is constructed using the block–Poisson (BP) scheme. Introduce $\nu \sim \mathrm{Exponential}(Z(\theta))$ and partition the estimator as

$$\widehat E(\theta, \nu \mid u) = \prod_{\ell=1}^{\lambda} \exp(a/\lambda + m) \prod_{h=1}^{\chi_\ell} \frac{\widehat B^{(h,\ell)}(\theta, \nu) - a}{m\lambda}$$

with $\chi_\ell \sim \mathrm{Poisson}(m)$ and each $\widehat B$ an unbiased estimator of $-\nu Z(\theta)$. Since $\widehat E$ can be negative, PMMH targets $|\widehat{\Pi}(\theta, u, \nu)|$, collecting the sign $s^{(i)} = \operatorname{sign}(\widehat E)$ at each iteration and using the importance-sampling corrected estimator

$$\widehat{\mathbb{E}}_{\pi}[\varphi] = \frac{\sum_{i=1}^N \varphi(\theta^{(i)})\, s^{(i)}}{\sum_{i=1}^N s^{(i)}} \to \mathbb{E}_\pi[\varphi(\theta)] \quad \text{as } N \to \infty$$

(Yang et al., 2022).
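The sign correction is applied mechanically once the chain's draws and signs have been stored. In the minimal sketch below the draws and sign frequencies are fabricated, standing in for an actual signed PMMH run with $\tau = \Pr[\widehat E \ge 0] \approx 0.95$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated stand-in for a signed PMMH run: theta^(i) are the chain's
# draws (targeting |Pi_hat|) and s^(i) = sign(E_hat) is recorded each
# iteration; about 5% of signs come out negative.
n = 200_000
theta = rng.normal(2.0, 1.0, size=n)
s = np.where(rng.uniform(size=n) < 0.95, 1.0, -1.0)

def signed_estimate(phi_vals, signs):
    """Sign-corrected importance estimate: sum(phi * s) / sum(s)."""
    return float(np.sum(phi_vals * signs) / np.sum(signs))

est = signed_estimate(theta, s)  # estimates E_pi[theta]
print(est)
```

The denominator $\sum_i s^{(i)}$ also makes the role of $\tau$ visible: as $\tau \to 1/2$ the normalizer approaches zero and the variance of the ratio blows up, which is why the tuning guidelines below push $\tau$ toward 1.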

3. Correlated PMMH and Blockwise Updates

Variance in the log-likelihood ratio degrades PMMH mixing as the dimension or the amount of data grows, motivating correlated PMMH approaches (Deligiannidis et al., 2015; Dahlin et al., 2015; Yang et al., 2022). By partitioning $u$ into $\lambda$ blocks and refreshing only one block per iteration, the correlation between the log estimators at the current and proposed states becomes

$$\operatorname{Corr}\!\left(\log|\widehat E(\theta, u)|,\; \log|\widehat E(\theta', u')|\right) \approx 1 - \lambda^{-1}$$

resulting in much lower variance of the acceptance ratio:

$$\alpha = \min\left\{1,\; \frac{|\widehat E(\theta', \nu' \mid u')|\, f(y \mid \theta')\, \pi(\theta')}{|\widehat E(\theta, \nu \mid u)|\, f(y \mid \theta)\, \pi(\theta)} \cdot \frac{q(\theta \mid \theta')}{q(\theta' \mid \theta)} \cdot \frac{\widehat Z_P(\theta)}{\widehat Z_P(\theta')} \cdot \frac{\exp(-\nu \widehat Z_P(\theta))}{\exp(-\nu' \widehat Z_P(\theta'))}\right\}$$

where $\widehat Z_P(\cdot)$ is a simple average of blockwise $Z$-estimates. The bias-correction terms guarantee unbiasedness for $Z(\theta)^{-1}$ even under correlated block updates (Yang et al., 2022).
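The $1 - \lambda^{-1}$ correlation from refreshing a single block can be checked empirically. The sketch below uses a simple stand-in for $\log|\widehat E|$, a standardized sum over all auxiliary variables; this linearity is an assumption for illustration, since real block–Poisson estimators are nonlinear in $u$.

```python
import numpy as np

rng = np.random.default_rng(2)

n_blocks = 50      # lambda: number of blocks of auxiliary variables
block_size = 20
n_rep = 5_000

def log_est(u):
    # Stand-in for log|E_hat(theta, u)|: a standardized sum over all
    # auxiliary variables (illustrative assumption, see lead-in).
    return u.sum() / np.sqrt(u.size)

cur = np.empty(n_rep)
prop = np.empty(n_rep)
for r in range(n_rep):
    u = rng.standard_normal((n_blocks, block_size))
    u_prop = u.copy()
    k = rng.integers(n_blocks)                    # pick one block ...
    u_prop[k] = rng.standard_normal(block_size)   # ... and refresh only it
    cur[r] = log_est(u)
    prop[r] = log_est(u_prop)

corr = float(np.corrcoef(cur, prop)[0, 1])
print(corr)  # approximately 1 - 1/lambda = 0.98
```

Because the two estimators share $\lambda - 1$ of $\lambda$ blocks, their covariance is $(\lambda - 1)/\lambda$ of the variance, giving the stated correlation exactly in this linear case.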

4. Tuning and Hyperparameter Selection

Optimal performance relies on tuning the BP estimator's block count $\lambda$, block size $m$, and per-block average size $M$ via a computational-time criterion:

$$\mathrm{CT}(m, \lambda, M) = m\,\lambda\,M\; \frac{\mathrm{IF}\big(\sigma^2_{\log|\widehat E|}(m, \lambda, M)\big)}{\big(2\tau(m, \lambda, M) - 1\big)^2}$$

where the inefficiency factor $\mathrm{IF}$ depends on the variance $\sigma^2_{\log|\widehat E|}$ and $\tau = \Pr[\widehat E \ge 0]$ is the probability that the estimate is positive. Closed-form expressions for $\sigma^2$ and $\tau$ are available under normality assumptions. Practical guidelines: $\lambda \approx 50$–$100$ for $\rho \approx 0.98$–$0.99$, $m = 1$, and $M \approx c\,\gamma_{\max}$ with $c = 0.001$–$0.005$, chosen so that $\sigma^2 \approx 1$ and $\tau \approx 0.9$ (Yang et al., 2022).
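The CT criterion can be wrapped in a small helper for comparing tuning regimes. Note that the inefficiency factor $\mathrm{IF}(\cdot)$ below is a hypothetical placeholder ($\exp(\sigma^2)$); the source's closed-form expressions under normality are not reproduced here, and the two regimes compared are invented numbers.

```python
import numpy as np

def computational_time(m, lam, M, sigma2, tau, inefficiency=np.exp):
    """CT(m, lambda, M) = m * lambda * M * IF(sigma2) / (2*tau - 1)^2.

    sigma2 is the variance of log|E_hat| and tau = P[E_hat >= 0].
    IF is passed in as a function of sigma2; exp(sigma2) is used here
    as a hypothetical placeholder, not the source's expression.
    """
    if tau <= 0.5:
        # The sign-corrected estimator's normalizer vanishes at tau = 1/2.
        return float("inf")
    return m * lam * M * inefficiency(sigma2) / (2.0 * tau - 1.0) ** 2

# Two hypothetical regimes: spending more work per block (larger M)
# lowers sigma2 and raises tau, and can reduce total computational time.
ct_cheap = computational_time(m=1, lam=50, M=100, sigma2=4.0, tau=0.7)
ct_tuned = computational_time(m=1, lam=50, M=300, sigma2=1.0, tau=0.9)
print(ct_cheap > ct_tuned)
```

The trade-off the criterion encodes is visible in the two terms: $m\lambda M$ is cost per iteration, while $\mathrm{IF}/(2\tau - 1)^2$ penalizes noisy, frequently negative estimators.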

5. Empirical Performance and Model Classes

Empirical results for Ising models and Kent distribution models with intractable normalizing constants show that correlated PMMH with block–Poisson estimators delivers a substantially higher effective sample size per second than non-blocked or Russian roulette–style signed estimators; for the Ising model, correlated BP PMMH achieved an ESS/sec of 0.5 versus 0.2 for Russian roulette, a 2.5-fold efficiency gain. For small or “nearly singular” parameter settings (e.g., small sample size, near-symmetry in the Kent model), Bayesian PMMH outperforms method-of-moments and saddlepoint-based MLE estimators in root mean squared error (Yang et al., 2022).

6. Theoretical Guarantees and Limitations

PMMH remains asymptotically exact for any doubly intractable model with an unbiased $Z$-estimator; no perfect sampling is required. The block–Poisson structure induces high correlation in the likelihood ratios and admits closed-form tuning rules. The main limitation is computational cost: each MH step requires multiple $Z$-estimates (via AIS or high-dimensional integrals). Negative estimates in the signed setting inflate variance, so tuning for a high $\Pr[\widehat E \ge 0]$ is essential. Open challenges include more efficient unbiased $Z$-estimators, adaptive block partitioning, and extensions to streaming/big-data settings (Yang et al., 2022).

7. Practical Recommendations and Impact

  • PMMH with block–Poisson and correlated block updates offers simulation-consistent inference for doubly intractable models, outperforming alternative methods in simulation efficiency under a broad range of challenging scenarios.
  • Practical adoption requires careful tuning to ensure stable (signed) estimators and to control the computational cost per iteration.
  • Future work will need to address estimator design for more complex intractable models and develop further adaptive and scalable extensions for real-world data at massive scale (Yang et al., 2022).
