
Pseudo-Marginal Metropolis–Hastings (PMMH)

Updated 3 February 2026
  • PMMH is a Markov chain Monte Carlo method that uses unbiased estimators to perform exact Bayesian inference on models with intractable likelihoods.
  • It employs techniques such as signed, block–Poisson, and correlated estimators to efficiently handle doubly intractable problems and reduce variance.
  • Practical implementations of PMMH require careful tuning of block sizes and estimator parameters, as demonstrated in applications like Ising and Kent models.

A pseudo-marginal Metropolis–Hastings (PMMH) algorithm is a Markov chain Monte Carlo (MCMC) method for exact inference in models with intractable likelihoods, where only unbiased non-negative estimators of the unnormalized target density are available. PMMH encompasses a class of algorithms, including standard, correlated, and signed estimators, for settings such as doubly intractable distributions. These algorithms have been studied extensively, both theoretically and in high-dimensional applications, notably latent variable models, doubly intractable posteriors, and @@@@1@@@@ (Yang et al., 2022).

1. General Structure and Exactness

Let $y$ denote the observed data, $\theta$ the parameter of interest, and $\pi(\theta)$ its prior. If $L(\theta) = p(y \mid \theta)$ is intractable but one can compute an unbiased estimator $\widehat L(\theta; u)$ using an auxiliary random vector $u \sim p(u)$ with $\mathbb{E}_{u}[\widehat L(\theta; u)] = L(\theta)$, define the extended target as

$$\widehat{\Pi}(\theta, u) \propto \widehat{L}(\theta; u)\, \pi(\theta)\, p(u)$$

The PMMH transition $(\theta, u) \to (\theta', u')$ accepts with probability

$$\alpha = \min\left\{1,\; \frac{\widehat{L}(\theta'; u')\,\pi(\theta')\, q(\theta \mid \theta')}{\widehat{L}(\theta; u)\,\pi(\theta)\, q(\theta' \mid \theta)}\right\}$$

The marginal chain in $\theta$ leaves the exact posterior $\pi(\theta \mid y) \propto L(\theta)\,\pi(\theta)$ invariant, by unbiasedness and standard MH theory (Yang et al., 2022).
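As a concrete illustration, the transition above can be sketched for a toy conjugate model in which the exact likelihood is known but the sampler is only given an artificially noisy, unbiased estimate of it. This is a minimal sketch: the model (a single observation $y \sim \mathcal{N}(\theta, 1)$ with a $\mathcal{N}(0,1)$ prior), the lognormal noise, and all numbers are invented for illustration, not taken from the source.

```python
import numpy as np

rng = np.random.default_rng(0)

Y = 1.5  # one observed datum (made up); exact posterior is N(0.75, 0.5)

def log_like(theta):
    """Exact log-likelihood of the toy model y ~ N(theta, 1)."""
    return -0.5 * (Y - theta) ** 2

def log_like_hat(theta, u, sigma=0.5):
    """Unbiased noisy estimate: L_hat = L * exp(sigma*u - sigma^2/2).
    Since E[exp(sigma*u - sigma^2/2)] = 1 for u ~ N(0,1), E_u[L_hat] = L."""
    return log_like(theta) + sigma * u - 0.5 * sigma**2

def log_prior(theta):
    return -0.5 * theta**2  # standard normal prior

def pmmh(n_iter=5000, step=1.0):
    theta, u = 0.0, rng.standard_normal()
    log_num = log_like_hat(theta, u) + log_prior(theta)
    chain = np.empty(n_iter)
    for i in range(n_iter):
        theta_p = theta + step * rng.standard_normal()  # symmetric proposal
        u_p = rng.standard_normal()                     # fresh auxiliary draw
        log_num_p = log_like_hat(theta_p, u_p) + log_prior(theta_p)
        # Key point: the *stored* noisy estimate log_num is reused on
        # rejection, never recomputed -- this is what makes PMMH exact.
        if np.log(rng.uniform()) < log_num_p - log_num:
            theta, log_num = theta_p, log_num_p
        chain[i] = theta
    return chain

chain = pmmh()
print(chain.mean())  # close to the exact posterior mean 0.75
```

Reusing the stored estimate on rejection is exactly the extended-target construction: the chain moves on $(\theta, u)$ jointly, and marginally targets $\pi(\theta \mid y)$ despite the noise.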

2. Signed and Block–Poisson Estimators for Doubly Intractable Models

For posteriors whose likelihood contains an intractable normalizing constant $Z(\theta)$, the doubly intractable posterior is

$$\pi(\theta \mid y) \propto \frac{f(y \mid \theta)\, \pi(\theta)}{Z(\theta)}$$
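To make the intractability concrete: for the Ising model (one of the paper's applications), $Z(\theta)$ is a sum over all spin configurations. The brute-force sketch below uses a hypothetical $2 \times 2$ lattice (not from the source) just to exhibit the $2^n$ cost that rules out direct evaluation at realistic sizes.

```python
import itertools
import numpy as np

def ising_log_unnorm(spins, theta):
    """Unnormalized log-density of a 2x2 Ising model with coupling theta
    (nearest-neighbour interactions, no wrap-around)."""
    s = np.array(spins).reshape(2, 2)
    interactions = (s[0, 0]*s[0, 1] + s[1, 0]*s[1, 1]   # horizontal pairs
                    + s[0, 0]*s[1, 0] + s[0, 1]*s[1, 1])  # vertical pairs
    return theta * interactions

def ising_Z(theta, n_spins=4):
    """Exact Z(theta) by enumerating all 2^n configurations -- feasible
    only for toy lattices, which is the source of double intractability."""
    return sum(np.exp(ising_log_unnorm(cfg, theta))
               for cfg in itertools.product([-1, 1], repeat=n_spins))

print(ising_Z(0.0))  # theta = 0: all 16 configurations have weight 1, so Z = 16
```

For a modest $100 \times 100$ lattice the same sum has $2^{10000}$ terms, so PMMH must work with unbiased estimates of quantities involving $Z(\theta)$ instead.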

A signed unbiased estimator for expressions such as $\exp(-\nu Z(\theta))$ is constructed using the block–Poisson (BP) scheme. Introduce $\nu \sim \mathrm{Exponential}(Z(\theta))$ and partition the estimator as

$$\widehat E(\theta, \nu \mid u) = \prod_{\ell=1}^{\lambda} \exp(a/\lambda + m) \prod_{h=1}^{\chi_\ell} \frac{\widehat B^{(h,\ell)}(\theta, \nu) - a}{m\lambda}$$

with $\chi_\ell \sim \mathrm{Poisson}(m)$ and each $\widehat B$ an unbiased estimator of $-\nu Z(\theta)$. Since $\widehat E$ can be negative, PMMH targets $|\widehat{\Pi}(\theta, u, \nu)|$, collecting the sign $s^{(i)} = \operatorname{sign}(\widehat E)$ at each iteration and using the importance-sampling corrected estimator

$$\widehat{\mathbb{E}}_{\pi}[\varphi] = \frac{\sum_{i=1}^N \varphi(\theta^{(i)})\, s^{(i)}}{\sum_{i=1}^N s^{(i)}} \to \mathbb{E}_\pi[\varphi(\theta)] \quad \text{as } N \to \infty$$

(Yang et al., 2022).
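The sign correction is applied mechanically once the chain's draws and signs have been stored. In the minimal sketch below the draws and sign frequencies are fabricated, standing in for an actual signed PMMH run with $\tau = \Pr[\widehat E \ge 0] \approx 0.95$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Fabricated stand-in for a signed PMMH run: theta^(i) are the chain's
# draws (targeting |Pi_hat|) and s^(i) = sign(E_hat) is recorded each
# iteration; about 5% of signs come out negative.
n = 200_000
theta = rng.normal(2.0, 1.0, size=n)
s = np.where(rng.uniform(size=n) < 0.95, 1.0, -1.0)

def signed_estimate(phi_vals, signs):
    """Sign-corrected importance estimate: sum(phi * s) / sum(s)."""
    return float(np.sum(phi_vals * signs) / np.sum(signs))

est = signed_estimate(theta, s)  # estimates E_pi[theta]
print(est)
```

The denominator $\sum_i s^{(i)}$ also makes the role of $\tau$ visible: as $\tau \to 1/2$ the normalizer approaches zero and the variance of the ratio blows up, which is why the tuning guidelines below push $\tau$ toward 1.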

3. Correlated PMMH and Blockwise Updates

Variance in the log-likelihood ratio degrades PMMH mixing as the dimension or the amount of data grows, motivating correlated PMMH approaches (Deligiannidis et al., 2015; Dahlin et al., 2015; Yang et al., 2022). By partitioning $u$ into $\lambda$ blocks and refreshing only one block per iteration, the correlation between the log estimators at the current and proposed states becomes

$$\operatorname{Corr}\!\left(\log|\widehat E(\theta, u)|,\; \log|\widehat E(\theta', u')|\right) \approx 1 - \lambda^{-1}$$

resulting in much lower variance of the acceptance ratio:

$$\alpha = \min\left\{1,\; \frac{|\widehat E(\theta', \nu' \mid u')|\, f(y \mid \theta')\, \pi(\theta')}{|\widehat E(\theta, \nu \mid u)|\, f(y \mid \theta)\, \pi(\theta)} \cdot \frac{q(\theta \mid \theta')}{q(\theta' \mid \theta)} \cdot \frac{\widehat Z_P(\theta)}{\widehat Z_P(\theta')} \cdot \frac{\exp(-\nu \widehat Z_P(\theta))}{\exp(-\nu' \widehat Z_P(\theta'))}\right\}$$

where $\widehat Z_P(\cdot)$ is a simple average of blockwise $Z$-estimates. The bias-correction terms guarantee unbiasedness for $Z(\theta)^{-1}$ even under correlated block updates (Yang et al., 2022).
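The $1 - \lambda^{-1}$ correlation from refreshing a single block can be checked empirically. The sketch below uses a simple stand-in for $\log|\widehat E|$, a standardized sum over all auxiliary variables; this linearity is an assumption for illustration, since real block–Poisson estimators are nonlinear in $u$.

```python
import numpy as np

rng = np.random.default_rng(2)

n_blocks = 50      # lambda: number of blocks of auxiliary variables
block_size = 20
n_rep = 5_000

def log_est(u):
    # Stand-in for log|E_hat(theta, u)|: a standardized sum over all
    # auxiliary variables (illustrative assumption, see lead-in).
    return u.sum() / np.sqrt(u.size)

cur = np.empty(n_rep)
prop = np.empty(n_rep)
for r in range(n_rep):
    u = rng.standard_normal((n_blocks, block_size))
    u_prop = u.copy()
    k = rng.integers(n_blocks)                    # pick one block ...
    u_prop[k] = rng.standard_normal(block_size)   # ... and refresh only it
    cur[r] = log_est(u)
    prop[r] = log_est(u_prop)

corr = float(np.corrcoef(cur, prop)[0, 1])
print(corr)  # approximately 1 - 1/lambda = 0.98
```

Because the two estimators share $\lambda - 1$ of $\lambda$ blocks, their covariance is $(\lambda - 1)/\lambda$ of the variance, giving the stated correlation exactly in this linear case.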

4. Tuning and Hyperparameter Selection

Optimal performance relies on tuning the BP estimator's block count $\lambda$, block size $m$, and per-block average size $M$ via a computational-time criterion:

$$\mathrm{CT}(m, \lambda, M) = m\,\lambda\,M\; \frac{\mathrm{IF}\big(\sigma^2_{\log|\widehat E|}(m, \lambda, M)\big)}{\big(2\tau(m, \lambda, M) - 1\big)^2}$$

where the inefficiency factor $\mathrm{IF}$ depends on the variance $\sigma^2_{\log|\widehat E|}$ and $\tau = \Pr[\widehat E \ge 0]$ is the probability that the estimate is positive. Closed-form expressions for $\sigma^2$ and $\tau$ are available under normality assumptions. Practical guidelines: $\lambda \approx 50$–$100$ for $\rho \approx 0.98$–$0.99$, $m = 1$, and $M \approx c\,\gamma_{\max}$ with $c = 0.001$–$0.005$, chosen so that $\sigma^2 \approx 1$ and $\tau \approx 0.9$ (Yang et al., 2022).
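The CT criterion can be wrapped in a small helper for comparing tuning regimes. Note that the inefficiency factor $\mathrm{IF}(\cdot)$ below is a hypothetical placeholder ($\exp(\sigma^2)$); the source's closed-form expressions under normality are not reproduced here, and the two regimes compared are invented numbers.

```python
import numpy as np

def computational_time(m, lam, M, sigma2, tau, inefficiency=np.exp):
    """CT(m, lambda, M) = m * lambda * M * IF(sigma2) / (2*tau - 1)^2.

    sigma2 is the variance of log|E_hat| and tau = P[E_hat >= 0].
    IF is passed in as a function of sigma2; exp(sigma2) is used here
    as a hypothetical placeholder, not the source's expression.
    """
    if tau <= 0.5:
        # The sign-corrected estimator's normalizer vanishes at tau = 1/2.
        return float("inf")
    return m * lam * M * inefficiency(sigma2) / (2.0 * tau - 1.0) ** 2

# Two hypothetical regimes: spending more work per block (larger M)
# lowers sigma2 and raises tau, and can reduce total computational time.
ct_cheap = computational_time(m=1, lam=50, M=100, sigma2=4.0, tau=0.7)
ct_tuned = computational_time(m=1, lam=50, M=300, sigma2=1.0, tau=0.9)
print(ct_cheap > ct_tuned)
```

The trade-off the criterion encodes is visible in the two terms: $m\lambda M$ is cost per iteration, while $\mathrm{IF}/(2\tau - 1)^2$ penalizes noisy, frequently negative estimators.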

5. Empirical Performance and Model Classes

Empirical results for Ising models and Kent distribution models with intractable normalizing constants show that correlated PMMH with block–Poisson estimators delivers a substantially higher effective sample size per second than non-blocked or Russian roulette–style signed estimators; for the Ising model, correlated BP PMMH achieved an ESS/sec of 0.5 versus 0.2 for Russian roulette, a 2.5-fold efficiency gain. For small or “nearly singular” parameter settings (e.g., small sample size, near-symmetry in the Kent model), Bayesian PMMH outperforms method-of-moments and saddlepoint-based MLE estimators in root mean squared error (Yang et al., 2022).

6. Theoretical Guarantees and Limitations

PMMH remains asymptotically exact for any doubly intractable model with an unbiased $Z$-estimator; no perfect sampling is required. The block–Poisson structure induces high correlation in the likelihood ratios and admits closed-form tuning rules. The main limitation is computational cost: each MH step requires multiple $Z$-estimates (via AIS or high-dimensional integrals). Negative estimates in the signed setting inflate variance, so tuning for a high $\Pr[\widehat E \ge 0]$ is essential. Open challenges include more efficient unbiased $Z$-estimators, adaptive block partitioning, and extensions to streaming/big-data settings (Yang et al., 2022).

7. Practical Recommendations and Impact

  • PMMH with block–Poisson and correlated block updates offers simulation-consistent inference for doubly intractable models, outperforming alternative methods in simulation efficiency under a broad range of challenging scenarios.
  • Practical adoption requires careful tuning to ensure stable (signed) estimators and to control the computational cost per iteration.
  • Future work will need to address estimator design for more complex intractable models and develop further adaptive and scalable extensions for real-world data at massive scale (Yang et al., 2022).
