
Acceptance-Rejection Reparameterization

Updated 3 February 2026
  • Acceptance-Rejection Reparameterization is a method that makes non-differentiable accept/reject decisions differentiable, enabling gradient-based optimization in probabilistic models.
  • In variational inference, pathwise reparameterization and analytic differentiation through the acceptance function yield low-variance gradient estimators; in MCMC, a non-reversible treatment of the accept/reject variable improves sampling efficiency.
  • Practical instantiations include Gamma and Dirichlet variational families defined via rejection samplers, while tuning the acceptance threshold in VRS/RVRS trades approximation fidelity against computational cost.

Acceptance-rejection reparameterization refers to a class of techniques that enable gradient-based learning and efficient sampling in the presence of acceptance-rejection mechanisms, particularly in variational inference (VI) and Markov Chain Monte Carlo (MCMC). These methods reparameterize the stochastic non-differentiable accept/reject decisions to provide differentiability or improve sampler efficiency, thus extending the toolkit for probabilistic modeling where standard reparameterization or straightforward MCMC falls short. This article covers foundational constructions, algorithmic elaboration, variance reduction properties, and representative applications, referencing key developments in variational inference and MCMC.

1. Acceptance-Rejection in Variational Inference

Classic variational inference methods use parametric variational families $q_\lambda(z)$ that are often limited in expressivity. Variational rejection sampling (VRS) (Jankowiak et al., 2023) expands this space by defining the variational density through a smoothed acceptance-rejection process. The VRS density is given by

$r_\lambda(z) = \frac{q_\lambda(z)\, a_\lambda(z)}{Z_\lambda}$

where $a_\lambda(z) = \sigma\left(\log p(x,z) - \log q_\lambda(z) + T\right)$, $\sigma(\cdot)$ is the logistic function, $T$ is a scalar threshold, and $Z_\lambda = \mathbb{E}_{q_\lambda}[a_\lambda(z)]$ is the normalization constant. Samples are proposed from $q_\lambda(z)$ and accepted with probability $a_\lambda(z)$. This construction yields a continuous, nonparametric variational family that reflects both the proposal and the target density, allowing controlled interpolation between variational and exact inference by tuning $T$.
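
To make the construction concrete, the following Python sketch draws samples from $r_\lambda(z)$ by proposing from a one-dimensional Gaussian $q_\lambda$ and applying the smoothed accept/reject test. The Gaussian proposal, the toy log-joint, and all function names are illustrative assumptions, not taken from the cited implementations.

```python
import numpy as np

def log_q(z, mu, sigma):
    # log density of the Gaussian proposal q_lambda(z) = N(mu, sigma^2)
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

def log_joint(z):
    # illustrative unnormalized log p(x, z): two Gaussian bumps
    return np.logaddexp(log_q(z, -1.0, 0.5), log_q(z, 2.0, 1.0))

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def sample_vrs(mu, sigma, T, rng, max_tries=10_000):
    """Draw one sample from r_lambda(z) ∝ q_lambda(z) a_lambda(z) by proposing
    from q_lambda and accepting with probability
    a_lambda(z) = sigmoid(log p(x,z) - log q_lambda(z) + T)."""
    for _ in range(max_tries):
        eps = rng.standard_normal()
        z = mu + sigma * eps                      # reparameterized proposal
        a = sigmoid(log_joint(z) - log_q(z, mu, sigma) + T)
        if rng.uniform() < a:                     # smoothed accept/reject step
            return z
    raise RuntimeError("acceptance rate too low; consider increasing T")

rng = np.random.default_rng(0)
samples = np.array([sample_vrs(mu=0.0, sigma=2.0, T=0.0, rng=rng) for _ in range(2000)])
print(samples.mean(), samples.std())
```

Lowering $T$ sharpens $a_\lambda$ and pushes the accepted samples closer to the target, at the cost of more proposals per accepted draw.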

2. Pathwise Reparameterization for Low-Variance Gradients

A major challenge in optimizing objectives involving acceptance-rejection is the high variance of naïve score-function (REINFORCE) estimators. VRS originally relied on the covariance-based estimator

$\nabla_{\lambda} \mathcal{L} = \mathrm{Cov}_{r_\lambda(z)}\left[A(z),\ \nabla_\lambda \log\{q_\lambda(z)\, a_\lambda(z)\}\right]$

with $A(z) = \log p(x,z) - \log q_\lambda(z) - \log Z_\lambda$. The acceptance-rejection reparameterization, introduced in Reparameterized VRS (RVRS) (Jankowiak et al., 2023) and RS-VI (Naesseth et al., 2016), leverages the existence of a deterministic, differentiable function $z = g_\lambda(\varepsilon)$ with $\varepsilon \sim p(\varepsilon)$ to obtain pathwise gradients. By marginalizing over the accept/reject randomness and analytically differentiating through the smoothed or marginalized acceptance function, one obtains low-variance unbiased estimators suitable for black-box variational inference.

Key RVRS gradient (Proposition 1, Eq. 9 in (Jankowiak et al., 2023)):

$\nabla_\lambda \mathcal{L} = \mathbb{E}_{r_\lambda(z)} \left[ \left( 2\, a_\lambda(z)\, \frac{\partial a_\lambda(z)}{\partial z} + a_\lambda(z)\, \frac{\partial A(z)}{\partial z} \right) \nabla_\lambda z \right]$

This estimator combines the pathwise (reparameterization) component with analytic derivatives of the acceptance function, leading to practical implementation via automatic differentiation and variance reduction by an order of magnitude or more.
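
As a simplified illustration of the pathwise structure, the sketch below transcribes the estimator above for a one-dimensional Gaussian proposal $q_\lambda = \mathcal{N}(\mu, \sigma^2)$ and estimates only the $\mu$-component of the gradient, for which $\nabla_\mu z = 1$ under $z = \mu + \sigma\varepsilon$. The toy target, the hand-written derivatives, and the function names are assumptions for demonstration; a practical implementation would instead push these derivatives through automatic differentiation, with gradient centering, as described above.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Toy unnormalized target log p(x, z): Gaussian centred at 1 (illustrative only)
log_p  = lambda z: -0.5 * (z - 1.0) ** 2
dlog_p = lambda z: -(z - 1.0)

def rvrs_grad_mu(mu, sigma, T, n_samples, rng):
    """Monte Carlo estimate of the mu-component of the RVRS-style gradient
    E_{r_lambda}[(2 a da/dz + a dA/dz) dz/dmu], with z = mu + sigma * eps so
    that dz/dmu = 1.  Note dA/dz = d(log p - log q)/dz, since log Z_lambda
    does not depend on z."""
    log_q  = lambda z: -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma)
    dlog_q = lambda z: -(z - mu) / sigma ** 2

    grads = []
    while len(grads) < n_samples:
        z = mu + sigma * rng.standard_normal()            # pathwise proposal
        a = sigmoid(log_p(z) - log_q(z) + T)
        if rng.uniform() >= a:                            # rejected: propose again
            continue
        dlogratio_dz = dlog_p(z) - dlog_q(z)
        da_dz = a * (1.0 - a) * dlogratio_dz              # derivative of the logistic
        grads.append(2.0 * a * da_dz + a * dlogratio_dz)  # times dz/dmu = 1
    return np.mean(grads)

rng = np.random.default_rng(0)
print(rvrs_grad_mu(mu=0.0, sigma=1.5, T=0.0, n_samples=2000, rng=rng))
```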

3. Reparameterization through Acceptance-Rejection Samplers

Many variational distributions, such as Gamma or Dirichlet, are intrinsically tied to acceptance-rejection samplers. Traditional reparameterization tricks fail due to the non-differentiable accept/reject logic. RS-VI (Naesseth et al., 2016) circumvents this by marginalizing out the accept/reject variable, thereby defining a smooth density $\pi(\varepsilon;\theta)$ for the auxiliary variable $\varepsilon$. The gradient of the ELBO with respect to the variational parameters $\theta$ decomposes as:

$\nabla_\theta \mathrm{ELBO}(\theta) = \mathbb{E}_\pi\left[\nabla_\theta f(h(\varepsilon;\theta))\right] + \mathbb{E}_\pi\left[f(h(\varepsilon;\theta))\, \nabla_\theta \log\left(q(h;\theta)/r(h;\theta)\right)\right] + \nabla_\theta H[q]$

where $h$ is the proposal-to-sample transform, $r$ is the proposal density, $q$ is the target (variational) density, and $H[q]$ is the entropy term. This approach has been instantiated for a wide array of common distributions, with closed-form derivatives for the correction term in the Gamma and Dirichlet settings.
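
For the Gamma case, the transform $h(\varepsilon;\theta)$ underlying the standard rejection sampler is the Marsaglia–Tsang map $h(\varepsilon;\alpha) = (\alpha - 1/3)\left(1 + \varepsilon/\sqrt{9\alpha - 3}\right)^3$, which is differentiable in the shape $\alpha$. The sketch below estimates only the first (pathwise) term of the decomposition above for $f(z) = \log z$, with $\partial h/\partial\alpha$ written out by hand; the correction and entropy terms are omitted, and the names and test statistic are illustrative assumptions.

```python
import numpy as np

def sample_gamma_eps(alpha, rng):
    """Marsaglia–Tsang rejection sampler for Gamma(alpha, 1), alpha >= 1.
    Returns the accepted standard-normal epsilon and the sample z = h(eps; alpha)."""
    d = alpha - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    while True:
        eps = rng.standard_normal()
        v = (1.0 + c * eps) ** 3
        if v <= 0.0:
            continue
        if np.log(rng.uniform()) < 0.5 * eps ** 2 + d - d * v + d * np.log(v):
            return eps, d * v

def dh_dalpha(eps, alpha):
    """Derivative of h(eps; alpha) = (alpha - 1/3)(1 + eps/sqrt(9 alpha - 3))^3
    with respect to alpha, written out by hand."""
    d = alpha - 1.0 / 3.0
    s = np.sqrt(9.0 * alpha - 3.0)
    u = 1.0 + eps / s
    du_dalpha = -4.5 * eps / s ** 3      # d/dalpha of eps * (9 alpha - 3)^(-1/2)
    return u ** 3 + 3.0 * d * u ** 2 * du_dalpha

# Pathwise term of d/dalpha E[log z] for z ~ Gamma(alpha, 1):
# average f'(h) * dh/dalpha with eps drawn from the accepted distribution.
rng = np.random.default_rng(0)
alpha = 2.5
terms = []
for _ in range(5000):
    eps, z = sample_gamma_eps(alpha, rng)
    terms.append((1.0 / z) * dh_dalpha(eps, alpha))
# Full gradient d/dalpha E[log z] = digamma(alpha) ≈ 0.7032 for alpha = 2.5;
# the pathwise term alone should be close, since the omitted correction term
# is reported to be small for this sampler.
print(np.mean(terms))
```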

4. Variance, Computational Cost, and Tradeoffs

Both RVRS (Jankowiak et al., 2023) and RS-VI (Naesseth et al., 2016) demonstrate significant variance reduction in gradient estimation compared to score-function estimators and previous reparameterization-based approaches. Empirical results confirm that RVRS can reduce variance by an order of magnitude or more, especially as dimensionality increases. The expected computational cost per accepted sample is inversely proportional to the acceptance rate $Z_\lambda$. As $T$ is lowered (in VRS/RVRS), $a_\lambda(z)$ sculpts $r_\lambda(z)$ closer to $p(z \mid x)$, improving ELBO tightness but decreasing $Z_\lambda$ and increasing per-sample cost. Increasing $T$ (so that $a_\lambda(z) \to 1$) recovers standard variational inference with lower computational cost but a coarser approximation.

Method          | Variance Reduction              | Computational Cost per Sample
Score-function  | High variance                   | Moderate (no accept/reject step)
RVRS / RS-VI    | 1–2 orders of magnitude lower   | $\sim 1/Z_\lambda$ proposals per accepted sample
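
The cost side of this tradeoff is straightforward to probe numerically: estimate the acceptance rate $Z_\lambda = \mathbb{E}_{q_\lambda}[a_\lambda(z)]$ by Monte Carlo over a grid of thresholds and report the expected number of proposals per accepted sample, $1/Z_\lambda$. The snippet below does this for an illustrative Gaussian proposal and toy target (both assumptions chosen for demonstration).

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Illustrative toy model: Gaussian proposal q_lambda = N(0, 2^2),
# unnormalized target log p(x, z) = -0.5 (z - 1)^2
mu, sigma = 0.0, 2.0
log_q = lambda z: -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)
log_p = lambda z: -0.5 * (z - 1.0) ** 2

rng = np.random.default_rng(0)
z = mu + sigma * rng.standard_normal(100_000)

for T in [-4.0, -2.0, 0.0, 2.0, 4.0]:
    Z = np.mean(sigmoid(log_p(z) - log_q(z) + T))   # acceptance rate Z_lambda
    print(f"T = {T:+.1f}   Z_lambda ≈ {Z:.3f}   proposals per accept ≈ {1.0 / Z:.1f}")
```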

5. Acceptance-Rejection Reparameterization in MCMC

Acceptance-rejection reparameterization has also been leveraged for Markov Chain Monte Carlo. Neal (Neal, 2020) analyzes the standard Metropolis-Hastings (MH) accept/reject step, in which a Uniform(0,1) random variable $u$ determines acceptance. By augmenting the Markov state to include $u$ and updating $u$ non-reversibly rather than resampling it each iteration, one obtains a non-reversible chain preserving the target marginal. A deterministic update of $u$ by translation with wrap-around, or by persistent additive noise, can reduce random walk behavior and improve sampling efficiency, especially for algorithms with persistent momentum or in mixed discrete-continuous models.
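
The sketch below renders this idea for a random-walk Metropolis chain on a standard Gaussian target: the uniform $u$ is kept in the state, mapped back to Uniform(0,1) after each accept/reject decision so that the joint product distribution remains invariant, and then shifted deterministically with wrap-around. This is a schematic illustration of the mechanism described above, not a line-by-line reproduction of the update rule in (Neal, 2020); the target, step size, shift $\delta$, and the particular renormalization are assumptions chosen for clarity.

```python
import numpy as np

def log_target(x):
    # illustrative target: standard Gaussian
    return -0.5 * x ** 2

def persistent_u_metropolis(n_iter, step=0.5, delta=0.1, rng=None):
    """Random-walk Metropolis in which the uniform accept/reject variable u is
    part of the state.  After each decision u is mapped back to Uniform(0,1)
    (u/a on accept, (u-a)/(1-a) on reject) and then shifted deterministically
    by delta with wrap-around, giving a non-reversible update of u."""
    rng = rng or np.random.default_rng(0)
    x = 0.0
    u = rng.uniform()
    xs = np.empty(n_iter)
    for t in range(n_iter):
        x_prop = x + step * rng.standard_normal()
        a = min(1.0, np.exp(log_target(x_prop) - log_target(x)))
        if u < a:                       # accept using the persistent uniform
            x = x_prop
            u = u / a                   # maps Uniform(0, a) back to Uniform(0, 1)
        else:
            u = (u - a) / (1.0 - a)     # maps Uniform(a, 1) back to Uniform(0, 1)
        u = (u + delta) % 1.0           # non-reversible translation with wrap-around
        xs[t] = x
    return xs

xs = persistent_u_metropolis(50_000)
print(xs.mean(), xs.var())              # should be close to 0 and 1
```

Because $u$ changes only slowly between iterations, acceptances and rejections arrive in correlated runs, which is what suppresses random-walk behavior when combined with persistent momentum.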

Empirical results indicate that for persistent Langevin and hybrid Gibbs/Langevin settings, this approach achieves up to a factor of two in sampling efficiency over HMC and reduces autocorrelation times for statistics of interest. Improvements for basic random-walk Metropolis in high dimension are more modest (10–20%).

6. Practical Implementation and Applications

For variational inference, RVRS and RS-VI directly enable black-box variational inference for continuous latent variable models and variational families defined via rejection samplers. Implementation involves drawing samples via the reparameterized proposal, applying the acceptance probability or marginalized density, and employing automatic differentiation through the acceptance-rejection logic, with gradient centering for variance reduction. Notable practical instantiations include Gamma and Dirichlet variational families and shape-augmentation tricks.
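
One shape-augmentation trick mentioned above can be made concrete: if $\tilde z \sim \mathrm{Gamma}(\alpha + 1, 1)$ and $u \sim \mathrm{Uniform}(0,1)$ are independent, then $\tilde z\, u^{1/\alpha} \sim \mathrm{Gamma}(\alpha, 1)$, so a rejection-sampler-based reparameterization can operate at the larger shape $\alpha + 1$, where acceptance is high, even when $\alpha < 1$. The snippet below checks this identity empirically; it is a standalone illustration rather than an excerpt from the cited implementations.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.3                 # small shape: direct rejection sampling is inefficient

# Shape augmentation: if z_tilde ~ Gamma(alpha + 1, 1) and u ~ Uniform(0, 1),
# then z = z_tilde * u**(1/alpha) ~ Gamma(alpha, 1).
z_tilde = rng.gamma(alpha + 1.0, 1.0, size=200_000)
u = rng.uniform(size=200_000)
z = z_tilde * u ** (1.0 / alpha)

print("augmented mean/var:", z.mean(), z.var())   # Gamma(alpha, 1): mean = var = alpha
print("reference mean/var:", alpha, alpha)
```

In the RS-VI setting, $\tilde z$ would itself be drawn through the differentiable Marsaglia–Tsang transform, and $u$ provides an additional pathwise-reparameterizable factor.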

For MCMC, integrating the non-reversibly updated auxiliary variable into Metropolis or Langevin chains provides improved performance, notably when interleaving other updates (e.g., Gibbs for discrete variables) is important or when persistent momentum is used. The method is broadly applicable to any MH step and is compatible with external Gibbs or HMC moves.

7. Significance, Limitations, and Future Directions

Acceptance-rejection reparameterization techniques have expanded the reach of reparameterization-based stochastic optimization and improved MCMC mixing in various settings. The approach allows practitioners to employ flexible, nonparametric posterior approximations in VI and to optimize or sample efficiently from models otherwise recalcitrant to gradient methods. The main tradeoff is between the fidelity of the approximation and computational cost, controlled via acceptance rate. For MCMC, the non-reversible augmentation yields more efficient exploration for complex or hybrid models, though the gains for standard high-dimensional Metropolis–Hastings are relatively modest.

The methodology remains under active development, with ongoing extensions to broader variational families, more sophisticated samplers, and integration in automatic inference frameworks. The geometric interpretation of smoothed rejection and the modularity of the gradient estimation procedure position acceptance-rejection reparameterization as a foundational tool for scalable Bayesian inference (Jankowiak et al., 2023, Naesseth et al., 2016, Neal, 2020).
