Rejection Sampling: Methods & Applications
- Rejection sampling is a Monte Carlo technique that converts draws from a tractable proposal distribution into exact samples from a target pdf.
- It employs an envelope condition (f(x) ≤ Cg(x)) and has evolved with adaptive variants like ARS and CARS to improve efficiency in Bayesian and generative modeling.
- Extensions into quantum algorithms, diffusion models, and combinatorial methods showcase its versatility in addressing diverse computational challenges.
Rejection sampling is a foundational Monte Carlo method for generating independent, exactly distributed samples from a target probability density function (pdf) using a proposal (or envelope) distribution. Introduced formally by von Neumann in 1951, the procedure is widely used as a basic building block in Bayesian inference, rare event simulation, generative modeling, state-space estimation, and in quantum and classical algorithms across computational sciences.
1. Fundamental Principles and Classical Methodology
Rejection sampling aims to transform the ability to sample from a tractable proposal distribution g(x) into the ability to sample from a generally intractable target distribution f(x). Given f and g satisfying f(x) ≤ Cg(x) for some constant C ≥ 1 and all x in the support, the procedure is as follows:
- Draw x ~ g and u ~ Uniform(0, 1).
- Accept x if u ≤ f(x)/(Cg(x)); otherwise, reject and try again.
Each accepted x is distributed according to f (assuming C satisfies the envelope condition). Tightness of the envelope, that is, how closely Cg(x) bounds f(x), controls the acceptance rate and thereby the efficiency. For unnormalized f and g, the acceptance probability is Z_f/(C Z_g), where Z_f and Z_g are the normalizing constants of f and g, respectively.
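A minimal sketch of the accept–reject loop (the half-normal target under an Exp(1) envelope is a standard textbook pairing; the helper names are illustrative):

```python
import numpy as np

def rejection_sample(target_pdf, proposal_pdf, proposal_rvs, C, n, rng=None):
    """Draw n exact samples from target_pdf, given the envelope
    target_pdf(x) <= C * proposal_pdf(x) on the whole support."""
    rng = rng if rng is not None else np.random.default_rng()
    out = []
    while len(out) < n:
        x = proposal_rvs(rng)              # x ~ g
        u = rng.uniform()                  # u ~ Uniform(0, 1)
        if u <= target_pdf(x) / (C * proposal_pdf(x)):
            out.append(x)                  # accepted draws are i.i.d. from f
    return np.array(out)

# Example: half-normal target, Exp(1) proposal.  The ratio
# f(x)/g(x) = sqrt(2/pi) * exp(x - x^2/2) is maximized at x = 1,
# so C = sqrt(2e/pi) ~ 1.32 and the acceptance rate is 1/C ~ 0.76.
f = lambda x: np.sqrt(2 / np.pi) * np.exp(-0.5 * x**2)
g = lambda x: np.exp(-x)
C = np.sqrt(2 * np.e / np.pi)
samples = rejection_sample(f, g, lambda r: r.exponential(1.0), C, 10_000)
```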
Rejection sampling serves as a prototype for perfect simulation and underlies a vast array of accept–reject–correct algorithms in statistics, physics, and machine learning.
2. Advances in Adaptive and Parsimonious Rejection Sampling
Adaptive rejection sampling (ARS) and its variants dynamically refine the proposal based on previously rejected samples, allowing the proposal to approach the target ever more tightly. Classic ARS constructs a piecewise-exponential envelope via tangent lines to the log-density of a log-concave target pdf, achieving acceptance rates close to unity but at a computational cost that increases with each added support point (1509.07985). To address scalability, Cheap Adaptive Rejection Sampling (CARS) fixes the number of support points, employing a swap mechanism to maintain a valid envelope while ensuring constant per-sample cost, which is particularly useful for large-scale simulation (1509.07985). Parsimonious ARS (PARS) further refines this by introducing a selective update policy, adding points only when the current proposal significantly underestimates the target, thus holding down envelope complexity without sacrificing acceptance efficiency (1710.04948).
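SciPy's UNU.RAN bindings expose a sampler in this family; a minimal sketch, assuming SciPy ≥ 1.8 (setting c=0 selects the logarithm transform, recovering the classical piecewise-exponential ARS envelope):

```python
import numpy as np
from scipy.stats.sampling import TransformedDensityRejection

class StdNormal:
    # TDR only needs the (possibly unnormalized) pdf and its derivative.
    def pdf(self, x):
        return np.exp(-0.5 * x * x)
    def dpdf(self, x):
        return -x * np.exp(-0.5 * x * x)

rng = np.random.default_rng(0)
# c=0 selects the log transform, i.e. the classical tangent-line
# (piecewise-exponential) envelope of Gilks and Wild.
sampler = TransformedDensityRejection(StdNormal(), c=0.0, random_state=rng)
samples = sampler.rvs(100_000)
```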
Beyond log-concave targets, adaptive schemes using "reduced potentials" or the adaptive ratio-of-uniforms method can accommodate multimodal or log-convex-tailed distributions by partitioning the domain and adaptively improving local proposals (1111.4942). These methods are especially effective in Bayesian estimation and filtering settings where the tails or the modal structure of the posterior are challenging; a minimal non-adaptive ratio-of-uniforms sketch appears after the table below.
Method | Applicability | Computational Cost Control
---|---|---
ARS | Univariate, log-concave | Complexity grows with rejections
CARS | Univariate, log-concave | Constant (user-chosen nodes)
Parsimonious ARS | Log-concave, scalable | Threshold-based complexity
Adaptive RoU | Log-convex, multimodal | Iterative geometric refinement
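The geometric primitive that adaptive ratio-of-uniforms schemes refine can be sketched directly; a minimal, non-adaptive version (bounding-box constants for the standard normal are worked out in the comments):

```python
import numpy as np

def rou_sample(f, umax, vmin, vmax, n, rng=None):
    """Ratio of uniforms: if (u, v) is uniform on the region
    {(u, v): 0 < u <= sqrt(f(v/u))}, then x = v/u has density prop. to f.
    Here the region is enclosed in the box [0, umax] x [vmin, vmax]."""
    rng = rng if rng is not None else np.random.default_rng()
    out = []
    while len(out) < n:
        u = rng.uniform(0.0, umax)
        if u == 0.0:                      # guard the measure-zero edge case
            continue
        v = rng.uniform(vmin, vmax)
        x = v / u
        if u * u <= f(x):                 # equivalent to u <= sqrt(f(x))
            out.append(x)
    return np.array(out)

# Standard normal, unnormalized: f(x) = exp(-x^2/2).
# umax = sup sqrt(f) = 1;  |v| <= sup |x|*sqrt(f(x)) = sqrt(2/e).
f = lambda x: np.exp(-0.5 * x * x)
b = np.sqrt(2.0 / np.e)
samples = rou_sample(f, 1.0, -b, b, 10_000)
```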
3. Rejection Sampling in Modern Generative and Bayesian Inference
In generative models and Bayesian inference, rejection sampling provides mechanisms for improving sample quality, enabling resource-efficient post-processing, and achieving exact (or near-exact) marginal draws.
In modern generative adversarial networks (GANs), Discriminator Rejection Sampling (DRS) post-processes generator outputs using the trained discriminator to approximate the density ratio p_data(x)/p_g(x), accepting samples with probability proportional to this ratio normalized by its maximum. This approach has markedly improved sample fidelity and diversity, as evidenced by substantial improvements in Inception Score and Fréchet Inception Distance on large-scale image datasets (1810.06758). The Optimal Budgeted Rejection Sampling (OBRS) framework generalizes this concept by optimally choosing acceptance probabilities to minimize any f-divergence under a fixed sample budget, with theoretical universal optimality for all f-divergences, including Rényi divergences (2311.00460). Importantly, OBRS can be integrated into end-to-end model training, guiding generators to be mass-covering over the regions likely to be accepted post-rejection.
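A stripped-down sketch of the DRS acceptance rule (the paper's rule adds a tunable shift γ and numerical-stability terms omitted here; the batch interface is illustrative):

```python
import numpy as np

def drs_filter(logits, logit_max, rng=None):
    """Simplified DRS: for a near-optimal discriminator,
    p_data(x) / p_g(x) ~ exp(logit(x)), so each generated sample is
    accepted with probability exp(logit - logit_max) <= 1, where
    logit_max is estimated on a pilot batch."""
    rng = rng if rng is not None else np.random.default_rng()
    accept_prob = np.exp(logits - logit_max)   # ratio / max ratio
    return rng.uniform(size=logits.shape) < accept_prob

# Usage: score generator outputs with the trained discriminator, then
# keep samples[drs_filter(logits, logit_max)] as the filtered batch.
```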
For Bayesian inference, especially in scenarios with streaming data or embedded resource constraints, rejection filtering combines acceptance–rejection and moment tracking to reduce both memory and computational costs (1511.06458). By updating only summary statistics of the posterior, it enables near-real-time adaptation and is applicable in operational settings such as real-time object tracking or adaptive experiment design.
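A minimal sketch of one rejection-filtering update with a Gaussian posterior summary (scalar case; `likelihood` is assumed vectorized with a known bound k ≥ sup likelihood, and the interface is illustrative):

```python
import numpy as np

def rejection_filter_update(mu, sigma, likelihood, k, m, rng=None):
    """One streaming update: draw m candidates from the current Gaussian
    summary N(mu, sigma^2), accept each with probability likelihood(x)/k,
    and refresh (mu, sigma) from the accepted draws.  Only the two summary
    statistics are stored between data items."""
    rng = rng if rng is not None else np.random.default_rng()
    x = rng.normal(mu, sigma, size=m)
    keep = x[rng.uniform(size=m) < likelihood(x) / k]
    if keep.size >= 2:              # keep the old summary if too few survive
        mu, sigma = keep.mean(), keep.std()
    return mu, sigma
```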
Multilevel rejection sampling (MLMC-ABC) accelerates classic Approximate Bayesian Computation by employing a telescoping sum over successively finer acceptance thresholds, using optimal allocation of computational effort across levels to guarantee i.i.d. sampling properties while controlling variance (1702.03126).
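The level-zero building block that the telescoping sum corrects is plain ABC rejection (a minimal sketch; prior_rvs, simulate, and distance are illustrative placeholders):

```python
import numpy as np

def abc_rejection(prior_rvs, simulate, distance, y_obs, eps, n, rng=None):
    """Plain ABC rejection at a single threshold eps: MLMC-ABC telescopes
    this estimator over a decreasing sequence eps_0 > eps_1 > ... ."""
    rng = rng if rng is not None else np.random.default_rng()
    accepted = []
    while len(accepted) < n:
        theta = prior_rvs(rng)                 # draw from the prior
        y_sim = simulate(theta, rng)           # forward-simulate data
        if distance(y_sim, y_obs) <= eps:      # accept if close enough
            accepted.append(theta)
    return np.array(accepted)
```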
Rejection-based methods are also key in high-dimensional posterior state simulation. Ensemble Rejection Sampling (ERS) circumvents the exponential decay of acceptance rates in long state sequences by crafting ensemble-based proposals and forward-backward dynamic programming, with expected cost scaling only cubically in sequence length under regularity conditions (2001.09188).
4. Extensions and Specializations: Quantum, Diffusion, and Autodifferentiable Rejection Sampling
The principle of rejection sampling extends well beyond the classical setting.
Quantum Rejection Sampling
Quantum rejection sampling adapts the concept to transform an initial superposition into a target via amplitude amplification, with the cost (query complexity) governed by a semidefinite program involving a “water-filling” vector of amplitudes (1103.2774). This primitive appears in algorithms for linear systems (HHL), quantum Metropolis sampling, and hidden shift problems, underlining its centrality in constructing quantum algorithms.
Diffusion Models
Diffusion Rejection Sampling (DiffRS) improves sample quality in modern diffusion-based generative models by embedding rejection checks at each reverse transition step, using discriminator networks to estimate likelihood ratios and adaptively refine or resample at each timestep (2405.17880). This per-timestep correction aligns transition paths more tightly with the true data-generating process, yielding empirically state-of-the-art performance on large-scale benchmarks.
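Schematically, the per-timestep correction can be sketched as follows (illustrative interface only, not the paper's exact algorithm; accept_prob stands in for the discriminator-based likelihood-ratio estimate):

```python
def diffrs_generate(denoise_step, accept_prob, sample_prior, T, rng):
    """Schematic per-timestep rejection in a reverse diffusion chain:
    each proposed transition x_t -> x_{t-1} is kept only with a
    discriminator-estimated probability, otherwise it is re-proposed."""
    x_t = sample_prior(rng)
    for t in reversed(range(1, T + 1)):
        while True:
            x_prev = denoise_step(x_t, t, rng)        # propose a transition
            if rng.random() < accept_prob(x_prev, x_t, t):
                break                                 # transition accepted
        x_t = x_prev
    return x_t
```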
Autodifferentiable Rejection Sampling
Rejection Sampling with Autodifferentiation (RSA) enables differentiable parameter inference in simulation-based models, smoothing out binary accept–reject decisions with gradient-propagatable weights based on likelihood ratios between base and alternate parameterizations (2411.02194). This facilitates gradient-based model tuning without the need for repeated re-simulation, and enables the integration of complex, ML-derived observables as part of the loss function in parameter fitting workflows, exemplified in hadronization model optimization.
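The differentiable core is likelihood-ratio reweighting of a fixed simulated batch (a toy sketch on an exponential model; RSA applies the same ratio trick to the accept–reject branches inside the simulator rather than to the marginal density as here):

```python
import numpy as np

def rsa_weights(x, theta_base, theta_alt, log_pdf):
    """Weights w_i = p(x_i; theta_alt) / p(x_i; theta_base) turn a batch
    simulated once at theta_base into a surrogate for theta_alt; gradients
    with respect to theta_alt flow through w, not through the simulator."""
    return np.exp(log_pdf(x, theta_alt) - log_pdf(x, theta_base))

# Toy model: log p(x; theta) = log(theta) - theta * x (exponential rate).
log_pdf = lambda x, th: np.log(th) - th * x
rng = np.random.default_rng(1)
x = rng.exponential(scale=1 / 2.0, size=100_000)   # simulate once at theta = 2
w = rsa_weights(x, 2.0, 2.5, log_pdf)
est = np.average(x, weights=w)                      # ~ 1/2.5, no re-simulation
```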
5. Specialized Proposals, Efficient Construction, and Theoretical Results
Beyond envelope selection, efficient rejection sampling often requires careful proposal construction or domain partitioning:
- The Vertical Weighted Strips (VWS) framework handles targets of the form f(x) ∝ w(x)g(x), with w a weight function and g a tractable base density, by partitioning the support and majorizing w(x) in each region. This readily yields a finite mixture proposal amenable to inverse-CDF sampling, with explicit upper bounds on the rejection probability that guide region refinement (2401.09696); a simplified rendering is sketched after this list.
- The Greedy Poisson Rejection Sampler (GPRS) achieves optimal runtime for one-dimensional cases where the likelihood ratio is unimodal, translating the accept–reject criterion into a greedy search over a Poisson process (2305.15313).
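A simplified, piecewise-constant rendering of the VWS idea referenced above (the actual framework majorizes the weight function on each strip; names and the monotone-tail example are illustrative):

```python
import numpy as np

def strip_envelope_sample(f, edges, bounds, n, rng=None):
    """Rejection with a piecewise-constant majorizer: on each strip
    [edges[j], edges[j+1]] we require bounds[j] >= sup f, so the proposal
    is a finite mixture of uniforms weighted by bounds[j] * strip width."""
    rng = rng if rng is not None else np.random.default_rng()
    masses = bounds * np.diff(edges)
    probs = masses / masses.sum()
    out = []
    while len(out) < n:
        j = rng.choice(len(probs), p=probs)          # pick a strip
        x = rng.uniform(edges[j], edges[j + 1])      # uniform within it
        if rng.uniform() <= f(x) / bounds[j]:        # local accept test
            out.append(x)
    return np.array(out)

# Example: f(x) = exp(-x^2/2) on [0, 4] is decreasing, so each strip's
# left-edge value is a valid local bound.
f = lambda x: np.exp(-0.5 * x * x)
edges = np.linspace(0.0, 4.0, 9)
samples = strip_envelope_sample(f, edges, f(edges[:-1]), 5_000)
```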
Theoretical advances include provably near-optimal adaptive methods: for instance, Nearest Neighbor Adaptive Rejection Sampling (NNARS) achieves minimax-near-optimal rates (up to logarithmic factors) for s-Hölder densities (1810.09390).
6. Rejection Rate Minimization and MCMC Efficiency
In Markov chain Monte Carlo (MCMC), the rejection rate is a direct contributor to autocorrelation time and thus to sampling inefficiency. By introducing a one-parameter rejection-control transition kernel, one can continuously reduce the rejection probability (e.g., via "tower-shift" mechanisms), yielding exponential improvements in integrated autocorrelation time in sequential update regimes and power-law improvements in random update regimes, independent of the detailed kernel mechanics (2208.03935). This provides a robust guiding principle for discrete-variable MCMC: minimizing rejection is paramount for optimal sampler efficiency.
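For orientation, a plain Metropolis baseline that reports the rejection rate such kernels are designed to drive down (standard algorithm, illustrative names):

```python
import numpy as np

def metropolis(log_p, x0, step, n, rng=None):
    """Random-walk Metropolis; returns the chain and its empirical
    rejection rate, the quantity targeted by rejection-minimizing kernels."""
    rng = rng if rng is not None else np.random.default_rng()
    x, chain, rejects = x0, [], 0
    for _ in range(n):
        prop = x + step * rng.normal()
        if np.log(rng.uniform()) < log_p(prop) - log_p(x):
            x = prop                       # accept the move
        else:
            rejects += 1                   # rejection: chain repeats x
        chain.append(x)
    return np.array(chain), rejects / n

chain, rej_rate = metropolis(lambda x: -0.5 * x * x, 0.0, 2.0, 50_000)
```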
7. Partial and Conditional Rejection: Algorithmic and Combinatorial Sampling
Partial Rejection Sampling (PRS) adapts rejection to combinatorial problems by resampling only the variable subsets involved in unsatisfied constraints, rather than the entire configuration (2106.07744). This localized strategy, closely related to the Moser–Tardos resampling algorithm for the Lovász Local Lemma, yields efficiency gains by exploiting problem structure and dependency graphs. For extremal or quasi-extremal instances, PRS ensures perfect sampling from the conditional product distribution, with applications in uniform sampling of sink-free orientations, spanning trees (cycle popping), root-connected subgraphs, and independent sets. PRS is notable for converting approximate Markov chain strategies into perfect samplers for conditioned distributions in combinatorial spaces.
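A minimal Moser–Tardos-style resampler conveying the PRS mechanic (binary product measure and disjoint constraints for illustration; disjointness makes the instance extremal, so the output is exact):

```python
import numpy as np

def partial_rejection_sample(n_vars, constraints, rng=None):
    """Resample only the variables of violated constraints, never the whole
    configuration.  constraints is a list of (var_indices, ok) pairs with
    ok(assignment) -> bool.  In the extremal regime this returns an exact
    sample from the product measure conditioned on all constraints."""
    rng = rng if rng is not None else np.random.default_rng()
    a = rng.integers(0, 2, size=n_vars)             # uniform product start
    while True:
        bad = [c for c in constraints if not c[1](a)]
        if not bad:
            return a
        for v in set().union(*(set(c[0]) for c in bad)):
            a[v] = rng.integers(0, 2)               # resample only these

# Example: forbid the all-zeros pattern on each of four disjoint triples.
cons = [((i, i + 1, i + 2), lambda a, i=i: bool(a[i:i + 3].any()))
        for i in range(0, 12, 3)]
sample = partial_rejection_sample(12, cons)
```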
Rejection sampling, in its classical, adaptive, budgeted, partial, and domain-specialized forms, remains a central theme in computational statistics, generative modeling, Bayesian inference, MCMC, and quantum algorithms. Innovations continue to broaden its applicability, automate its parameterization, and deepen its theoretical understanding, cementing its role as a versatile and robust tool in both theoretical research and practical implementation.