Rejection Sampling: Principles and Adaptations
- Rejection sampling is a stochastic method that generates exact samples from complex target distributions using a simpler proposal density and an acceptance probability.
- Adaptive variants like ARS, ARMS, and NNARS refine the sampling envelope to improve acceptance rates and efficiency, especially in high-dimensional or multimodal settings.
- Advanced techniques incorporate quantum extensions, privacy-aware modifications, and autodifferentiable frameworks, broadening applications in simulation, inference, and machine learning.
Rejection sampling algorithms are a foundational class of stochastic simulation techniques designed to generate samples from a complex target distribution by leveraging an easier-to-sample proposal. These algorithms provide exact sampling under broad settings, form the basis for advanced adaptive and Markov chain algorithms, and have been deeply analyzed and extended in both classical and quantum contexts. The following sections delineate the central principles, algorithmic advances, theoretical insights, application domains, and modern adaptations of rejection sampling.
1. Core Principles and Basic Algorithm
At its core, rejection sampling proceeds as follows: given a target probability density $\pi$ (often known only up to normalization) and a proposal density $q$ such that $\pi(x) \le M q(x)$ for all $x$ and some constant $M < \infty$, the algorithm samples $x' \sim q$ and accepts it with probability $\pi(x')/(M q(x'))$; otherwise, the sample is rejected and the process repeats. The accepted samples are distributed exactly according to $\pi$.
The acceptance rate is $1/M$ (for normalized $\pi$ and $q$), and the efficiency of the algorithm is dictated by the tightness of the bound $M$. In high dimensions, or when the ratio $\pi/q$ is highly variable, the acceptance probability can be exceedingly small, which motivates adaptive strategies and variants to improve practicality.
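The basic loop is short enough to state in full. The following sketch pairs a standard normal target with a Laplace proposal, for which the tight bound $M = \sqrt{2/\pi}\,e^{1/2} \approx 1.32$ follows from maximizing $\pi/q$ at $|x| = 1$; the function names and this target–proposal pair are illustrative, not taken from any cited implementation.

```python
import numpy as np

def rejection_sample(log_pi, log_q, sample_q, log_M, n, rng):
    """Draws n exact samples from pi, given pi(x) <= M q(x) for all x."""
    out, tries = [], 0
    while len(out) < n:
        x = sample_q(rng)
        tries += 1
        # Accept with probability pi(x) / (M q(x)), evaluated in log space.
        if np.log(rng.uniform()) < log_pi(x) - log_M - log_q(x):
            out.append(x)
    return np.array(out), len(out) / tries

# Target: standard normal. Proposal: Laplace(0, 1). The ratio pi/q equals
# sqrt(2/pi) * exp(|x| - x^2/2), maximized at |x| = 1, so M = sqrt(2/pi) e^0.5.
log_pi = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
log_q = lambda x: -np.abs(x) - np.log(2.0)
log_M = 0.5 * np.log(2 / np.pi) + 0.5
rng = np.random.default_rng(0)
samples, rate = rejection_sample(log_pi, log_q, lambda r: r.laplace(0.0, 1.0),
                                 log_M, n=10_000, rng=rng)
print(rate, np.exp(-log_M))  # empirical acceptance rate vs. theoretical 1/M ~ 0.76
```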
2. Adaptive and Advanced Algorithmic Variants
Adaptive Rejection Sampling (ARS) and Extensions
Standard rejection sampling requires manual selection of $q$ and a tight constant $M$. Adaptive rejection sampling (ARS) algorithms iteratively build a piecewise-exponential proposal—typically for univariate log-concave distributions—by maintaining a set of support points and constructing tangents to $\log \pi$. Each rejection augments the support set, refining the envelope and increasing the acceptance rate. Variants such as Cheap ARS (CARS) fix the number of nodes to cap the complexity per sample, improving sampling throughput when large numbers of samples are required (Martino et al., 2015).
Parsimonious ARS (PARS) adds support points only when the envelope is insufficiently tight, carefully controlling complexity while maintaining high acceptance (Martino, 2017). Extensions to non-log-concave targets have been developed: one via decomposing the potential into convex parts and constructing mixture proposals, another based on the ratio-of-uniforms (RoU) method, which can handle log-convex tails and multimodality by transforming the sampling problem into a two-dimensional bounded region (Martino et al., 2011). These adaptive techniques enable exact sampling for broader classes of univariate distributions, with acceptance rates that approach unity as the envelope adapts.
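A minimal sketch of tangent-based ARS for a differentiable univariate log-concave target follows. It assumes the initial support points bracket the mode (so both envelope tails are integrable), omits the usual squeezing test, and only crudely guards the flat-tangent edge case; production implementations treat these points more carefully and work in log space throughout.

```python
import numpy as np

def ars(log_pdf, grad_log_pdf, init_pts, n_samples, seed=0):
    """Tangent-based ARS for a univariate log-concave target on R.

    log_pdf may be unnormalized; init_pts must bracket the mode so the
    outermost tangent slopes have opposite signs (integrable envelope tails).
    """
    rng = np.random.default_rng(seed)
    pts, samples = sorted(init_pts), []
    while len(samples) < n_samples:
        x = np.array(pts)
        h, b = log_pdf(x), grad_log_pdf(x)
        b = np.where(np.abs(b) < 1e-12, 1e-12, b)  # crude guard for flat tangents
        # Consecutive tangents intersect at z; segment j uses the tangent at x[j].
        z = (h[1:] - h[:-1] + b[:-1] * x[:-1] - b[1:] * x[1:]) / (b[:-1] - b[1:])
        lo, hi = np.append(-np.inf, z), np.append(z, np.inf)
        # Mass of exponential segment j: exp(h_j)(e^{b_j(hi-x_j)} - e^{b_j(lo-x_j)})/b_j.
        # (Production code would compute this in log space to avoid overflow.)
        mass = np.exp(h) * (np.exp(b * (hi - x)) - np.exp(b * (lo - x))) / b
        j = rng.choice(len(pts), p=mass / mass.sum())
        # Inverse-CDF draw within segment j of the piecewise-exponential envelope.
        e_lo, e_hi = np.exp(b[j] * (lo[j] - x[j])), np.exp(b[j] * (hi[j] - x[j]))
        xs = x[j] + np.log(e_lo + rng.uniform() * (e_hi - e_lo)) / b[j]
        env = h[j] + b[j] * (xs - x[j])  # envelope log-density at xs
        if np.log(rng.uniform()) < log_pdf(xs) - env:
            samples.append(xs)           # accept: exact draw from the target
        else:
            pts = sorted(pts + [xs])     # reject: refine the envelope at xs
    return np.array(samples)

# Standard normal target; acceptance rises toward 1 as the envelope tightens.
draws = ars(lambda t: -0.5 * t**2, lambda t: -t, [-2.0, 2.0], 2000)
print(draws.mean(), draws.std())
```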
Adaptive Rejection Metropolis Sampling (ARMS)
ARMS augments ARS to yield Markov chains for non-log-concave or multimodal targets: the rejection step is followed by a Metropolis–Hastings accept–reject test. One limitation is that no support points are added in regions where the proposal lies below the target, potentially stalling adaptation. Improved variants (A²RMS, IA²RMS) address this by allowing controlled addition of support points in these regions, satisfying “diminishing adaptation” conditions to ensure Markov chain convergence (Martino et al., 2012).
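The two-stage accept logic is easy to isolate. The sketch below runs ARMS-style transitions with a deliberately fixed (non-adaptive) Gaussian pseudo-envelope on a bimodal target, so that only the rejection step plus the Metropolis–Hastings correction from the ARMS construction is shown; the adaptive piecewise proposal of full ARMS/IA²RMS is omitted.

```python
import numpy as np

def arms_step(x, log_pi, log_p, sample_p, rng):
    """One ARMS-style transition: rejection step against a pseudo-envelope p,
    then the Metropolis-Hastings correction repairing regions where p < pi."""
    while True:
        y = sample_p(rng)
        # Stage 1: accept y w.p. min(1, pi(y)/p(y)); valid even where p < pi.
        if np.log(rng.uniform()) < min(0.0, log_pi(y) - log_p(y)):
            break
    # Stage 2: MH test with the ARMS acceptance ratio
    # min{1, [pi(y) min(pi(x), p(x))] / [pi(x) min(pi(y), p(y))]}.
    num = log_pi(y) + min(log_pi(x), log_p(x))
    den = log_pi(x) + min(log_pi(y), log_p(y))
    return y if np.log(rng.uniform()) < min(0.0, num - den) else x

rng = np.random.default_rng(1)
log_pi = lambda t: np.log(0.5 * np.exp(-0.5 * (t + 2) ** 2) +
                          0.5 * np.exp(-0.5 * (t - 2) ** 2))   # bimodal target
log_p = lambda t: -0.5 * (t / 3.0) ** 2                        # fixed wide Gaussian
chain = [0.0]
for _ in range(5000):
    chain.append(arms_step(chain[-1], log_pi, log_p, lambda r: r.normal(0.0, 3.0), rng))
print(np.mean(chain), np.std(chain))  # ~0 mean, spread straddling both modes
```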
Minimax and Nonparametric Adaptive Sampling
The Nearest Neighbor Adaptive Rejection Sampling (NNARS) algorithm (Achdou et al., 2018) provides provably minimax near-optimal rejection rates for $s$-Hölder smooth target densities in $d$ dimensions. NNARS empirically estimates the target via nearest neighbors, pads the estimate with a rigorously chosen confidence radius, and adaptively constructs envelope proposals, guaranteeing i.i.d. outputs with a rejection rate matching the theoretical lower bound up to logarithmic factors.
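NNARS itself interleaves estimation and sampling in rounds; the sketch below illustrates only the envelope principle, in a simplified non-adaptive form on $[0,1]$: evaluate the density on a grid and pad each cell by the Hölder modulus $L r^s$ over the cell radius $r$, which dominates the density whenever it is $(s, L)$-Hölder. The density, constants, and grid size here are illustrative assumptions.

```python
import numpy as np

def holder_envelope_sampler(pi, L, s, n_cells, n_draws, rng):
    """Rejection sampling on [0, 1] with a confidence-padded envelope:
    pi(x) <= pi(center) + L * (h/2)**s on each cell of width h whenever
    pi is (s, L)-Holder, so the padded grid values form a valid envelope."""
    h = 1.0 / n_cells
    centers = (np.arange(n_cells) + 0.5) * h
    heights = pi(centers) + L * (h / 2.0) ** s       # one density query per cell
    masses = heights * h
    out = []
    while len(out) < n_draws:
        j = rng.choice(n_cells, p=masses / masses.sum())
        x = centers[j] + (rng.uniform() - 0.5) * h   # uniform within cell j
        if rng.uniform() < pi(x) / heights[j]:       # exact accept test
            out.append(x)
    return np.array(out)

rng = np.random.default_rng(0)
pi = lambda x: 2.0 * x          # a Lipschitz density on [0, 1]: s = 1, L = 2
print(holder_envelope_sampler(pi, L=2.0, s=1.0, n_cells=64, n_draws=5, rng=rng))
```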
3. Extensions to High-Dimensional, Structured, and Quantum Domains
High-Dimensional and Structured Settings
Rejection sampling degrades exponentially with dimension when applied naively. Ensemble Rejection Sampling (ERS) mitigates this by using ensembles of states to construct an empirical proposal on an extended space and leverages dynamic programming (e.g., forward–backward recursions) for efficient exact sampling in high-dimensional state-space models. Under regularity conditions, the expected cost scales cubically with the sequence length, improving dramatically over the exponential scaling of classical methods (Deligiannidis et al., 2020).
The OS* (Optimization-Sampling*) algorithm combines adaptive rejection sampling with A* optimization search: it incrementally refines the proposal envelope only where rejections are observed, using functional norms to optimize acceptance rates (for sampling) or obtain tight upper bounds (for optimization). Applications include exact inference in high-order HMMs and large discrete graphical models (Dymetman et al., 2012).
Partial and Coupled Rejection Techniques
Partial Rejection Sampling (PRS) is tailored for constraint satisfaction problems expressed as CNF formulas. Unlike classical rejection sampling, which resamples the entire variable set, PRS resamples only "local" variables (those involved in currently violated clauses). The expected number of iterations is proportional to the ratio of the probability that exactly one clause is violated to the probability that the formula is satisfied, which is often polynomial even when the feasible region is rare. PRS is closely linked to the Moser–Tardos algorithm for the Lovász Local Lemma and guarantees perfect conditional sampling for extremal instances (Jerrum, 2021).
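A minimal sketch of the resampling loop for CNF constraints, in the Moser–Tardos style the paragraph references: only variables occurring in currently violated clauses are redrawn. For extremal instances this returns an exact draw from the conditional (here: uniform satisfying) distribution; general instances require an enlarged resampling set, which this sketch omits. The clause encoding is an illustrative choice.

```python
import numpy as np

def partial_rejection_sat(clauses, n_vars, rng):
    """Resample only the variables of violated clauses until none remain.

    A clause is a list of literals (var_index, wanted_value); it is
    satisfied if any literal matches the current assignment."""
    x = rng.random(n_vars) < 0.5                      # independent fair coins
    violated = lambda: [c for c in clauses
                        if not any(x[v] == want for v, want in c)]
    bad = violated()
    while bad:
        for v in {v for c in bad for v, _ in c}:      # local resampling set only
            x[v] = rng.random() < 0.5
        bad = violated()
    return x

rng = np.random.default_rng(0)
clauses = [[(0, True), (1, True)],
           [(1, False), (2, True)],
           [(0, False), (2, False)]]
print(partial_rejection_sat(clauses, n_vars=3, rng=rng))
```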
Coupled rejection samplers extend the paradigm to generate coupled outputs with correct marginals and controlled meeting times, important for unbiased MCMC and parallel resampling. For generic distributions, explicitly designed dominating couplings and ensemble variants provide positive lower bounds on coupling probability and bounded execution time variance, with closed-form optimization in the Gaussian case (Corenflos et al., 2022).
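As a concrete building block, the classical rejection-based maximal coupling (Thorisson's construction) already delivers correct marginals and the maximal meeting probability $1 - \mathrm{TV}(p, q)$, though with uncontrolled runtime variance in general—the limitation the explicitly designed dominating couplings of (Corenflos et al., 2022) address. A sketch, assuming densities normalized up to a common constant:

```python
import numpy as np

def maximal_coupling(rng, samp_p, logpdf_p, samp_q, logpdf_q):
    """Returns (X, Y) with X ~ p, Y ~ q, and P(X == Y) = 1 - TV(p, q).

    Requires log-densities sharing a common normalization (here both omit
    the same Gaussian constant, so the ratios are exact)."""
    x = samp_p(rng)
    if np.log(rng.uniform()) < logpdf_q(x) - logpdf_p(x):
        return x, x                       # the two marginals meet on this draw
    while True:                           # residual sampler for the q marginal
        y = samp_q(rng)
        if np.log(rng.uniform()) > logpdf_p(y) - logpdf_q(y):
            return x, y

rng = np.random.default_rng(0)
samp_p, logpdf_p = (lambda r: r.normal(0.0, 1.0)), (lambda t: -0.5 * t ** 2)
samp_q, logpdf_q = (lambda r: r.normal(1.0, 1.0)), (lambda t: -0.5 * (t - 1.0) ** 2)
meets = [x == y for x, y in (maximal_coupling(rng, samp_p, logpdf_p,
                                              samp_q, logpdf_q)
                             for _ in range(20000))]
print(np.mean(meets))  # ~ 1 - TV(N(0,1), N(1,1)) = 2 * Phi(-1/2) ~ 0.617
```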
Quantum Rejection Sampling
Quantum rejection sampling generalizes classical resampling to the quantum setting where the initial state is a superposition with hidden components. The algorithm uses controlled rotations and quantum amplitude amplification to coherently adjust the amplitudes of basis states, producing a target superposition with optimal query complexity characterized by a semidefinite program. Application areas include quantum linear system solvers, quantum Metropolis sampling, and the hidden shift problem (Ozols et al., 2011).
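The amplitude-rescaling step admits a small classical linear-algebra illustration; on a quantum device this is realized as a controlled rotation followed by amplitude amplification, and the state vectors below are illustrative assumptions rather than a worked instance from the cited paper.

```python
import numpy as np

# State preparation analogue: sum_k alpha_k |k>|0>  --controlled rotation-->
# sum_k |k>(sqrt(alpha_k^2 - g_k^2)|0> + g_k|1>)  with  g_k = c * pi_k,
# where c = min_k alpha_k / pi_k keeps every rotation amplitude physical.
alpha = np.sqrt([0.50, 0.30, 0.15, 0.05])     # amplitudes of the prepared state
pi_amp = np.sqrt([0.25, 0.25, 0.25, 0.25])    # amplitudes of the desired state
c = np.min(alpha / pi_amp)
branch1 = c * pi_amp                           # ancilla-|1> branch amplitudes
p_success = np.sum(branch1 ** 2)               # bare post-selection probability c^2
print(branch1 / np.linalg.norm(branch1))       # == pi_amp: the target superposition
print(p_success)   # amplitude amplification reduces the cost to ~ 1/sqrt(p_success)
```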
4. Shape-Constrained and Structure-Exploiting Sampling
Exploiting structural constraints on the target—such as monotonicity, unimodality, "cliff-like" shape, or log-concavity—enables sublinear envelope construction in discrete settings. For instance, monotone distributions over $\{1, \dots, n\}$ can be bounded using $O(\log n)$ density queries, and log-concave or "cliff-like" structures admit even lower query complexity. These efficiency gains directly impact bandit algorithms (e.g., Exp3), decreasing per-iteration cost from linear to polylogarithmic in the number of arms via efficient envelope construction and order-statistics data structures, while preserving theoretical regret guarantees (Chewi et al., 2021).
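A simplified version of the monotone case, assuming oracle access to a non-increasing pmf on $\{0, \dots, n-1\}$: dyadic blocks yield an envelope that costs one density query per block, i.e. $O(\log n)$ queries in total. The example pmf is illustrative.

```python
import numpy as np

def sample_monotone(p, n, rng):
    """Exact sampling from a non-increasing pmf p on {0, ..., n-1} with an
    envelope built from one query per dyadic block, O(log n) queries total:
    on [2^j, 2^(j+1)) the left-endpoint value dominates p by monotonicity."""
    starts = sorted({0} | {2 ** j for j in range(n.bit_length()) if 2 ** j < n})
    ends = starts[1:] + [n]
    heights = np.array([p(s) for s in starts])        # O(log n) density queries
    masses = heights * (np.array(ends) - np.array(starts))
    while True:
        j = rng.choice(len(starts), p=masses / masses.sum())
        i = rng.integers(starts[j], ends[j])          # uniform within the block
        if rng.uniform() < p(i) / heights[j]:         # exact accept test
            return i

rng = np.random.default_rng(0)
n = 1 << 16
Z = sum(1.0 / (i + 1) for i in range(n))              # demo normalization only
print(sample_monotone(lambda i: 1.0 / ((i + 1) * Z), n, rng))
```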
5. Algorithmic Innovations in Specialized Sampling Tasks
In particle simulation for Maxwell–Jüttner distributions, analytic approximations for e-folding points and a linear left-tail envelope eliminate numerical root finding and improve rejection efficiency—achieving acceptance rates exceeding 90% while maintaining algorithmic self-containment even for cellwise-varying temperatures (Zenitani, 17 Aug 2024).
For one-shot channel simulation and neural data compression, Greedy Poisson Rejection Sampling (GPRS) exploits the unimodality of the density ratio and provides coding schemes whose average codelength approaches the mutual information plus a small additive term, circumventing the inefficiencies associated with the worst-case Rényi $\infty$-divergence, which determines the optimal rejection bound in classical sampling (Flamich, 2023).
In Markov chain Monte Carlo contexts, the explicit reduction of rejection rates using a one-parameter control kernel yields exponential improvement in autocorrelation times, as empirically established for sequential spin updates in Potts models (Suwa, 2022).
6. Modern Algorithmic Enhancements and Applications
"Easy Rejection Sampling" algorithms introduce automatic, gradient-based refinement of parameterized proposal distributions (e.g., GMMs), eliminating hand-coded envelopes. The objective function directly targets the empirical supremum of $\pi/q$, optimizing proposal parameters for maximal acceptance rate using autodifferentiation routines. This enables black-box, high-acceptance sampling for differentiable targets in low-dimensional scenarios, with runtime acceleration on GPU/TPU architectures (Raff et al., 2023). Closely related, autodifferentiable rejection sampling (RSA) enables parameter inference and model fitting by reweighting both accepted and rejected samples, integrating with gradient-based optimization frameworks and allowing use of unbinned, ML-based loss functions for simulation tuning (Heller et al., 4 Nov 2024).
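A minimal sketch of the envelope-fitting idea, with simplifying assumptions departing from the cited method: a single Gaussian proposal instead of a GMM, a fixed probe grid standing in for the empirical sample, a log-sum-exp surrogate for the supremum of $\log \pi - \log q$, and hand-derived gradients in place of autodifferentiation.

```python
import numpy as np

def fit_proposal(log_pi, probes, steps=2000, lr=0.05):
    """Minimize a log-sum-exp surrogate of sup_x [log pi(x) - log q(x)]
    (i.e., approximately log M) over a Gaussian proposal q = N(mu, sigma^2).
    Gradients are hand-derived; an autodiff framework would supply them."""
    mu, log_sig = 0.0, 0.0
    for _ in range(steps):
        sig = np.exp(log_sig)
        zs = (probes - mu) / sig
        r = log_pi(probes) - (-log_sig - 0.5 * zs ** 2)   # pointwise log ratio
        w = np.exp(r - r.max()); w /= w.sum()             # surrogate softmax weights
        mu -= lr * np.sum(w * (-zs / sig))                # d r / d mu = -(x-mu)/sig^2
        log_sig -= lr * np.sum(w * (1.0 - zs ** 2))       # d r / d log sigma
    return mu, np.exp(log_sig)

log_pi = lambda t: np.log(0.3 * np.exp(-0.5 * (t + 1) ** 2) +
                          0.7 * np.exp(-0.5 * (t - 2) ** 2))  # bimodal target
mu, sig = fit_proposal(log_pi, probes=np.linspace(-8.0, 8.0, 400))
print(mu, sig)  # proposal recentred/widened so the envelope constant M stays small
```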
Partial rejection sampling and data augmentation frameworks further enable augmented joint modeling in doubly intractable likelihoods, restoring tractability by recycling rejected proposals and facilitating efficient posterior inference (e.g., for flow cytometry under truncation or nonparametric GP densities) (Rao et al., 2014).
7. Privacy, Differential Privacy, and Side-Channel Considerations
Randomized sampling algorithms can leak private information via timing side-channels—since rejection sampling leads to a geometric runtime distribution whose parameter may depend on sensitive underlying data. This “runtime leakage” can be quantified precisely in Rényi differential privacy terms, where divergence between geometric distributions with different acceptance probabilities yields explicit privacy costs. Unless acceptance probabilities are data-independent, standard rejection samplers cannot satisfy differential privacy. Algorithmic modifications—including runtime padding or sampling schemes yielding controlled, data-independent runtime—mitigate this leakage at the expense of approximate sampling or bounded extra privacy cost. Similar considerations are extended to adaptive samplers and log-Hölder densities (Awan et al., 2021).
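The leakage mechanism is easy to demonstrate: the number of proposals consumed per accepted draw is geometric with the data-dependent acceptance probability, so observed runtimes separate adjacent datasets. A small simulation, with acceptance probabilities chosen purely for illustration:

```python
import numpy as np

def sampler_runtime(accept_prob, rng):
    """Proposals consumed by one rejection-sampled draw: Geometric(accept_prob)."""
    t = 1
    while rng.uniform() > accept_prob:
        t += 1
    return t

rng = np.random.default_rng(0)
# Two adjacent datasets inducing different acceptance probabilities:
runs_a = [sampler_runtime(0.50, rng) for _ in range(100_000)]
runs_b = [sampler_runtime(0.40, rng) for _ in range(100_000)]
print(np.mean(runs_a), np.mean(runs_b))  # ~2.0 vs ~2.5: runtime reveals the data
# Mitigation sketch: always run for a fixed budget T, returning an approximate
# sample if nothing is accepted by then, so the observable runtime is constant.
```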
Rejection sampling thus constitutes a flexible and theoretically deep class of methods, spanning exact simulation, adaptive variants, high-dimensional adaptations, quantum extensions, privacy-aware variants, and autodifferentiable frameworks. Its theoretical foundations—especially the role of the optimal envelope, minimax rate guarantees, and the intrinsic link to information divergences such as the Rényi $\infty$-divergence—continue to inform the development of new techniques for efficient, robust, and privacy-preserving simulation across diverse domains in statistics, physics, machine learning, and computational sciences.