Pliable rejection sampling

Published 24 Apr 2026 in stat.ML and cs.LG | (2604.22385v1)

Abstract: Rejection sampling is a technique for sampling from difficult distributions. However, its use is limited due to a high rejection rate. Common adaptive rejection sampling methods either work only for very specific distributions or without performance guarantees. In this paper, we present pliable rejection sampling (PRS), a new approach to rejection sampling, where we learn the sampling proposal using a kernel estimator. Since our method builds on rejection sampling, the samples obtained are with high probability i.i.d. and distributed according to f. Moreover, PRS comes with a guarantee on the number of accepted samples.

Summary

  • The paper introduces a novel rejection sampling method that builds adaptive proposals using nonparametric kernel density estimation.
  • It provides high-probability performance guarantees to yield i.i.d. samples with improved rejection rates for multimodal, non-log-concave densities.
  • It enhances scalability in moderate dimensions by incorporating optimization-based localization strategies to focus on high-density regions.

Pliable Rejection Sampling: A High-Probability Guarantee for Adaptive Proposal Learning

Introduction

The paper "Pliable rejection sampling" (2604.22385) introduces a novel sampling methodology that addresses the inefficiencies of standard rejection sampling (SRS) in scenarios where the proposal distribution is poorly matched to the target density $f$. By leveraging nonparametric kernel density estimation, pliable rejection sampling (PRS) produces an adaptive proposal that approximates the target and provides explicit high-probability performance guarantees on sample yield per budgeted evaluation of $f$. This approach avoids the limiting assumptions required by classical and modern adaptive rejection sampling methods and yields i.i.d. samples with high efficiency.

Context: Limitations of Existing Methods

Classical SRS relies on hand-crafted proposals $g$ and envelope constants $M$ such that $Mg \geq f$, resulting in prohibitive rejection rates when $f$ is complex or multimodal. Adaptive rejection sampling (ARS) methods (e.g., [Gilks et al., 1992]) efficiently adapt proposals but are limited to log-concave targets. Extensions such as Adaptive Rejection Metropolis Sampling (ARMS) and A* sampling relax these constraints but sacrifice i.i.d. sampling or require non-trivial problem structure (e.g., a Gumbel-Max decomposition).

Moreover, existing adaptive rejection samplers either impose strong structural assumptions on $f$ (e.g., log-concavity) or lack finite-sample performance guarantees, and often require iterative refinement that adds to computational overhead.
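For reference, the SRS baseline described above can be sketched in a few lines. This is a minimal illustration, not code from the paper: the peaky target `f`, the uniform proposal on [0, 1], and the choice `M = 1.0` are all assumptions made for the example. With a normalized proposal and an unnormalized target, the acceptance probability is the integral of the target divided by `M`, which is why a poorly matched proposal wastes most evaluations.

```python
import math
import random


def rejection_sample(f, g_sample, g_pdf, M, n_proposals, rng):
    """Standard rejection sampling: draw x from g, keep it with probability f(x) / (M * g(x)).

    Requires M * g(x) >= f(x) for every x in the support of f.
    """
    accepted = []
    for _ in range(n_proposals):
        x = g_sample(rng)
        u = rng.random()
        if u * M * g_pdf(x) <= f(x):
            accepted.append(x)
    return accepted


def f(x):
    """Hypothetical unnormalized target: a narrow bump centered at 0.5 with maximum value 1."""
    return math.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)


rng = random.Random(0)
n_proposals = 10_000
samples = rejection_sample(
    f,
    g_sample=lambda r: r.random(),  # uniform proposal on [0, 1]
    g_pdf=lambda x: 1.0,            # its density
    M=1.0,                          # valid envelope because max f = 1 and g = 1
    n_proposals=n_proposals,
    rng=rng,
)
acceptance_rate = len(samples) / n_proposals
print(f"accepted {len(samples)} of {n_proposals} proposals ({acceptance_rate:.3f})")
```

In this toy setting the integral of `f` is roughly 0.125, so about seven out of eight evaluations of the target are rejected even though the envelope is as tight as a constant envelope can be.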

Methodology

PRS builds the proposal function in an adaptive, one-shot manner using kernel regression on $N$ uniformly sampled evaluations of $f$ over the domain $[0,A]^d$. Key innovations include:

  • Kernel Estimation for Proposal Construction: A kernel estimator of $f$ is constructed from evaluations of $f$ at $N$ uniformly sampled points, with the bandwidth chosen as a function of the smoothness parameter of $f$.
  • Uniform Error Control: The estimator comes with a high-probability error bound that holds uniformly over the sampling domain, based on modern empirical process results for kernel estimators, as detailed in Theorem 1.
  • Pliable Proposal Formation: The final proposal is a mixture of the kernel density estimate and a uniform base, inflated by an explicit additive error term so that the scaled proposal dominates $f$ everywhere. Critically, the normalizing constant and rejection threshold are explicitly data-driven.

This construction is nonparametric and only assumes that $f$ is bounded and locally Hölder-smooth, with a restriction on the smoothness exponent. This encompasses a much broader function class than log-concavity, covering densities in Besov balls.
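The three steps above can be illustrated with a one-dimensional sketch. It is written under several assumptions that are not taken from the paper: a triangular kernel, the standard kernel-regression normalization `A / (N * h)` for uniformly drawn points, a hand-picked bandwidth `h`, and a hand-picked additive term `r` standing in for the high-probability error bound. The function names and the bimodal `target` are hypothetical. Whenever the envelope `f_hat(x) + r` falls below the target, the accepted points are no longer exactly distributed according to the target, so the sketch counts such violations rather than hiding them.

```python
import bisect
import itertools
import math
import random


def f_hat(x, xs, fs, A, h):
    """Kernel estimate (A / (N h)) * sum_i f(X_i) K((x - X_i) / h), triangular K on [-1, 1]."""
    lo = bisect.bisect_left(xs, x - h)
    hi = bisect.bisect_right(xs, x + h)
    total = sum(fs[i] * (1.0 - abs(x - xs[i]) / h) for i in range(lo, hi))
    return A * total / (len(xs) * h)


def prs_sample(f, A, N, h, r, n_proposals, rng):
    """Sketch of pliable rejection sampling on [0, A] (assumed form, see lead-in)."""
    # Step 1: evaluate f at N uniform points (sorted so that f_hat can use bisect).
    xs = sorted(rng.uniform(0.0, A) for _ in range(N))
    fs = [f(x) for x in xs]
    cum = list(itertools.accumulate(fs))

    # Step 2: unnormalized proposal q(x) = f_hat(x) + r on [0, A], a two-part mixture.
    kde_mass = A * cum[-1] / N   # integral of f_hat over the real line
    uniform_mass = r * A         # integral of the constant r over [0, A]
    p_kde = kde_mass / (kde_mass + uniform_mass)

    accepted, violations = [], 0
    for _ in range(n_proposals):
        if rng.random() < p_kde:
            # Kernel component: pick a center with weight f(X_i), add triangular noise.
            center = rng.choices(xs, cum_weights=cum, k=1)[0]
            x = center + h * (rng.random() - rng.random())
        else:
            x = rng.uniform(0.0, A)
        if not 0.0 <= x <= A:
            continue  # the target is zero outside the domain, so the point is rejected
        # Step 3: accept with probability f(x) / q(x).
        envelope = f_hat(x, xs, fs, A, h) + r
        fx = f(x)
        if fx > envelope:
            violations += 1  # envelope failed here; r was too small at this point
        if rng.random() * envelope <= fx:
            accepted.append(x)
    return accepted, violations


def target(x):
    """Hypothetical unnormalized bimodal target on [0, 1]."""
    return (math.exp(-0.5 * ((x - 0.3) / 0.05) ** 2)
            + 0.5 * math.exp(-0.5 * ((x - 0.7) / 0.05) ** 2))


rng = random.Random(0)
n_proposals = 2000
samples, violations = prs_sample(target, A=1.0, N=2000, h=0.04, r=0.3,
                                 n_proposals=n_proposals, rng=rng)
rate = len(samples) / n_proposals
print(f"acceptance rate {rate:.3f}, envelope violations {violations}")
```

The acceptance rate is roughly the mass of the target divided by the mass of the envelope, so it improves as `r` shrinks. In the paper that term is tied to the estimator's error bound rather than chosen by hand as it is here.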

Algorithmic Performance and Guarantees

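The principal theoretical result (Theorem 2) proves that, for a fixed budget of evaluations of $f$, the expected number of i.i.d. samples produced by PRS is, with high probability, at least the budget multiplied by one minus a rejection-rate term. The explicit bound and its confidence level are stated in the paper.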

This rejection rate decays polynomially in the budget and is provably negligible for large budgets under mild smoothness. This contrasts sharply with SRS, whose acceptance rate is determined by the often intractable global sup-norm bound behind the envelope constant. In empirical comparisons with SRS and A* sampling, PRS consistently provides higher or competitive acceptance rates, especially when only black-box access to $f$ is available, and it does not require the structural or decomposition information that A* demands.

In contrast to MCMC and ARMS methods, PRS yields i.i.d. samples up to a quantifiable approximation error, and accommodates non-log-concave, multimodal, and unnormalized densities.

Extensions and High-Dimensional Regimes

The paper addresses the high-dimensional degradation common to all rejection sampling schemes. Uniform initial sampling becomes exponentially inefficient in the dimension $d$, but PRS proposes recourse to optimization-based localization strategies:

  • If the bulk of the mass of $f$ is concentrated on a small region, and $f$ (or a transformation of $f$) is convex outside a high-density support, first-stage random initialization followed by gradient-based (or Hessian-accelerated) optimization locates regions of non-negligible mass within a bounded number of steps.
  • This localization enables construction of the kernel estimator in an informative region, improving scalability in moderate dimensions.
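A rough sketch of the localization idea follows. It is only an illustration under stated assumptions, not the paper's procedure: it runs plain gradient ascent on a hypothetical log-density `log_f` with finite-difference gradients, from a few uniform random starts, and then places a box of hand-picked half-width around the best point found. The names `localize` and `local_box`, the step size, and the Gaussian-shaped example target are all assumptions.

```python
import random


def numerical_gradient(log_f, x, eps=1e-5):
    """Central finite-difference gradient of log_f at the point x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((log_f(up) - log_f(down)) / (2.0 * eps))
    return grad


def localize(log_f, d, A, n_starts, n_steps, lr, rng):
    """Gradient ascent on log_f from uniform random starts in [0, A]^d; returns the best point."""
    best_x, best_val = None, float("-inf")
    for _ in range(n_starts):
        x = [rng.uniform(0.0, A) for _ in range(d)]
        for _ in range(n_steps):
            g = numerical_gradient(log_f, x)
            # Ascent step, clipped so the iterate stays inside the domain.
            x = [min(A, max(0.0, xi + lr * gi)) for xi, gi in zip(x, g)]
        val = log_f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x


def local_box(center, half_width, A):
    """Axis-aligned box around center, clipped to [0, A]^d, where the estimator could be built."""
    return [(max(0.0, c - half_width), min(A, c + half_width)) for c in center]


MODE = (0.7, 0.2, 0.5)  # hypothetical location of the mass in [0, 1]^3


def log_f(x):
    """Hypothetical log-density: an isotropic Gaussian bump with standard deviation 0.1."""
    return -0.5 * sum((xi - mi) ** 2 for xi, mi in zip(x, MODE)) / 0.01


rng = random.Random(0)
center = localize(log_f, d=3, A=1.0, n_starts=3, n_steps=50, lr=0.005, rng=rng)
box = local_box(center, half_width=0.3, A=1.0)
print("located center:", [round(c, 4) for c in center])
print("local box:", [(round(lo, 2), round(hi, 2)) for lo, hi in box])
```

Drawing the uniform evaluation points inside `box` rather than the full cube concentrates the budget where the target has mass, which is the purpose the localization step serves in the text above.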

An extension to unbounded-support densities is discussed: by truncating the sampling domain adaptively or applying two-stage rejection with sub-Gaussian proposals, one can obtain corresponding bounds, whose explicit scaling is given in the paper.

Numerical Results

Comprehensive empirical results validate PRS:

  • On a synthetic "peaky" unimodal target, PRS achieves superior acceptance rates to both SRS and A* sampling, especially as budget increases and for moderate peakiness.
  • On two-dimensional multimodal targets, PRS acceptance rates approach those of A* and significantly exceed SRS.
  • On the challenging "clutter" posterior, PRS consistently outperforms SRS and tracks A* (which is given additional structural information).
  • In all cases, the number of i.i.d. samples per evaluation of $f$ is close to optimal, as predicted by theory.

Implications and Future Directions

The PRS framework provides a new paradigm for rejection sampling in settings where the target density is smooth, potentially multimodal, unnormalized, or lacking restrictive structure. Strong performance guarantees and high empirical efficiency support its deployment in Bayesian inference, probabilistic modeling, and any domain where sample i.i.d.-ness is critical and function evaluations are expensive.

While PRS is fundamentally limited by the curse of dimensionality, the proposed localization heuristics and extension to convex-transformed densities offer promising directions. Future work could integrate iterative proposal refinement, online kernel bandwidth adaptation, or exploit structure in $f$ (e.g., via score estimation or Stein operators) to further reduce rejection rates and scale towards higher dimension $d$.

Conclusion

Pliable rejection sampling merges kernel-based density estimation with rejection sampling to yield an adaptive, nonparametric proposal mechanism with explicit high-probability efficiency guarantees. This approach removes much of the ad hoc or restrictive nature of previous adaptive samplers, provides i.i.d. sampling with minimal budget wastage, and is broadly applicable in settings of black-box smooth target densities. Remaining challenges include high-dimensional scalability and online adaptation, which present fertile directions for continued research.
