Nearest-Neighbor Adaptive Rejection Sampling
- Nearest-Neighbor Adaptive Rejection Sampling (NNARS) is an adaptive Monte Carlo algorithm that samples from unknown Hölder-smooth densities on [0,1]^d using grid-based piecewise-constant estimators.
- It iteratively refines its proposal envelope based solely on observed density values, ensuring a rejection rate that is near-optimal up to logarithmic factors.
- Empirical and theoretical analyses demonstrate NNARS's efficiency in moderate dimensions, though its grid-based approach faces exponential complexity as dimensionality increases.
Nearest-Neighbor Adaptive Rejection Sampling (NNARS) is an adaptive Monte Carlo sampling algorithm designed for efficiently drawing independent samples from densities on $[0,1]^d$ that can be evaluated at any point but are otherwise unknown and potentially expensive to compute. NNARS achieves a minimax near-optimal rejection rate within logarithmic factors under Hölder smoothness assumptions on the target density. It advances the adaptive rejection sampling (ARS) literature by providing both tight theoretical guarantees and a practical, grid-based piecewise-constant proposal mechanism based on approximate nearest-neighbor estimation (Achdou et al., 2018).
1. Problem Setting and Motivation
The task is to sample from an unknown density $f$ on $[0,1]^d$, assumed to satisfy $f \ge c_f > 0$ and a Hölder condition $|f(x) - f(y)| \le H \|x - y\|_\infty^s$ for $0 < s \le 1$ and constant $H > 0$. Evaluating $f$ is computationally costly. Standard rejection sampling draws $X$ from a proposal $g$ and accepts $X$ with probability $f(X) / (M g(X))$, requiring a tight envelope $M g \ge f$ everywhere to be efficient; otherwise, the acceptance rate can be prohibitively low. NNARS aims to automatically construct and refine such envelopes adaptively, based solely on observed $f$ values, without strong parametric assumptions or requiring tractable decompositions.
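The accept/reject step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `rejection_sample`, the toy target density `0.5 + x` on [0, 1], the uniform proposal, and the constant `M = 1.5` are all assumptions chosen for the example.

```python
import random

def rejection_sample(f, g_sample, g_pdf, M, n_proposals, rng):
    """Draw n_proposals candidates from g and accept each with prob f(x) / (M * g(x)).

    Returns (accepted, evaluated); evaluated keeps every proposal with its f-value,
    since each evaluation of f is assumed to be expensive and worth reusing.
    """
    accepted, evaluated = [], []
    for _ in range(n_proposals):
        x = g_sample(rng)
        fx = f(x)  # the costly density evaluation
        evaluated.append((x, fx))
        # Accept when U * M * g(x) <= f(x), with U uniform on [0, 1).
        if rng.random() * M * g_pdf(x) <= fx:
            accepted.append(x)
    return accepted, evaluated

# Hypothetical target: f(x) = 0.5 + x on [0, 1] (integrates to 1, maximum 1.5).
def f(x):
    return 0.5 + x

rng = random.Random(0)
acc, ev = rejection_sample(f, lambda r: r.random(), lambda x: 1.0, 1.5, 10_000, rng)
rate = len(acc) / len(ev)  # the acceptance probability is 1 / M = 2/3 here
```

With a normalized target and a valid envelope, the acceptance probability equals 1/M, so a looser envelope directly translates into more wasted evaluations.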
2. Algorithmic Structure and Envelope Construction
NNARS proceeds in $K$ rounds. In each round $k$:
- A piecewise-constant proposal density $g_k$ over $[0,1]^d$ and associated rejection constant $M_k$ define the envelope $M_k g_k$.
- $N_k$ candidate points are drawn using standard rejection sampling (RSS) with $g_k$ and $M_k$. Accepted points are retained, and all proposals with their values under the target density $f$ are collected into the data set $\chi_k$.
- The algorithm constructs a new histogram estimator $\hat f_k$ using an approximate nearest-neighbor rule: $[0,1]^d$ is partitioned by a regular grid whose resolution depends on the number of evaluated points. For each $x \in [0,1]^d$, its nearest grid cell center $c(x)$ is found, and the closest $f$-evaluated sample to that center sets $\hat f_k(x)$.
- A Hölder-based estimation error bound $\hat r_k$ is computed to ensure $|\hat f_k - f| \le \hat r_k$. The new envelope is defined as $\hat f_k + \hat r_k$, so $\hat f_k + \hat r_k \ge f$ everywhere. The proposal is then $g_{k+1} = (\hat f_k + \hat r_k) / M_{k+1}$ and $M_{k+1} = \int_{[0,1]^d} (\hat f_k + \hat r_k)$.
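The envelope update in the last two steps can be illustrated in one dimension. The sketch below is written under stated assumptions and is not the paper's exact construction: the helper names `build_envelope` and `envelope_at`, the fixed number of cells, the toy density, and the way the margin is computed (the Hölder bound applied to the largest possible distance between a point of a cell and the sample assigned to that cell) are all choices made for this example.

```python
import math

def build_envelope(points, values, n_cells, H, s):
    """1D nearest-neighbor histogram plus a uniform Hölder margin on [0, 1].

    points/values: evaluated locations and their density values.
    Returns (cell_values, margin, M) where the envelope on cell j is
    cell_values[j] + margin and M is the integral of the envelope.
    """
    h = 1.0 / n_cells
    cell_values, margin = [], 0.0
    for j in range(n_cells):
        center = (j + 0.5) * h
        # Index of the evaluated point closest to the cell center.
        i = min(range(len(points)), key=lambda t: abs(points[t] - center))
        cell_values.append(values[i])
        # Any x in the cell lies within h/2 + |center - X_i| of X_i.
        reach = h / 2 + abs(points[i] - center)
        margin = max(margin, H * reach ** s)
    M = sum((v + margin) * h for v in cell_values)
    return cell_values, margin, M

def envelope_at(x, cell_values, margin):
    """Evaluate the piecewise-constant envelope at x in [0, 1]."""
    n_cells = len(cell_values)
    j = min(int(x * n_cells), n_cells - 1)
    return cell_values[j] + margin

# Hypothetical target: 1 + 0.5 sin(2 pi x), Lipschitz with constant pi (so s = 1).
def f(x):
    return 1.0 + 0.5 * math.sin(2 * math.pi * x)

points = [0.1, 0.35, 0.6, 0.9]
values = [f(p) for p in points]
cell_values, margin, M = build_envelope(points, values, n_cells=4, H=math.pi, s=1.0)
```

Dividing the envelope by `M` gives the next proposal density. As more points are evaluated and the grid is refined, both the margin and `M` shrink toward their ideal values, which is what drives the rejection rate down.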
| Step | Key Operation | Section Reference |
|---|---|---|
| Envelope update | Grid-based nearest-neighbor + error margin | Envelope Construction |
| Proposal sampling | Piecewise-constant on adaptive grid | Implementation Details |
| Acceptance check | $U \le f(X) / (M_k\, g_k(X))$ with $U \sim \mathrm{Unif}[0,1]$ (round-$k$ proposal $g_k$, constant $M_k$) | Algorithm Description |
The procedure iteratively refines the envelope as more evaluations of the target density are acquired, shrinking the rejection constant and focusing sampling where the density is large.
3. Theoretical Guarantees and Minimax Optimality
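The formal minimax risk is
$$\inf_{A} \sup_{f} \mathbb{E}_f\left[L_n(A)\right],$$
where the supremum ranges over the Hölder densities of the problem setting, $n$ is the evaluation budget, and $L_n(A)$ is the number of density evaluations used minus the number of accepted samples, given algorithm $A$.
- Lower bound (Theorem 4.1): For $n$ large enough,
$$\inf_{A} \sup_{f} \mathbb{E}_f\left[L_n(A)\right] \ge C\, n^{1 - s/d}$$
for some constant $C > 0$.
- Upper bound for NNARS (Theorem 3.1): For a budget of $n$ evaluations,
$$\sup_{f} \mathbb{E}_f\left[L_n(\mathrm{NNARS})\right] \le C' \log^2(n)\, n^{1 - s/d}$$
with $C'$ depending polynomially on $H$ and $1/c_f$, where $s$ is the Hölder exponent, $H$ the Hölder constant, $c_f$ the lower bound on the density, and $d$ the dimension.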
Thus, NNARS is minimax-near-optimal for Hölder densities, matching the lower bound up to logarithmic factors. This theoretical regime covers general multivariate densities without strong structural assumptions (Achdou et al., 2018).
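To get a feel for how such a rate depends on dimension, the short sketch below evaluates the exponent numerically. It assumes the rejected-sample rate for Hölder densities scales as $n^{1 - s/d}$ (Hölder exponent $s$, dimension $d$), ignoring constants and logarithmic factors; the function names are hypothetical and the numbers are illustrative orders of magnitude, not results from the paper.

```python
def rejection_rate_exponent(s, d):
    """Exponent of n in the assumed rejected-sample rate n**(1 - s/d)."""
    return 1.0 - s / d

def rejected_fraction(n, s, d):
    """Order of the rejected fraction, n**(-s/d), ignoring constants and log factors."""
    return n ** (-s / d)

# Illustrative values for a Lipschitz density (s = 1) and a budget of one million evaluations.
for d in (1, 2, 5, 10):
    print(d, rejection_rate_exponent(1.0, d), rejected_fraction(10**6, 1.0, d))
```

Under this assumption the exponent tends to 1 as the dimension grows, so the fraction of wasted evaluations decays only slowly in high dimension.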
4. Implementation Details and Computational Complexity
The nearest-neighbor histogram is implemented via a cubic grid whose cells share a common side length. Each cell center in the grid stores the index of the closest sampled point. As new samples are added, updates affect only adjacent cells.
Sampling from the piecewise-constant proposal consists of:
- Choosing a cell with probability proportional to its mass (per-cell value times volume).
- Sampling a point uniformly within that cell.
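The associated costs, as reported in (Achdou et al., 2018), cover:
- Memory: storage for all sample points and their grid associations.
- Envelope update: a per-round cost that is reducible with spatial hashing, together with the total over all rounds.
- Sampling a proposal: one cost for the cell choice and one for the within-cell draw.
- Overall runtime: the cumulative cost over all rounds.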
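The two-stage proposal draw (pick a cell, then sample uniformly inside it) can be sketched as below. This is an illustrative sketch under assumptions, not the reference implementation: it assumes a regular grid with `m` cells per side in `d` dimensions, takes the per-cell envelope heights as a flat list in `itertools.product` order, and uses a cumulative-sum table with binary search for the cell choice. The name `make_sampler` is hypothetical.

```python
import bisect
import itertools
import random

def make_sampler(cell_heights, m, d):
    """Build (sample, density) for a piecewise-constant proposal on [0, 1]^d.

    cell_heights: envelope height per cell, listed in the order of
    itertools.product(range(m), repeat=d) (last coordinate varies fastest).
    """
    cells = list(itertools.product(range(m), repeat=d))
    # All cells have volume m**-d, so cell mass is proportional to its height.
    cumulative = list(itertools.accumulate(cell_heights))
    total = cumulative[-1]
    side = 1.0 / m

    def sample(rng):
        # Stage 1: choose a cell with probability proportional to its mass.
        u = rng.random() * total
        idx = min(bisect.bisect_right(cumulative, u), len(cells) - 1)
        # Stage 2: draw uniformly inside the chosen cell.
        return tuple((c + rng.random()) * side for c in cells[idx])

    def density(x):
        # Flatten the cell coordinates of x into the same index order.
        idx = 0
        for xi in x:
            idx = idx * m + min(int(xi * m), m - 1)
        return cell_heights[idx] / (total * side ** d)

    return sample, density

rng = random.Random(1)
sample, density = make_sampler([1.0, 1.0, 1.0, 5.0], m=2, d=2)
draws = [sample(rng) for _ in range(20_000)]
frac_heavy = sum(1 for x in draws if x[0] >= 0.5 and x[1] >= 0.5) / len(draws)
```

In this sketch the cell choice costs a binary search over the cumulative table and the within-cell draw costs one uniform variate per coordinate; the table has one entry per cell, which is where the exponential dependence on dimension shows up.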
5. Empirical Comparisons and Sensitivity
Experiments benchmark NNARS against Pliable Rejection Sampling (PRS), OS*/A* samplers, and Simple RS:
- For uni- and multi-modal densities with sharp peaks, NNARS achieves acceptance rates comparable to the best algorithms when no special structure is known; as the problem becomes harder, performance degrades in line with PRS.
- On high-dimensional product sine densities, NNARS outperforms all baselines over the range of dimensions tested; OS*/A* samplers fail when decomposition structure is not available.
- The acceptance rate approaches its asymptotic regime after a number of density evaluations counted in thousands, with variance stabilizing at similar sample sizes.
- On two-dimensional real forest fire data, NNARS achieves acceptance rates significantly above those of PRS and Simple RS (Achdou et al., 2018).
These results demonstrate the robustness of NNARS, particularly in moderate to high dimensions under mild regularity.
6. Limitations, Parameter Choices, and Future Directions
NNARS requires specifying the Hölder parameters $(s, H)$ and a lower bound $c_f$ on the density $f$. Setting $s$ conservatively (below the true exponent) merely increases the envelope margin and worsens rates logarithmically. The lower bound $c_f$ can likewise be chosen conservatively small without affecting guarantees.
The grid-based piecewise-constant proposal incurs exponential complexity in the dimension $d$, so practical feasibility is limited to low and moderate dimensions. Data structures such as kd-trees or spatial hashing can extend applicability to higher dimensions. A plausible implication is that further algorithmic advances are necessary for scaling NNARS to very high-dimensional problems.
Open research directions include:
- Extending the approach to densities that vanish (dropping the uniform lower bound on the density).
- Adapting to higher-order smoothness (Hölder exponent above 1) via polynomial envelopes instead of piecewise constant.
- Efficient nearest-neighbor updates in high dimension.
- Data-driven or adaptive estimation for the Hölder parameters and grid resolution, removing reliance on prior knowledge of the smoothness (Achdou et al., 2018).