
Improved Sampling Algorithms

Updated 4 February 2026
  • Improved sampling algorithms are refined methods that reduce sample complexity and variance using structured, adaptive, and entropy-efficient techniques.
  • They incorporate innovations like negative dependence in pivotal sampling and randomness recycling to achieve uniform coverage and lower computational costs.
  • These approaches enhance performance across domains including regression, Bayesian inference, and quantum state estimation while ensuring strong theoretical guarantees.

An improved sampling algorithm is any algorithmic modification or novel method that achieves strictly better sample efficiency, estimation accuracy, theoretical guarantees, or space/time complexity in stochastic, statistical, or combinatorial settings compared to standard or prior-art algorithms. Across computational statistics, combinatorial optimization, and high-dimensional inference, improved sampling schemes are foundational for enabling practical inference, fair discrete optimization, rigorous uncertainty quantification, and efficient data summarization. Below, major algorithmic advances are surveyed by domain.

1. Improved (Dependent) Leverage Score Sampling in Active Learning

Marginal leverage score sampling is central for optimal subset selection in linear regression and function approximation. The standard approach selects rows of a data matrix $X \in \mathbb{R}^{n \times d}$ independently with probability proportional to their leverage scores $\ell_i = x_i^\top (X^\top X)^{-1} x_i$. However, independent sampling often yields spatially clustered subsamples that provide poor coverage and higher variance in downstream estimates.
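As a concrete reference point, leverage scores can be computed stably without forming $(X^\top X)^{-1}$; the helper below (an illustration, not code from the cited paper) uses the standard thin-QR identity that $\ell_i$ equals the squared norm of the $i$-th row of $Q$:

```python
import numpy as np

def leverage_scores(X):
    # Leverage of row i: l_i = x_i^T (X^T X)^{-1} x_i, computed stably
    # as the squared row norms of Q from the thin QR factorization X = Q R.
    Q, _ = np.linalg.qr(X)
    return np.sum(Q**2, axis=1)
```

Each score lies in $[0, 1]$, and for a full-rank tall matrix the scores sum to $d$, which is why leverage-proportional sampling budgets scale with $d$ rather than $n$.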

The pivotal sampling algorithm introduces a spatially structured, negatively dependent selection procedure. Data points are hierarchically grouped into a balanced binary tree, and a pairwise head-to-head tournament recursively selects exactly $k$ rows with prescribed marginals $\tilde p_i \propto \ell_i$. Critically, this method preserves marginal inclusion probabilities and introduces negative dependence: points "close" in feature space (i.e., falling in the same subtree) compete early and are unlikely to be chosen together. This promotes uniform geometric coverage.
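The duel rule at the heart of this tournament can be sketched in a few lines. The version below is the sequential (list-based) variant of Deville–Tillé pivotal sampling rather than the tree tournament described above; both apply the same pairwise update, which preserves each unit's marginal inclusion probability by construction:

```python
import numpy as np

def pivotal_sample(p, seed=None):
    """Sequential pivotal sampling (Deville-Tille list variant).

    Given marginal inclusion probabilities p (each in (0, 1)), returns a
    0/1 vector with exactly those marginals and negative dependence
    between nearby units.  The tree-tournament version applies the same
    duel rule along a balanced binary tree instead of left to right.
    """
    rng = np.random.default_rng(seed)
    p = np.asarray(p, dtype=float).copy()
    i = 0  # index of the unit whose fate is still undecided
    for j in range(1, len(p)):
        a, b = p[i], p[j]
        if a + b <= 1.0:
            # one unit loses (prob -> 0); the other carries a + b onward
            if rng.random() < a / (a + b):
                p[i], p[j] = a + b, 0.0
            else:
                p[i], p[j] = 0.0, a + b
                i = j
        else:
            # one unit wins outright (prob -> 1); the other carries a + b - 1
            if rng.random() < (1.0 - b) / (2.0 - a - b):
                p[i], p[j] = 1.0, a + b - 1.0
                i = j
            else:
                p[i], p[j] = a + b - 1.0, 1.0
    # the last survivor's probability is resolved by one Bernoulli draw
    p[i] = float(rng.random() < p[i])
    return p
```

A quick check of the update rule: in the first branch the survivor keeps mass $a+b$ with probability $a/(a+b)$, so the expected post-duel probability of unit $i$ is exactly $a$, and symmetrically for $j$; the law of total expectation then gives the prescribed marginals for the final 0/1 outcome.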

Theoretical advances are enabled by the one-sided $\ell_\infty$-independence condition, a measure of non-independence that admits extensions of matrix Chernoff concentration inequalities to pivotal and similar schemes. For agnostic linear regression, pivotal sampling requires $O(d \log d + d/\epsilon)$ samples to guarantee $(1+\epsilon)$ multiplicative accuracy in mean squared error, matching the best independent bounds, while reducing constants and empirical sample sizes by up to $50\%$. In polynomial regression, the sample complexity drops to $O(d/\epsilon)$, removing logarithmic factors due to improved subspace covering. Empirical validation on parametric PDE and chemical systems demonstrates spatial uniformity and reduced sample budgets relative to independent leverage sampling (Shimizu et al., 2023).

2. Adaptive and Exact Sampling for Nontrivial Target Distributions

Several improved algorithms offer significant advances in specific probabilistic sampling contexts:

a. Exact Normal Sampling

The algorithmic design by Du–Fan–Wei improves on Karney's exact normal sampler by introducing a more efficient discrete-Gaussian sampler for the core integer part and factoring acceptance tests for the uniform component. This reduces the expected number of Uniform(0, 1) deviates needed to sample one standard normal from $\approx 10.15$ to $\approx 8.09$, a $20\%$ reduction. The improvements extend to both speed and random-variable usage while retaining strict exactness; this is critical for randomized algorithms deployed in high-throughput or cryptographically secure contexts (Du et al., 2020).
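To make the "discrete-Gaussian core" concrete, here is a textbook rejection sampler for a discrete Gaussian on the integers. This is neither Karney's algorithm nor the Du–Fan–Wei refinement (both of which are more entropy-efficient); it only illustrates the distribution those samplers produce:

```python
import math
import random

def discrete_gaussian(sigma, rng=random):
    """Rejection sampler for the discrete Gaussian on the integers,
    p(x) proportional to exp(-x^2 / (2 sigma^2)).

    Textbook scheme: propose from a two-sided geometric (discrete
    Laplace) distribution with rate 1/sigma, then accept with the
    Gaussian-to-Laplace likelihood ratio.
    """
    t = 1.0 / sigma
    while True:
        # magnitude ~ Geometric(1 - e^{-t}), sign ~ uniform
        k = 0
        while rng.random() < math.exp(-t):
            k += 1
        x = -k if rng.random() < 0.5 else k
        # acceptance ratio exp(-x^2/(2 sigma^2) + |x|/sigma - 1/2) <= 1,
        # since the exponent is maximized near |x| = sigma; x = 0 is
        # proposed twice as often (both signs collapse), so halve it there
        accept = math.exp(-(x * x) / (2.0 * sigma * sigma) + t * abs(x) - 0.5)
        if x == 0:
            accept *= 0.5
        if rng.random() < accept:
            return x
```

The expected number of random draws per output here is well above the figures quoted for the exact samplers in the text; closing that gap is precisely what the cited improvements achieve.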

b. Randomness Recycling

"Randomness recycling" systematically recovers unused bits after each discrete sample by maintaining a global state $(Z, M)$ representing a uniform variable over an expanding range. Core split-and-merge primitives enable reusing entropy for future samples, reducing the per-sample entropy cost to $H(X_1, \ldots, X_k)/k + \varepsilon$ for any sequence of discrete random variables, approaching the information-theoretic lower bound set by Shannon entropy. This technique enables entropy-optimal online sampling with $O(\log(1/\varepsilon))$ space and improves over classical algorithms in both theoretical and empirical efficiency, especially when entropy sources are slow or costly (Draper et al., 24 May 2025).
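The split-and-merge idea can be sketched for the simplest case, uniform sampling over $\{0, \ldots, n-1\}$. The class below is an illustrative Lumbroso-style "fast dice roller" with state recycling, not the Draper et al. algorithm: the state $(Z, M)$ is always a single uniform draw over $\{0, \ldots, M-1\}$, and both the leftover after an accepted sample and the remainder after a rejection are kept as future entropy:

```python
import random

class RecyclingSampler:
    """Entropy-recycling uniform sampler (a sketch of the recycling idea).

    Invariant: Z is uniform on {0, ..., M-1}.  Fresh random bits only
    enter when the state is too small to cover the requested range.
    """

    def __init__(self, getrandbits=random.getrandbits):
        self.Z, self.M = 0, 1
        self._bits = getrandbits

    def uniform(self, n):
        """Return a uniform draw from {0, ..., n-1}."""
        while True:
            while self.M < n:                  # merge: stretch state with fresh bits
                self.Z = 2 * self.Z + self._bits(1)
                self.M *= 2
            q, r = divmod(self.M, n)
            if self.Z < q * n:                 # split: Z = value * q + leftover
                v, self.Z = divmod(self.Z, q)
                self.M = q                     # leftover stays uniform on [0, q)
                return v
            else:                              # recycle the rejected remainder
                self.Z -= q * n
                self.M = r                     # Z - q*n is uniform on [0, r)
```

Because rejected mass is retained rather than discarded, the amortized bit cost per draw approaches $\log_2 n$, which is the behavior the entropy bound above formalizes for general discrete sequences.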

c. Adaptive Rejection Metropolis Sampling (ARMS) Improvements

Standard ARMS can fail to adapt proposal envelopes in regions where the proposal underestimates the target, causing poor mixing and suboptimal acceptance. Two enhancements—A²RMS and IA²RMS—guarantee that support points are eventually added in regions where proposal density is insufficient. These methods introduce auxiliary rejection steps so that the proposal envelope converges to the true target density globally, not just where initial samples land. This corrects a structural bias, reduces autocorrelation, and yields orders-of-magnitude improved mixing in both mean and $L^1$-distance metrics while incurring negligible computational overhead (Martino et al., 2012).
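The adaptation mechanism can be illustrated with a deliberately simplified toy, far from the full A²RMS/IA²RMS construction: an independent Metropolis–Hastings chain on a bounded interval whose piecewise-constant proposal gains a support point at every rejected candidate, so the proposal is refined exactly where it fits the target poorly. Exact convergence guarantees for the real algorithms do not transfer to this sketch:

```python
import bisect
import math
import random

def adaptive_proposal_mh(pdf, lo, hi, n_samples=6000, seed=0):
    """Toy illustration of ARMS-style proposal adaptation (much simplified).

    Independent MH with a piecewise-constant proposal on [lo, hi] built
    from a growing set of support points; every rejected candidate is
    added as a new support point, refining the proposal where it failed.
    """
    rng = random.Random(seed)
    pts = [lo, 0.5 * (lo + hi), hi]            # proposal support points
    x, out = 0.5 * (lo + hi), []
    for _ in range(n_samples):
        heights = [max(pdf(pts[i]), pdf(pts[i + 1]))
                   for i in range(len(pts) - 1)]
        weights = [h * (pts[i + 1] - pts[i]) for i, h in enumerate(heights)]
        # draw a candidate from the piecewise-constant proposal
        i = rng.choices(range(len(weights)), weights=weights)[0]
        cand = rng.uniform(pts[i], pts[i + 1])
        j = min(bisect.bisect_right(pts, x) - 1, len(heights) - 1)
        # independent MH acceptance; unnormalized proposal heights cancel
        if rng.random() < (pdf(cand) * heights[j]) / (pdf(x) * heights[i]):
            x = cand
        else:
            bisect.insort(pts, cand)           # refine proposal where it failed
        out.append(x)
    return out
```

The key structural point matches the text: support points accumulate precisely in regions where the proposal underestimated the target, so the proposal improves where the vanilla scheme would have stagnated.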

3. High-Dimensional and Nonlogconcave Proximal Sampling

For high-dimensional Bayesian inference or convex optimization, sampling from $\nu(dx) \propto \exp(-f(x))\,dx$ swiftly becomes intractable if $d$ is large or $f$ lacks strong log-concavity. Classic methods such as MALA or Langevin Monte Carlo scale at best as $O(d)$ in dimension for the number of required gradient or proximal operations.

An improved proximal sampler based on inexact (approximate-rejection) Restricted Gaussian Oracles achieves state-of-the-art complexity $O(\kappa d^{1/2})$ for strongly log-concave targets, matching the lower bound for MALA and outperforming prior proximal Gibbs analyses that only reached $O(\kappa d)$ (Fan et al., 2023). The improvement is enabled by a sharp Gaussian concentration inequality for semi-smooth functions, which allows tuning the Gaussian step size $\eta$ as $O(d^{-1/2})$ rather than $O(d^{-1})$ in the rejection sampling step, so the overall sample cost per update drops substantially.
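The alternating structure of the proximal sampler is easy to exhibit in one dimension. The sketch below targets a standard Laplace density $\pi(x) \propto \exp(-|x|)$ with a fixed step size; it is a toy for the Gibbs alternation, not the inexact-RGO algorithm with the tuned $\eta$ discussed above:

```python
import numpy as np

def proximal_sampler_laplace(n_iter=20000, eta=0.5, seed=0):
    """Proximal (alternating Gibbs) sampler for pi(x) ~ exp(-|x|).

    Each sweep performs:
      y-step: y ~ N(x, eta)                                  (exact)
      x-step: Restricted Gaussian Oracle, i.e. sample from
              exp(-|x|) * N(x; y, eta), implemented here by rejection:
              propose x ~ N(y, eta), accept with prob exp(-|x|) <= 1.
    The x-marginal of the joint chain is the Laplace target.
    """
    rng = np.random.default_rng(seed)
    root_eta = np.sqrt(eta)
    x, out = 0.0, np.empty(n_iter)
    for k in range(n_iter):
        y = x + root_eta * rng.standard_normal()
        while True:
            cand = y + root_eta * rng.standard_normal()
            if rng.random() < np.exp(-abs(cand)):
                x = cand
                break
        out[k] = x
    return out
```

The rejection step here is valid because $\exp(-|x|) \le 1$ uniformly; the dimension-dependent analyses cited above are about how expensive such an oracle step is allowed to be while keeping the overall complexity at $O(\kappa d^{1/2})$.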

The algorithm generalizes to convex, LSI-, and PI-satisfying distributions (including non-logconcave cases), achieving significant improvements in sample complexity. The underlying mechanisms, extending from functional inequalities to entropy-regularized Wasserstein proximal schemes, are given rigorous convergence proofs, and all guarantees remain robust to approximate or composite potentials (Chen et al., 2022).

4. Combinatorial and Structural Sampling Enhancements

Several combinatorial settings also admit improved sampling strategies:

  • Graph colorings: A new bounding-chain based CFTP achieves perfect sampling of $k$-colorings for $k > 3\Delta$ (linear in maximum degree, vs quadratic in prior art), with explicit control over coupling and drift to singleton coloring. This is the best known color threshold for perfect uniform sampling and fundamentally advances the application of CFTP in combinatorics (Bhandari et al., 2019).
  • Lattice gauge theory: Modified Hamiltonian Monte Carlo steps ("winding HMC") that explicitly induce transitions between topological sectors engineer polynomial, rather than exponential, scaling of autocorrelation times in the sampled charge, breaking the "freezing" barrier of standard HMC in continuum limits while preserving ergodicity (Albandea et al., 2021).
  • Fair enumeration: Improved enumeration algorithms using refined coupon-collector bounds avoid wasted samples at each checkpoint (unlike previous schemes) and achieve strictly lower expected sampler calls, with rigorous failure guarantees, for enumerating finite sets by uniform sampling. This approach is relevant for benchmarking, cryptographic auditing, and quantum annealing (Mizuno et al., 2021).
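The fair-enumeration bullet rests on coupon-collector accounting: enumerating all $n$ elements of a set by uniform sampling costs $n H_n$ expected draws, where $H_n$ is the $n$-th harmonic number. The baseline calculation is easy to verify directly (this illustrates the classical bound, not the refined checkpoint bounds of Mizuno et al.):

```python
import random

def harmonic(n):
    # n-th harmonic number H_n = 1 + 1/2 + ... + 1/n
    return sum(1.0 / k for k in range(1, n + 1))

def expected_draws(n):
    # classical coupon-collector expectation: n * H_n uniform draws
    # to see every one of n equally likely elements at least once
    return n * harmonic(n)

def simulated_draws(n, trials=4000, seed=0):
    # Monte Carlo estimate of the same quantity
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, draws = set(), 0
        while len(seen) < n:
            seen.add(rng.randrange(n))
            draws += 1
        total += draws
    return total / trials
```

For $n = 10$ the expectation is about $29.3$ draws; the $n H_n \approx n \ln n$ growth is what refined schemes aim to avoid paying repeatedly at each enumeration checkpoint.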

5. Application-Specific, Domain-Driven Sampling Advances

  • Surface-hopping in nonadiabatic molecular dynamics: Introducing a birth-death branching process to the frozen Gaussian approximation with surface hopping (FGA-SH) reduces variance in MC estimates by adaptively pruning/replicating trajectories according to their importance weights, thus markedly improving estimator accuracy for the same computational cost (Lu et al., 2016).
  • Quantum stabilizer state estimation: Bell difference sampling combines symplectic Fourier analysis and commutation graph theory to yield polynomial-time algorithms for stabilizer testing and estimation in certain fidelity regimes, providing optimal sample complexity and tolerant testers for quantum information tasks (Grewal et al., 2023).
  • Online parameter estimation in Bayesian inference: Dynamic nested sampling outperforms classic nested sampling by adaptively increasing the number of "live points" in regions where accurate evidence or posterior estimation matters most. This optimizes sample allocation across the evidence and posterior, yielding speedups up to $72\times$ for parameter estimation (Higson et al., 2017).
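The dynamic scheme in the last bullet adapts the classic fixed-live-point algorithm, which is compact enough to sketch. The version below assumes a unit-interval prior and replaces dead points by naive rejection above the likelihood floor, which is viable only for toy problems (real implementations use constrained MCMC or slice sampling):

```python
import numpy as np

def nested_sampling_logZ(loglike, n_live=100, n_iter=600, seed=0):
    """Classic (fixed-live-point) nested sampling for a Uniform(0,1) prior.

    Shrinks the prior mass X by a factor e^{-1/n_live} per iteration,
    accumulating evidence Z = integral of L dX; dynamic nested sampling
    varies n_live on top of exactly this scheme.
    """
    rng = np.random.default_rng(seed)
    live = rng.random(n_live)
    logL = np.array([loglike(x) for x in live])
    logZ, logX = -np.inf, 0.0
    for i in range(1, n_iter + 1):
        worst = int(np.argmin(logL))
        logX_new = -i / n_live                 # E[log X_i] = -i / n_live
        logw = np.log(np.exp(logX) - np.exp(logX_new))  # shell width
        logZ = np.logaddexp(logZ, logL[worst] + logw)
        logX = logX_new
        floor = logL[worst]
        while True:                            # replace the dead point by a
            x = rng.random()                   # prior draw above the floor
            if loglike(x) > floor:
                live[worst], logL[worst] = x, loglike(x)
                break
    # remaining live points share the leftover prior mass exp(logX)
    tail = logX - np.log(n_live) + logL
    return float(np.logaddexp(logZ, np.logaddexp.reduce(tail)))
```

With likelihood $L(x) = 2x$ on $[0, 1]$ the true evidence is $Z = \int_0^1 2x\,dx = 1$, so the estimator should return $\log Z \approx 0$ up to the usual $O(\sqrt{H/n_{\text{live}}})$ scatter.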

6. Empirical and Theoretical Impact

A consistent theme across all these improved algorithms is that leveraging problem-specific structure—negative dependence, entropy recycling, adaptive proposal envelopes, and concentration inequalities—yields sharp reductions in sample complexity, variance, or computational cost while maintaining exactness or tight approximation guarantees. These algorithms often match or approach fundamental lower bounds in their task domain. They have been empirically validated to yield substantial gains in practical settings, including high-dimensional Bayesian inference, combinatorial enumeration, and fair solution sampling across quantum and classical heuristics.

7. Future Directions and Open Problems

  • Extending dependent sampling analyses (matrix Chernoff, negatively dependent designs) to broader classes such as determinantal point processes and volume sampling.
  • Generalizing entropy-recycled online sampling to sequential, adaptive dependency structures beyond conditionally independent draws.
  • Closing the gap for perfect graph coloring (lowering $k > 3\Delta$ to $k > 2\Delta$).
  • Enabling winding HMC or other nonlocal updates to efficiently operate in $4$d non-Abelian gauge theories.
  • Deriving unified non-asymptotic regret and sample complexity bounds for Bayesian bandit and exploration-driven methods in deep, model-based RL or GFlowNet settings.

Improved sampling algorithms are thus essential theoretical and practical tools for modern data analysis, scientific computing, and machine learning, continuously refined as new quantitative and structural insights emerge.
