Bouncy Particle Sampler (BPS)
- Bouncy Particle Sampler (BPS) is a continuous-time, non-reversible MCMC method that uses deterministic motion and bounce events to sample complex, high-dimensional distributions.
- Its simulation combines analytical and adaptive techniques, such as time-scale transformation and adaptive thinning, to efficiently compute event times.
- Extensions including stochastic, generalized, and discrete variants enhance its applicability to large datasets, complex models, and structured state spaces.
The Bouncy Particle Sampler (BPS) is a continuous-time, non-reversible Markov chain Monte Carlo (MCMC) algorithm based on a piecewise deterministic Markov process (PDMP) framework. Unlike classical reversible MCMC methods such as Metropolis–Hastings, BPS leverages deterministic trajectories and event-driven stochastic updates—termed "bounces"—to realize rejection-free, efficient exploration of high-dimensional and structured target distributions. The core mechanism utilizes constant-velocity motion interrupted by velocity reflections dictated by the gradient of the log-target density; ergodicity is ensured via stochastic velocity refreshment. This paradigm has enabled substantial methodological developments, including block updating for state space models, stochastic gradient variants for large-scale Bayesian inference, and rigorous analysis of geometric ergodicity and high-dimensional scaling properties.
1. Core Structure and Algorithmic Dynamics
BPS operates on an extended state space (x, v) ∈ ℝᵈ × ℝᵈ, where x represents position and v a velocity sampled from an auxiliary distribution (typically isotropic Gaussian or uniform on the unit sphere). The target distribution is π(x) ∝ exp(–U(x)), with U(x) differentiable and interpreted as an energy function. The dynamics proceed as follows:
- Deterministic Linear Motion: Between events, the trajectory evolves at constant velocity, x(t) = x₀ + t v (equivalently dx/dt = v, dv/dt = 0).
- Bounce Events: Occur at random times governed by an inhomogeneous Poisson process with intensity λ(x, v) = max{0, ⟨∇U(x), v⟩}. At a bounce, the velocity is reflected in the hyperplane orthogonal to the gradient,
v′ = v − 2 (⟨∇U(x), v⟩ / ‖∇U(x)‖²) ∇U(x),
so only the velocity is updated, preserving invariance of the joint distribution of (x, v).
- Refreshment Events: At rate λ_ref > 0, v is resampled independently from the auxiliary velocity distribution (e.g., N(0, I)). This step addresses reducibility and ensures ergodicity, particularly for degenerate target geometries.
The process is non-reversible, and the continuous-time, rejection-free construction distinguishes it from traditional MCMC—a feature of PDMP samplers (Bouchard-Côté et al., 2015).
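To make the dynamics concrete, the following minimal Python sketch simulates BPS for a standard Gaussian target, where U(x) = ‖x‖²/2 and ∇U(x) = x, so that bounce times can be drawn by exact inversion of the integrated rate. Function names and the returned event skeleton are illustrative choices, not a reference implementation.

```python
# Minimal BPS sketch for a standard Gaussian target, U(x) = ||x||^2 / 2.
# Along a ray the intensity is lambda(t) = max(0, <x + t v, v>) = max(0, a + b t),
# whose integrated rate can be inverted in closed form.
import numpy as np

rng = np.random.default_rng(0)

def bounce_time_gaussian(x, v):
    """Exact first arrival of a Poisson process with rate max(0, a + b t)."""
    a, b = x @ v, v @ v
    e = rng.exponential()                     # E ~ Exp(1)
    if a >= 0:
        return (-a + np.sqrt(a * a + 2.0 * b * e)) / b
    return -a / b + np.sqrt(2.0 * e / b)      # rate is zero until t = -a / b

def reflect(v, grad):
    """Bounce: reflect v in the hyperplane orthogonal to grad U(x)."""
    return v - 2.0 * (grad @ v) / (grad @ grad) * grad

def bps_gaussian(d=10, horizon=1000.0, refresh_rate=1.0):
    x, v, t = np.zeros(d), rng.standard_normal(d), 0.0
    skeleton = []
    while t < horizon:
        tau_b = bounce_time_gaussian(x, v)
        tau_r = rng.exponential(1.0 / refresh_rate)
        tau = min(tau_b, tau_r)
        x, t = x + tau * v, t + tau           # deterministic linear motion
        if tau_r < tau_b:
            v = rng.standard_normal(d)        # refreshment: v ~ N(0, I)
        else:
            v = reflect(v, x)                 # grad U(x) = x for this target
        skeleton.append((t, x.copy()))
    # Note: unbiased expectations under pi require averaging along the
    # continuous trajectory (or sampling at uniform times), not merely
    # averaging the event skeleton returned here.
    return skeleton

# Example: skeleton = bps_gaussian(d=5, horizon=100.0)
```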
2. Simulation and Factorization Strategies
The simulation of BPS demands efficient computation of event times and targeted updates:
- Time-Scale Transformation: For log-concave targets, the next bounce time τ is obtained by explicitly solving
∫₀^τ λ(x + s v, v) ds = E, with E ~ Exp(1),
for τ, exploiting the closed-form invertibility of the integrated rate.
- Adaptive Thinning: When analytic inversion is infeasible, upper bounds on the intensity t ↦ λ(x + tv, v) are constructed on short intervals; candidate event times are sampled from the dominating Poisson process and accepted or rejected via thinning (sketched after the table below).
- Superposition and Factorization: For factorizable targets U(x) = Σ_f U_f(x_f), bounce times are computed for each factor, with events occurring at the minimum. Local BPS variants (Bouchard-Côté et al., 2015, Zhao et al., 2019) update only affected cliques, reducing per-event complexity and enabling parallel subgraph updates. Candidate bounce times are managed via efficient priority queues, and only candidates in the neighborhood of the triggered factor are recomputed after an event.
| Simulation Technique | Target Structure | Event Scheduling |
|---|---|---|
| Time-scale transformation | Log-concave, smooth | Analytical inversion |
| Adaptive thinning | General differentiable targets | Upper bounds, thinning |
| Superposition/factorization | Factorized (e.g., graphical models) | Minimum over local Poisson clocks |
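When no closed-form inversion exists, the adaptive thinning scheme summarized above can be sketched as follows. This is a minimal illustration assuming pointwise access to ∇U and a user-supplied dominating bound on each look-ahead window; grad_U, bound_fn, and window are hypothetical names.

```python
# Sketch of bounce-time sampling via thinning. Assumes bound_fn(x, v, s, t)
# returns a constant dominating max(0, <grad U(x + u v), v>) for all u in [s, t].
import numpy as np

rng = np.random.default_rng(1)

def bounce_time_thinning(x, v, grad_U, bound_fn, window=1.0, max_iter=100_000):
    t = 0.0
    for _ in range(max_iter):
        lam_bar = bound_fn(x, v, t, t + window)   # local upper bound on the rate
        if lam_bar <= 0.0:
            t += window                           # no event can occur in this window
            continue
        t_prop = t + rng.exponential(1.0 / lam_bar)
        if t_prop > t + window:
            t += window                           # proposal overshoots the window
            continue
        rate = max(0.0, grad_U(x + t_prop * v) @ v)
        if rng.uniform() < rate / lam_bar:        # thinning acceptance step
            return t_prop
        t = t_prop                                # rejection: resume from proposal
    raise RuntimeError("no bounce found; check bound_fn or enlarge max_iter")
```

For an L-smooth potential, a valid choice is bound_fn(x, v, s, t) = max(0, ⟨∇U(x + sv), v⟩) + L‖v‖²(t − s), since the directional rate changes at speed at most L‖v‖² along the ray.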
3. Variants and Extensions
BPS accommodates numerous extensions supporting modern inference tasks:
- Stochastic BPS (SBPS): For large datasets, full gradients are replaced by noisy mini-batch gradients (Pakman et al., 2016), making the event process doubly stochastic. Adaptive regression-based thinning predicts bounce intensities; with zero-mean gradient noise this introduces no bias, and a controlled bias-versus-efficiency trade-off can be realized by relaxing the bound.
- Generalized BPS (GBPS): Replaces the deterministic bounce with a randomized update that resamples the velocity component orthogonal to the gradient while flipping the parallel component, achieving irreducibility without explicit velocity refreshment (Wu et al., 2017); a sketch of this kernel follows the list.
- Discrete BPS (DBPS): A discrete-time analog employing guided random walks interrupted by reflection and delayed-rejection steps. It avoids global upper bounds and requires only pointwise evaluations of U(x) and ∇U(x) (Sherlock et al., 2017).
- Binary BPS: Extends the method to binary distributions by augmenting the discrete space {±1}ᵈ with a continuous embedding subject to piecewise differentiable potentials, handling boundary bounces via Metropolis-like acceptance (Pakman, 2017).
- Blocked and Local BPS: Inference in state space models is accelerated by updating spatiotemporal blocks (possibly overlapping) and assigning local event clocks, enabling parallel updates and distributed computation (Goldman et al., 2021, Zhao et al., 2019).
- Gibbs-BPS: Merges BPS (for high-dimensional Gaussians) with Gibbs updates for hyperparameters, yielding substantial computational savings in linear inverse problems with global–local shrinkage priors (Ke et al., 12 Sep 2024).
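As a concrete illustration of the GBPS update described above, the following sketch flips the velocity component parallel to the gradient and resamples the orthogonal component from a projected Gaussian. It paraphrases the mechanism of Wu et al. (2017); the function name and the Gaussian resampling choice are illustrative assumptions.

```python
# Hedged sketch of the randomized GBPS velocity update: decompose v against
# g = grad U(x), flip the parallel component, resample the orthogonal one.
import numpy as np

rng = np.random.default_rng(2)

def gbps_bounce(v, g):
    """Randomized bounce: flip v's component along g, refresh the rest."""
    u = g / np.linalg.norm(g)                  # unit vector along grad U(x)
    v_par = (v @ u) * u                        # component parallel to the gradient
    z = rng.standard_normal(v.shape)           # fresh Gaussian draw
    v_perp_new = z - (z @ u) * u               # project noise off the gradient direction
    return -v_par + v_perp_new                 # flipped parallel + resampled orthogonal
```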
4. Theoretical Properties: Ergodicity and High-Dimensional Scaling
Rigorous analysis provides geometric ergodicity guarantees and high-dimensional scaling results:
- Geometric Ergodicity: Verifiable curvature and tail conditions on the target density ensure exponential convergence rates for BPS (Deligiannidis et al., 2017, Durmus et al., 2018). A position-dependent refreshment rate ensures ergodicity even for super-Gaussian (thin-tailed) or sub-exponential (heavy-tailed) targets via variable transformations.
- Scaling Limits: In high dimensions, BPS and related PDMP algorithms exhibit non-standard mixing behavior. For standard Gaussian targets:
- Mixing of the angular momentum incurs a dimension-dependent cost per independent sample.
- The negative log-density and coordinate marginals may incur an additional cost per effective sample, but optimized refreshment rates (with roughly 78.12% of events being refreshments) mitigate this (Bierkens et al., 2018).
- For low-dimensional marginals, randomized Hamiltonian Monte Carlo (RHMC) arises as a scaling limit, featuring dimension-free convergence rates under strong log-concavity, established via coupling and hypocoercivity arguments (Deligiannidis et al., 2018).
- Infinite Dimensions: BPS generalizes to Hilbert spaces with suitable modifications to the velocity space and reflection operator, enabling well-posedness and proper invariant measures in infinite-dimensional Bayesian inverse problems (Dobson et al., 2022).
5. Practical Implementations and Applications
BPS and its extensions are highly effective for various challenging applications:
- Bayesian Inference with Sparse Graphs and Big Data: Factorized local BPS drastically reduces per-bounce computational costs, scaling to large data (e.g., logistic regression, neural network posteriors) (Bouchard-Côté et al., 2015, Pakman et al., 2016).
- Phylogenetic Multivariate Probit Models: BPS enables efficient joint sampling of tens of thousands of truncated latent variables, with dynamic programming facilitating linear-time gradient computation (Zhang et al., 2019, Zhang et al., 2022).
- Large-Scale Inverse Problems: Gibbs-BPS provides computational tractability in high-dimensional image reconstruction and uncertainty quantification when combined with global–local shrinkage priors (Ke et al., 12 Sep 2024).
- Multimodal and Complex Targets: BPS enhanced with parallel tempering in the infinite exchange rate limit (BPS-PT) offers accelerated convergence for multimodal posteriors by facilitating robust transitions across modes (Saito et al., 2 Sep 2025).
- Hamiltonian Variants: Bouncy Hamiltonian dynamics unify HMC and PDMP frameworks, offering deterministic bounce-driven proposals with competitive efficiency and minimal tuning (Chin et al., 14 May 2024).
6. Blocked and Parallel BPS in State Space Models
Contemporary state space models with high-dimensional latent structures benefit from spatiotemporal blocked BPS algorithms (Goldman et al., 2021); a schematic sketch of the event scheduling follows the list:
- Block Partitioning: The latent index set is covered with overlapping rectangular space–time blocks, each maintaining an independent event clock.
- Block-Local Bounces: Velocity reflections are restricted to the active block, with auxiliary variables correcting for overlaps by adjusting each coordinate's flow.
- Parallelization: Partitioning into non-overlapping (even–odd) blocks allows concurrent velocity updates, and applying a maximum over local rates avoids the linear scaling of event rates with block count.
- Empirical Performance: Blocked BPS achieves higher ESS/s and faster mixing than both standard BPS and particle Gibbs, especially as dimension and autocorrelation length of the model grow.
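The block-local event logic can be sketched as follows. This is a simplified, hypothetical illustration: it approximates each block's clock by a constant-rate exponential (a full implementation would sample the inhomogeneous rate along the trajectory, e.g., by thinning as in Section 2), rate_fn is assumed strictly positive, and blocks, neighbors, and bounce_fn are assumed inputs.

```python
# Hypothetical sketch of per-block event clocks with neighborhood rescheduling.
# After a block's bounce, only clocks of blocks sharing coordinates (overlaps)
# are redrawn; the remaining clocks stay valid by the superposition property.
import heapq
import numpy as np

rng = np.random.default_rng(3)

def run_blocked_events(x, v, blocks, neighbors, rate_fn, bounce_fn, horizon):
    """blocks[b]: coordinate indices of block b; neighbors[b]: overlapping blocks."""
    t = 0.0
    clocks = [(rng.exponential(1.0 / rate_fn(x, v, b)), b) for b in range(len(blocks))]
    heapq.heapify(clocks)
    while clocks:
        t_event, b = heapq.heappop(clocks)
        if t_event > horizon:
            break
        x = x + (t_event - t) * v                  # whole state moves linearly
        t = t_event
        v[blocks[b]] = bounce_fn(x, v, blocks[b])  # block-local velocity reflection
        affected = set(neighbors[b]) | {b}
        clocks = [(te, bb) for te, bb in clocks if bb not in affected]
        for bb in affected:                        # redraw only affected clocks
            clocks.append((t + rng.exponential(1.0 / rate_fn(x, v, bb)), bb))
        heapq.heapify(clocks)
    return x, v
```

With non-overlapping (even–odd) partitions, the affected set within a colour class reduces to the triggered block itself, which is what permits the concurrent velocity updates described above.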
7. Outlook and Current Developments
Advancements include adaptive covariance learning and refreshment-rate tuning (Bertazzi et al., 2020), robust handling of non-smooth and heavy- or very thin-tailed posteriors (Deligiannidis et al., 2017), efficient implementation in infinite dimensions (Dobson et al., 2022), and improved mixing in strongly correlated parameter spaces via joint updates and operator splitting (Zhang et al., 2022). Sequential-proposal frameworks and related variants (e.g., for multimodal targets) enable further mixing enhancements (Park et al., 2019). The method remains central to scalable, non-reversible Bayesian computation, with ongoing research focused on higher-level algorithmic design, dimension-robust performance, and applications to complex, structured, and multimodal scientific models.
In summary, the Bouncy Particle Sampler provides a mathematically rigorous and computationally efficient alternative to traditional MCMC, leveraging non-reversible PDMP dynamics, bounce-driven stochasticity, and local structure for parallel and scalable inference. Algorithmic innovations such as blocked updates, stochastic gradients, adaptive mechanisms, and parallel tempering significantly broaden the scope and applicability of the BPS framework across the modern statistical sciences.