Projected Langevin Monte Carlo (PLMC)
Projected Langevin Monte Carlo (PLMC) is a Markov chain Monte Carlo (MCMC) algorithm designed for efficient sampling from probability distributions defined or constrained on convex sets, most commonly when the target distribution is log-concave and supported on a compact convex body in high-dimensional Euclidean space. The core innovation of PLMC is the integration of a projection step into the classical Langevin Monte Carlo framework, allowing it to sample efficiently under hard constraints.
1. Definition and Algorithmic Framework
PLMC extends Langevin Monte Carlo by constraining the support through a projection—most commonly, the Euclidean projection—onto a convex set $K \subset \mathbb{R}^{n}$. Let the target density be
$$\pi(x) \;\propto\; e^{-f(x)}\,\mathbf{1}_{K}(x),$$
where $f$ is convex and sufficiently smooth (e.g., $L$-Lipschitz and $\beta$-smooth).
The PLMC iteration is
$$X_{k+1} \;=\; P_{K}\!\big(X_{k} - \eta\,\nabla f(X_{k}) + \sqrt{2\eta}\,\xi_{k}\big),$$
where $\eta > 0$ is the step size, the $\xi_{k} \sim \mathcal{N}(0, I_{n})$ are i.i.d. standard Gaussian vectors, and $P_{K}$ denotes the Euclidean projector onto $K$.
PLMC thus augments the classical overdamped Langevin step with a projection that keeps all iterates inside $K$. This is structurally analogous to projected stochastic gradient descent (SGD) in constrained optimization.
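A minimal sketch of this iteration in Python, assuming a generic first-order oracle `grad_f` and a user-supplied Euclidean projection `project` (both hypothetical names chosen for illustration):

```python
import numpy as np

def plmc_sample(grad_f, project, x0, eta, n_steps, rng=None):
    """Projected Langevin Monte Carlo: an unadjusted Langevin step followed by the
    Euclidean projection onto the convex constraint set K."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    trajectory = [x.copy()]
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = project(x - eta * grad_f(x) + np.sqrt(2.0 * eta) * noise)  # Langevin step, then P_K
        trajectory.append(x.copy())
    return np.array(trajectory)

# Example: f(x) = ||x||^2 / 2 restricted to the unit ball, so grad_f(x) = x and the
# projection rescales any point outside the ball back onto its boundary.
project_unit_ball = lambda y: y if np.linalg.norm(y) <= 1.0 else y / np.linalg.norm(y)
traj = plmc_sample(grad_f=lambda x: x, project=project_unit_ball,
                   x0=np.zeros(5), eta=1e-3, n_steps=10_000)
assert np.all(np.linalg.norm(traj, axis=1) <= 1.0 + 1e-12)  # every iterate stays in K
```

The projection here is onto the unit ball; for other constraint sets one substitutes the corresponding Euclidean projection (e.g., coordinate-wise clipping for a box or the nonnegative orthant).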
2. Theoretical Guarantees and Mixing Times
The principal result for PLMC is that, under convexity and smoothness of $f$, and when $K$ is convex and compact, PLMC mixes in polynomial time from any initial position:
- For the uniform case ($f \equiv 0$), PLMC achieves total variation distance at most $\varepsilon$ from the uniform measure on $K$ in a number of steps that is polynomial in the dimension $n$, the diameter $R$ of $K$, and $1/\varepsilon$, with a step size $\eta$ tuned to these quantities (Bubeck et al., 2015).
- For general log-concave targets, an analogous polynomial-time guarantee holds, with larger exponents; the dependence on $n$, $R$, $1/\varepsilon$, and the smoothness parameters $L$ and $\beta$ is explicit and polynomial.
PLMC is thus a polynomial-time sampler under conditions typical for optimization and convex geometry.
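Schematically, and with the polynomial exponents suppressed, these guarantees state that the law $\mu_N$ of the $N$-th iterate satisfies
$$\|\mu_N - \pi\|_{\mathrm{TV}} \;=\; \sup_{A \subseteq K}\big|\mu_N(A) - \pi(A)\big| \;\le\; \varepsilon \qquad \text{once} \quad N \;\ge\; \mathrm{poly}\!\left(n,\, R,\, L,\, \beta,\, 1/\varepsilon\right),$$
with $\eta$ chosen as a function of $n$, $R$, and $\varepsilon$; this display paraphrases the form of the statements in Bubeck et al. (2015), not their exact constants.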
3. Analytical Underpinnings and Methodological Insights
PLMC is rigorously analyzed as the discretization of reflected Brownian motion with drift, a stochastic process that respects hard constraints through boundary reflection. The analytical methods involve coupling arguments for mixing, properties of reflected diffusions, and careful handling of the boundary behavior via Tanaka's construction of reflected diffusions and the associated local time.
The transition kernel of PLMC is local, with each proposal depending only on the current state via drift, noise, and projection.
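The continuous-time object behind this analysis is the reflected Langevin diffusion, written in Skorokhod (Tanaka) form as
$$dX_t \;=\; -\nabla f(X_t)\,dt \;+\; \sqrt{2}\,dB_t \;+\; \nu(X_t)\,d\ell_t,$$
where $B_t$ is a standard Brownian motion, $\ell_t$ is a nondecreasing boundary local time that increases only when $X_t \in \partial K$, and $\nu(X_t)$ is an inward normal direction at the boundary; the projection step of PLMC acts as the discrete-time counterpart of this reflection term.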
4. Comparisons: PLMC, Hit-and-Run, and Oracles
PLMC can be contrasted with hit-and-run random walks:
- Hit-and-run enjoys near-optimal mixing of order $O^{*}(n^{3})$ for convex bodies from a warm start (Lovász & Vempala, 2007), whereas PLMC has a larger polynomial dependence on the dimension in the uniform case, with higher exponents for general log-concave densities.
- Hit-and-run requires only a zeroth-order oracle (evaluation of $f$, or membership in $K$), while PLMC requires a first-order oracle ($\nabla f$), aligning it with the toolkit of modern (stochastic) optimization; a zeroth-order hit-and-run step is sketched below for contrast.
In practical experiments, PLMC is at least as fast as hit-and-run in wall-clock time and often faster, although hit-and-run may offer slightly better estimation accuracy for volume computations (Bubeck et al., 2015 ).
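To make the oracle distinction concrete, here is a minimal sketch of one hit-and-run step for uniform sampling from a convex body, assuming only a hypothetical membership oracle `membership(y)` and that the body fits inside a ball of radius `radius` around the current point:

```python
import numpy as np

def hit_and_run_step(x, membership, radius=1e3, tol=1e-8, rng=None):
    """One hit-and-run step for (approximately) uniform sampling from a convex body.

    `membership(y)` returns True iff y lies in the body; the chord endpoints are
    located by bisection, so the update never needs gradients."""
    rng = np.random.default_rng() if rng is None else rng
    d = rng.standard_normal(x.shape)
    d /= np.linalg.norm(d)                     # uniformly random direction

    def boundary(sign):
        lo, hi = 0.0, radius                   # assumes the body fits in a ball of `radius`
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if membership(x + sign * mid * d):
                lo = mid                       # still inside: move outward
            else:
                hi = mid                       # outside: move inward
        return lo

    t_minus, t_plus = -boundary(-1.0), boundary(+1.0)
    t = rng.uniform(t_minus, t_plus)           # uniform point on the chord through x
    return x + t * d
```

For a general log-concave target, the new point is drawn from the restriction of $\pi$ to the chord rather than uniformly; the sketch above covers only the uniform case.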
5. Extensions and Recent Advances
Recent work has extended and generalized PLMC significantly:
- Nonconvex and Superlinear Potentials: PLMC has been extended to settings where $f$ is non-convex and the drift $\nabla f$ grows superlinearly (e.g., polynomially fast), addressing instability in naive discretizations. The explicit PLMC with a tailored projection guarantees total variation convergence even in these challenging regimes (Pang et al., 2023).
- Regularity estimates for (time-independent) Kolmogorov equations: Critical for bounding the discretization error in total variation, especially for nonconvex potentials or non-Lipschitz drifts (Pang et al., 2023; Sabanis et al., 2018).
- Multilevel Monte Carlo (MLMC): Frameworks for PLMC with MLMC coupling provide optimal sampling complexities (cost of order $\varepsilon^{-2}$, up to logarithmic factors, for root-mean-square accuracy $\varepsilon$), leveraging contractivity and uniform-in-time variance bounds, as long as strong concavity (or analogous properties under projection) holds (Giles et al., 2016).
The algorithmic step in superlinear drift settings is
$$X_{k+1} \;=\; P_{B_{R_\eta}}\!\big(X_{k} - \eta\,\nabla f(X_{k}) + \sqrt{2\eta}\,\xi_{k}\big),$$
where the projection restricts the iterates to a compact ball $B_{R_\eta}$ whose radius grows polynomially as the step size $\eta$ decreases (at a rate tied to the growth of the drift), preventing divergence of the explicit scheme (Pang et al., 2023).
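A minimal sketch of such a projected explicit scheme, assuming a Euclidean ball centered at the origin and an illustrative radius schedule (the constant `c` and exponent `radius_exponent` are placeholders, not the tuned values from the literature):

```python
import numpy as np

def projected_euler_langevin(grad_f, x0, eta, n_steps, radius_exponent=0.25, c=1.0, rng=None):
    """Explicit projected Langevin step for super-linearly growing drift (sketch).

    After each Euler-Maruyama update the iterate is projected onto a Euclidean ball
    whose radius grows as the step size shrinks, here c * eta**(-radius_exponent)."""
    rng = np.random.default_rng() if rng is None else rng
    r = c * eta ** (-radius_exponent)          # step-size-dependent projection radius
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - eta * grad_f(x) + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
        nrm = np.linalg.norm(x)
        if nrm > r:                            # Euclidean projection onto the ball B(0, r)
            x *= r / nrm
    return x
```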
6. Performance, Complexity, and Practical Implications
| Method | Mixing time (steps) | Oracle | Empirical runtime | Accuracy |
|---|---|---|---|---|
| PLMC (uniform) | Polynomial in $n$, $R$, $1/\varepsilon$ | 1st order ($\nabla f$) | Comparable to HR | Slightly lower |
| PLMC (log-concave) | Polynomial, higher exponents | 1st order ($\nabla f$) | At least as fast as HR | Near-optimal |
| Hit-and-run | $O^{*}(n^{3})$ from a warm start | 0th order ($f$, membership) | Baseline | Slightly higher |
HR—hit-and-run random walk; runtime and accuracy based on empirical evaluation on convex bodies (Bubeck et al., 2015 ).
Practical guidance for high-dimensional settings:
- For low regularity or superlinear drift, use the explicit, projected variant with step size and projection radius tuned to dimension and drift growth (Pang et al., 2023 ).
- In the log-concave case, PLMC offers a scalable first-order method with robust polynomial-time mixing guarantees, including explicit total variation error bounds.
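In practice, expectations under the constrained target are estimated by ergodic averages of the PLMC iterates after a burn-in period; a self-contained sketch, with an illustrative potential $f(x) = \|x\|^{2}/2$ on the unit ball and test function $g(x) = \|x\|^{2}$ (step size and burn-in length are arbitrary choices for illustration):

```python
import numpy as np

# Ergodic-average estimation of a constrained expectation E_pi[g(X)] from PLMC iterates.
# f(x) = ||x||^2 / 2 (so grad f(x) = x), K is the unit ball, and g(x) = ||x||^2.
rng = np.random.default_rng(1)
dim, eta, n_steps, burn_in = 5, 1e-3, 20_000, 5_000
project = lambda y: y if np.linalg.norm(y) <= 1.0 else y / np.linalg.norm(y)

x, acc, count = np.zeros(dim), 0.0, 0
for k in range(n_steps):
    x = project(x - eta * x + np.sqrt(2.0 * eta) * rng.standard_normal(dim))
    if k >= burn_in:
        acc += np.linalg.norm(x) ** 2          # accumulate g(x) after burn-in
        count += 1
print(f"ergodic-average estimate of E[||X||^2]: {acc / count:.4f}")
```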
7. Applications and Future Directions
PLMC has substantial applications in:
- Bayesian computation: Sampling from constrained posteriors supported on convex domains (a minimal sketch follows this list).
- Machine learning: MCMC-based inference under constraints, using algorithms analogous to projected SGD (e.g., PLMC and projected variants of SGLD).
- Computational convex geometry: Randomized algorithms for volume estimation and integration.
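As a concrete illustration of the Bayesian use case, the following sketch samples a Gaussian linear-regression posterior restricted to the nonnegative orthant; the synthetic data, prior scale, step size, and iteration counts are illustrative assumptions, not recommended settings:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data; the posterior over weights w is log-concave, and we
# constrain w to the nonnegative orthant K = {w : w >= 0} (projection = clipping).
n_obs, dim = 50, 4
A = rng.standard_normal((n_obs, dim))
w_true = np.abs(rng.standard_normal(dim))
y = A @ w_true + 0.1 * rng.standard_normal(n_obs)
sigma2, tau2 = 0.1 ** 2, 1.0  # observation-noise and Gaussian-prior variances (assumed)

def grad_f(w):
    # f(w) = ||A w - y||^2 / (2 sigma2) + ||w||^2 / (2 tau2) is the negative log-posterior.
    return A.T @ (A @ w - y) / sigma2 + w / tau2

project = lambda w: np.maximum(w, 0.0)  # Euclidean projection onto the orthant

w, eta, samples = np.zeros(dim), 1e-4, []
for k in range(50_000):
    w = project(w - eta * grad_f(w) + np.sqrt(2.0 * eta) * rng.standard_normal(dim))
    if k >= 10_000:  # discard burn-in, keep the rest for posterior summaries
        samples.append(w.copy())
print("constrained posterior mean estimate:", np.mean(samples, axis=0))
```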
Further research directions include:
- Reducing the dimension exponents in mixing times, possibly closing the gap with hit-and-run.
- Extending PLMC variants to handle nonsmooth, composite, or even discontinuous potentials using Bregman projections (Lau et al., 2022 ).
- Improved non-asymptotic total variation bounds and the development of higher-order, tamed, or multilevel PLMC algorithms (Sabanis et al., 2018 , Pang et al., 2023 , Giles et al., 2016 ).
Summary
Projected Langevin Monte Carlo is a rigorous, efficient, and general-purpose method for first-order sampling in constrained domains. Its theory and empirical performance position it as a foundational tool in the growing intersection of high-dimensional MCMC, optimization, and convex geometry, with diverse applications in statistics and machine learning. Recent developments extend its reach far beyond the original log-concave paradigm, enabling robust sampling in nonconvex and nonsmooth environments with explicit error controls.