SCLD: Controlled Langevin Diffusion
- SCLD is a framework that integrates sequential Monte Carlo with controlled Langevin dynamics to sample from unnormalized target densities.
- It employs a continuous-time path-space formulation with learnable drift functions, systematic resampling, and log-variance loss to control bias and variance.
- Empirical benchmarks demonstrate that SCLD achieves efficient, high-quality sampling in complex, high-dimensional settings while significantly reducing computational cost.
Sequential Controlled Langevin Diffusion (SCLD) is a principled framework for sampling from unnormalized target densities by synthesizing the asymptotic robustness of Sequential Monte Carlo (SMC) with the flexibility and adaptivity of diffusion-based sampling methods. SCLD introduces a continuous-time path-space perspective, leveraging learnable Langevin drifts and systematic resampling to achieve efficient, high-quality sampling in complex and high-dimensional distributions, with significant reduction in computational cost compared to traditional diffusion samplers (Chen et al., 2024).
1. Problem Setting and Interpolating Distributions
The fundamental objective in SCLD is to generate samples from an unnormalized target density

$\pi_{\mathrm{target}}(x) = \rho(x)/Z, \qquad Z = \int \rho(x)\,\mathrm{d}x,$

where $\rho$ is tractable but the normalizing constant $Z$ is unknown. SCLD introduces a continuous-time annealing process via the geometric interpolant

$\pi(x, t) \propto p(x)^{1 - \beta(t)}\,\rho(x)^{\beta(t)}, \qquad t \in [0, T],$

such that $\pi(\cdot, 0) = p$ (typically Gaussian) and $\pi(\cdot, T) = \pi_{\mathrm{target}}$. The schedule function $\beta \colon [0, T] \to [0, 1]$ with $\beta(0) = 0$ and $\beta(T) = 1$ can be linear or learned, providing a smooth progression from an easy-to-sample prior to the complex target (Chen et al., 2024).
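As a concrete illustration, the interpolant can be evaluated in closed form whenever $\log p$ and $\log \rho$ are computable. The sketch below uses an illustrative bimodal target and a linear schedule; the function names (`log_prior`, `log_rho`, `log_pi`) are our own, not from the SCLD codebase.

```python
import numpy as np

def log_prior(x):
    """Standard Gaussian prior, up to an additive constant."""
    return -0.5 * np.sum(x**2)

def log_rho(x):
    """Unnormalized target: a bimodal 1D example with modes at ±2."""
    return np.logaddexp(-0.5 * np.sum((x - 2.0)**2),
                        -0.5 * np.sum((x + 2.0)**2))

def beta(t, T=1.0):
    """Linear schedule: beta(0) = 0, beta(T) = 1."""
    return t / T

def log_pi(x, t, T=1.0):
    """Unnormalized log-density of the geometric interpolant pi(x, t)."""
    b = beta(t, T)
    return (1.0 - b) * log_prior(x) + b * log_rho(x)
```

At $t = 0$ this reduces to the prior and at $t = T$ to the target, which is the defining property of the annealing path.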
2. Path-Space Formulation and SDEs
SCLD models the annealing process on path space using controlled Langevin stochastic differential equations (SDEs):
- Forward SDE:

$\mathrm{d}X_t = b(X_t, t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t, \qquad X_0 \sim p,$

where $b$ is a learnable time-dependent drift.
- Backward SDE (via Nelson’s identity):

$\mathrm{d}X_t = \bigl(b(X_t, t) - 2\nabla \log \pi(X_t, t)\bigr)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}W_t, \qquad X_T \sim \pi(\cdot, T),$

run backward in time.
The optimal control is the unique drift for which the forward and reverse path measures coincide; it is the solution to the continuous-time Schrödinger bridge problem (Chen et al., 2024).
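A minimal Euler–Maruyama simulation of the forward SDE can be sketched as follows. The analytic drift used here (the score of a standard Gaussian) is a stand-in for the learnable network $b_\theta$, not the trained SCLD drift; all names are illustrative.

```python
import numpy as np

def drift(x, t):
    # Placeholder for a learned drift b_theta(x, t); here grad log N(0, I) = -x.
    return -x

def simulate_forward(x0, drift, T=1.0, n_steps=100, rng=None):
    """Euler-Maruyama discretization of dX = b(X, t) dt + sqrt(2) dW."""
    rng = np.random.default_rng() if rng is None else rng
    h = T / n_steps
    x = np.array(x0, dtype=float)
    for i in range(n_steps):
        t = i * h
        xi = rng.standard_normal(x.shape)          # Gaussian increment
        x = x + drift(x, t) * h + np.sqrt(2.0 * h) * xi
    return x
```

With this placeholder drift the SDE is an Ornstein–Uhlenbeck process whose stationary law is $\mathcal{N}(0, I)$, which gives a quick sanity check on the discretization.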
3. Variational Objective and Log-Variance Loss
The learning problem for the drift is formulated via a variational objective on path space. Let $\vec P^b$ and $\cev P^b$ denote the forward and reverse SDE path laws over $[0, T]$. The Radon–Nikodym derivative (RND) between reverse and forward path measures is
$w(X_{[0, T]}) = \frac{\mathrm{d} \cev P^b}{\mathrm{d} \vec P^b}(X_{[0,T]}) = \exp\Bigl(\log \pi(X_T, T) - \log \pi(X_0, 0) - \tfrac{1}{2} \textstyle\int_0^T \bigl(\|b_s - 2\nabla \log \pi\|^2 - \|b_s\|^2\bigr)\,\mathrm{d}s + \cdots\Bigr).$
One minimizes variational losses such as $\mathrm{KL}(\vec P^b \| \cev P^b) = - \mathbb{E}_{\vec P^b}[\log w]$ or the log-variance loss $\mathrm{LV}(\vec P^b\|\cev P^b) = \mathrm{Var}_{\vec P^b}[\log w]$. Empirically, the log-variance loss exhibits superior variance scaling in high dimension (polynomial rather than exponential) (Chen et al., 2024).
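Given per-trajectory log-RND values, both losses are one-line estimates. The sketch below also illustrates a useful property of the log-variance loss: it is invariant to an additive constant in $\log w$ (such as the unknown $\log Z$), whereas the KL estimate shifts by it. The function names are our own.

```python
import numpy as np

def kl_loss(log_w):
    """Monte Carlo estimate of KL(forward || reverse): -E[log w]."""
    return -np.mean(log_w)

def log_variance_loss(log_w):
    """Monte Carlo estimate of the log-variance divergence: Var[log w]."""
    return np.var(log_w)
```

Because only the variance of $\log w$ enters, the log-variance loss can be computed without knowing the normalizing constant, and it admits off-policy evaluation over samples from an arbitrary reference process.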
4. Algorithmic Construction: SMC Integration and Discretization
SCLD divides $[0, T]$ into subintervals $0 = t_0 < t_1 < \cdots < t_N = T$. For each subinterval $[t_{n-1}, t_n]$, an interval RND is defined, so that the total path weight factorizes:

$w(X_{[0, T]}) = \prod_{n=1}^{N} w_{[t_{n-1}, t_n]}(X_{[t_{n-1}, t_n]}).$
Particles are propagated forward with discretized Euler–Maruyama steps

$X_{i+1}^{(k)} = X_i^{(k)} + b(X_i^{(k)}, \tau_i)\,h + \sqrt{2h}\,\xi_i, \qquad \xi_i \sim \mathcal{N}(0, I),$

with step size $h = (t_n - t_{n-1})/L$ for $L$ substeps per subinterval. Weights are updated via the discretized RND, and resampling is triggered whenever the effective sample size (ESS) falls below a fixed threshold (e.g., $0.3K$). Optionally, a few MCMC steps targeting $\pi(\cdot, t_n)$ are applied after resampling to enhance mixing (Chen et al., 2024).
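The ESS test and systematic resampling step can be sketched as follows; this is a generic implementation of the standard recipes, not code from the SCLD paper.

```python
import numpy as np

def effective_sample_size(w):
    """ESS of (possibly unnormalized) weights w: 1 / sum of squared normalized weights."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    return 1.0 / np.sum(w**2)

def systematic_resample(particles, w, rng=None):
    """Systematic resampling: one uniform offset, K evenly spaced points."""
    rng = np.random.default_rng() if rng is None else rng
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    K = len(w)
    positions = (rng.uniform() + np.arange(K)) / K
    cs = np.cumsum(w)
    cs[-1] = 1.0                      # guard against floating-point drift
    idx = np.searchsorted(cs, positions)
    return particles[idx], np.full(K, 1.0 / K)
```

Systematic resampling uses a single uniform draw for all $K$ offspring, which gives lower resampling variance than multinomial resampling at the same cost.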
Pseudocode for SCLD is as follows:
```
Input:  prior p, target ρ, schedule β, times 0 = t₀ < … < t_N = T,
        substeps L, particles K, drift net b_θ
Initialize X₀^{(k)} ∼ p, w₀^{(k)} = 1 for k = 1…K
for n = 1…N do
    h ← (t_n − t_{n-1})/L
    for i = 1…L do
        t ← t_{n-1} + i·h
        X_i^{(k)} ← X_{i-1}^{(k)} + b_θ(X_{i-1}^{(k)}, t − h)·h + √(2h)·ξ,  ξ ∼ N(0, I)
    end
    compute π(·, t_n) and w_{[t_{n-1}, t_n]}^{(k)} via the discrete RND
    w_n^{(k)} ← w_{n-1}^{(k)} · w_{[t_{n-1}, t_n]}^{(k)}
    if ESS({w_n^{(k)}}) < threshold then
        resample {X_{nL}^{(k)}, w_n^{(k)}} → {X̃^{(k)}, 1/K}
        optional: one MCMC step targeting π(·, t_n)
    end
end
Return {X_T^{(k)}} ≈ samples from π_target
```
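Putting the pieces together, here is a runnable toy instance of this loop in which the learnable drift $b_\theta$ is replaced by the annealed score $\nabla \log \pi(\cdot, t)$ (the ULA-based SMC special case discussed in Section 6). The prior is $\mathcal{N}(0, 1)$ and the target $\mathcal{N}(3, 1)$, so the geometric interpolant is $\mathcal{N}(\beta \cdot 3, 1)$ and every quantity is available in closed form; all numeric settings are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
m, K, N, L, h = 3.0, 2000, 20, 5, 0.05    # target mean, particles, stages, substeps, step size

def log_pi(x, beta):
    """Unnormalized log-density of the interpolant N(beta*m, 1)."""
    return -0.5 * (x - beta * m) ** 2

x = rng.standard_normal(K)                 # particles drawn from the prior
logw = np.zeros(K)
betas = np.linspace(0.0, 1.0, N + 1)       # linear schedule

for n in range(1, N + 1):
    # importance-weight update from the interval RND (closed form here)
    logw += log_pi(x, betas[n]) - log_pi(x, betas[n - 1])
    w = np.exp(logw - logw.max()); w /= w.sum()
    if 1.0 / np.sum(w**2) < 0.3 * K:       # ESS-triggered systematic resampling
        pos = (rng.uniform() + np.arange(K)) / K
        cs = np.cumsum(w); cs[-1] = 1.0
        x = x[np.searchsorted(cs, pos)]
        logw = np.zeros(K)
    for _ in range(L):                     # L Euler-Maruyama (ULA) moves at level beta_n
        x = x + (-(x - betas[n] * m)) * h + np.sqrt(2 * h) * rng.standard_normal(K)

w = np.exp(logw - logw.max()); w /= w.sum()
print(np.sum(w * x))                       # weighted mean, close to the target mean 3
```

Swapping the analytic score for a trained drift network, and the closed-form interval RND for its discretized Girsanov estimate, recovers the full SCLD algorithm.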
5. Theoretical Properties and Bias–Variance Tradeoff
If the drift exactly solves the path-space Schrödinger bridge, then all path weights become unity and no resampling is required, yielding perfect finite-time transport. In practice, unbiasedness of the weighted estimator for expectations under the target is retained even with an imperfect drift. Theoretical analysis establishes that the variance of the KL-loss-based estimator grows exponentially with dimension when relying on interval reweighting, whereas the log-variance loss reduces this scaling to polynomial, which is critical for high-dimensional efficiency (Proposition 5.7 in Chen et al., 2024). Discretization bias can be controlled by adjusting the Euler–Maruyama step size, trading computation for accuracy.
6. Empirical Performance, Robustness, and Benchmarks
SCLD has been evaluated across 11 benchmarks, covering Bayesian logistic and random-effect models (up to 1,600 dimensions), synthetic multimodal Gaussian mixtures (up to 40 modes in 50 dimensions), and robotic-arm motion planning with sharp, separated modes. Evaluation metrics include the ELBO (for estimation of $\log Z$) when $Z$ is unknown, and the Sinkhorn distance to ground-truth samples when such samples are available.
Empirically, SCLD attains or surpasses the performance of classical SMC, CRAFT, DDS, PIS, and CMCD variants, often requiring only 10% of the gradient computation budget of pure diffusion-based samplers (e.g., SCLD achieves convergence with approximately 3,000 steps vs. 40,000). Convergence rate improvements of 5–10x in ELBO wall-clock time are observed relative to ULA-based SMC (Chen et al., 2024). The robustness of SCLD is attributed to synergistic use of log-variance loss, replay buffer, systematic resampling, and occasional MCMC corrections; ablation studies confirm the necessity of each component.
7. Relationship to Prior Sequential Monte Carlo and Langevin Approaches
The SCLD method unifies principles from both SMC—such as particle propagation, adaptive resampling, and importance weighting—and controlled Langevin dynamics, in which learned, time-dependent drift functions govern the proposals. Earlier approaches (e.g., the controlled-Langevin SMC model of Septier et al. (2015)) adopted discrete-time, stepwise controlled Langevin mutation for Bayesian filtering, combining mutation, weighting, and Metropolized-Langevin moves. SCLD distinguishes itself by explicitly casting the problem in continuous time, learning the drift to optimize path-space transport, and integrating a low-variance loss on trajectories. Both frameworks employ effective sample size to guide resampling and deploy flexible drift adjustments to counteract sample impoverishment in high dimensions. However, SCLD's use of a learnable drift network and variational path-space loss provides additional adaptivity and computational efficiency (Chen et al., 2024; Septier et al., 2015).