Diffusion Annealed Langevin Monte Carlo
- Diffusion Annealed Langevin Monte Carlo (DALMC) is a robust MCMC algorithm that fuses annealed diffusion paths with discretized Langevin dynamics to sample from complex, high-dimensional distributions.
- It constructs a continuous interpolation between a tractable base distribution and the target law using a diffusion schedule, effectively mitigating challenges in multimodal and constrained sampling scenarios.
- The method employs learned or Monte Carlo-based score estimators and offers rigorous, non-asymptotic convergence guarantees under polynomial-time complexity.
Diffusion Annealed Langevin Monte Carlo (DALMC) denotes a class of Markov Chain Monte Carlo (MCMC) algorithms that generate approximate samples from complex target distributions by leveraging annealed (diffusion-inspired) paths and discretized Langevin dynamics. DALMC was developed to robustly bridge the principles of modern score-based generative diffusion models and classical MCMC, overcoming bottlenecks in high-dimensional, multimodal, and constrained sampling scenarios encountered in Bayesian inference, inverse problems, and generative modeling. It achieves this by sequentially interpolating between an accessible base distribution and the target law, using stochastic differential equation (SDE) discretizations parameterized by a diffusion schedule and driven by (approximate) time-dependent score functions.
1. Mathematical Framework and Diffusion Path Construction
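DALMC formulates the sampling problem as follows: given a target density $\pi$ on $\mathbb{R}^d$, often accessible only up to a normalization constant, the goal is to generate samples whose law approximates $\pi$ in total variation or Wasserstein distance. The core of DALMC is the construction of an interpolating path of distributions $(\mu_t)_{t \in [0,1]}$ connecting a tractable base density $\nu$ (commonly Gaussian or heavy-tailed Student's t) and $\pi$, often through convolutional paths:
$$\mu_t(x) = \frac{1}{(1-\lambda_t)^{d/2}}\,\nu\!\left(\frac{x}{\sqrt{1-\lambda_t}}\right) * \frac{1}{\lambda_t^{d/2}}\,\pi\!\left(\frac{x}{\sqrt{\lambda_t}}\right),$$
where $\lambda_t$ is an increasing, smooth schedule (e.g., cosine or sigmoid functions) with $\lambda_0 = 0$ and $\lambda_1 = 1$. For diffusion models, the law at time $t$ can be explicitly represented as that of
$$X_t = \sqrt{\lambda_t}\,X + \sqrt{1-\lambda_t}\,Z, \qquad X \sim \pi,\quad Z \sim \nu \text{ independent of } X.$$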
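The reverse-time SDE that transports samples from the base law $\nu$ (path time 0) to the target $\pi$ (path time 1) is given, in annealed Langevin form with time horizon $T$, by
$$\mathrm{d}X_t = \nabla \log \mu_{t/T}(X_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}B_t, \qquad t \in [0, T],$$
requiring access to the marginal score function $\nabla \log \mu_s$ of the interpolating law at each $s \in [0, 1]$ (Young et al., 29 Jan 2026, Cordero-Encinar et al., 13 Feb 2025).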
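The path construction can be sketched in one dimension. This is a minimal illustration, assuming the interpolant $X_t = \sqrt{\lambda_t}\,X + \sqrt{1-\lambda_t}\,Z$ and the cosine-type schedule $\lambda_t = \sin^2(\pi t / 2)$; both the schedule form and the function names `cosine_schedule` and `sample_interpolant` are assumptions made for this example, not taken from the cited works.

```python
import math
import random

def cosine_schedule(t):
    """Assumed schedule: rises smoothly from 0 at t=0 to 1 at t=1 (clamped to [0, 1])."""
    return min(1.0, max(0.0, math.sin(0.5 * math.pi * t) ** 2))

def sample_interpolant(t, sample_target, sample_base, rng):
    """Draw X_t = sqrt(lam) * X + sqrt(1 - lam) * Z with X ~ target, Z ~ base."""
    lam = cosine_schedule(t)
    return math.sqrt(lam) * sample_target(rng) + math.sqrt(1.0 - lam) * sample_base(rng)

rng = random.Random(0)
# Toy bimodal target (modes near -2 and +2) and a standard Gaussian base.
target = lambda r: r.choice([-2.0, 2.0]) + 0.3 * r.gauss(0.0, 1.0)
base = lambda r: r.gauss(0.0, 1.0)

for t in (0.0, 0.5, 1.0):
    draws = [sample_interpolant(t, target, base, rng) for _ in range(5)]
    print(t, [round(x, 2) for x in draws])
```

At $t = 0$ the draws come from the base alone and at $t = 1$ from the target alone; intermediate times give smoothed, partially merged modes.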
2. Discretized Langevin Dynamics and Algorithmic Structure
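DALMC leverages a time-discretized Euler–Maruyama scheme to realize unadjusted Langevin steps adapted to the interpolating path. Defining $N$ steps on a grid $0 = t_0 < t_1 < \dots < t_N = T$ with increments $h_k = t_{k+1} - t_k$, each iterate is updated via
$$X_{k+1} = X_k + h_k\, s_\theta(X_k, t_k) + \sqrt{2 h_k}\,\xi_k, \qquad \xi_k \sim \mathcal{N}(0, I_d),$$
where $s_\theta(x, t_k)$ approximates the marginal score $\nabla \log \mu_{t_k/T}(x)$ of the interpolating law at the rescaled time $t_k / T \in [0, 1]$. Step sizes and annealing schedules are adapted to control bias and discretization error; the cited works give the step-size scaling typical under cosine schedules (Young et al., 29 Jan 2026, Diamond et al., 21 Nov 2025).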
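The update rule can be exercised on a toy problem where the marginal score is known in closed form. The sketch below assumes a one-dimensional Gaussian target $\mathcal{N}(3, 0.5^2)$, a standard Gaussian base, a uniform time grid, and a cosine-type schedule; the names `schedule`, `exact_score`, and `dalmc` are hypothetical, and the exact score stands in for a learned or estimated one.

```python
import math
import random

def schedule(u):
    """Assumed cosine-type schedule on [0, 1]."""
    return min(1.0, math.sin(0.5 * math.pi * u) ** 2)

def exact_score(x, u, mean=3.0, std=0.5):
    """Score of X_u = sqrt(lam) X + sqrt(1 - lam) Z, X ~ N(mean, std^2), Z ~ N(0, 1).

    The interpolating law is N(sqrt(lam) * mean, lam * std^2 + 1 - lam).
    """
    lam = schedule(u)
    var = lam * std ** 2 + (1.0 - lam)
    return -(x - math.sqrt(lam) * mean) / var

def dalmc(score, n_steps=1000, horizon=20.0, n_particles=400, seed=0):
    """Annealed unadjusted Langevin: x <- x + h * score(x, t_k / T) + sqrt(2h) * xi."""
    rng = random.Random(seed)
    h = horizon / n_steps
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # start from the base law
    for k in range(n_steps):
        u = k * h / horizon  # rescaled path time in [0, 1)
        xs = [x + h * score(x, u) + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
              for x in xs]
    return xs

samples = dalmc(exact_score)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(f"mean={mean:.2f}, var={var:.2f}")  # target is N(3, 0.25)
```

A longer horizon reduces the annealing bias and a smaller step reduces the discretization bias, which is the trade-off the step-size and schedule choices are meant to balance.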
For conditional/posterior sampling tasks (e.g., recovering $x$ from a noisy linear measurement $y = Ax + \xi$), an annealing schedule is constructed for the measurement noise; the path proceeds by updating effective measurements through additive noise decrements, and the score combines the learned prior score with explicit data-dependent terms (Xun et al., 30 Oct 2025).
3. Score Approximation and Sequential Monte Carlo Estimation
Because the marginal score $\nabla \log \mu_t$ is generally intractable, it is estimated either with learned neural networks (as in diffusion models) or, for general unnormalized targets, with sequential Monte Carlo (SMC). The SMC approach constructs auxiliary "posterior" distributions over the clean sample given the current noisy state $x$; the score at $(x, t)$ is then estimated as the Monte Carlo average of a test function over particles approximating that posterior.
Variance reduction is achieved using control variates; DALMC introduces matrix-valued schedules that blend the denoising identity and the target score, optimized to minimize the estimator's variance via the ratio of Fisher information matrices or direct cross-covariance (Young et al., 29 Jan 2026).
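A particle-based score estimate can be illustrated with plain self-normalized importance sampling, used here as a simpler stand-in for the SMC machinery and without the control variates described above. The sketch assumes a one-dimensional Gaussian path $X = \sqrt{\lambda}\,Y + \sqrt{1-\lambda}\,Z$ with $0 < \lambda < 1$ and relies on the denoising identity $\nabla \log \mu(x) = \mathbb{E}[-(x - \sqrt{\lambda}\,Y)/(1-\lambda) \mid X = x]$; the function name `mc_score` is hypothetical.

```python
import math
import random

def mc_score(x, lam, log_target, n_particles=5000, seed=0):
    """Self-normalized Monte Carlo estimate of the score of X = sqrt(lam) Y + sqrt(1 - lam) Z.

    The conditional law of Y given X = x is proportional to
        target(y) * N(y; x / sqrt(lam), (1 - lam) / lam).
    Particles are drawn from the Gaussian factor and weighted by the
    unnormalized target, so no normalizing constant is needed.
    Requires 0 < lam < 1; the proposal becomes very wide as lam -> 0.
    """
    rng = random.Random(seed)
    a = math.sqrt(lam)
    prop_std = math.sqrt((1.0 - lam) / lam)
    ys = [rng.gauss(x / a, prop_std) for _ in range(n_particles)]
    logw = [log_target(y) for y in ys]
    top = max(logw)  # subtract the maximum to stabilize the exponentials
    ws = [math.exp(lw - top) for lw in logw]
    total = sum(ws)
    return sum(w * (-(x - a * y) / (1.0 - lam)) for w, y in zip(ws, ys)) / total

# Check against a case with a closed form: unnormalized N(1, 1) target.
log_target = lambda y: -0.5 * (y - 1.0) ** 2
x, lam = 1.0, 0.5
est = mc_score(x, lam, log_target)
exact = -(x - math.sqrt(lam) * 1.0) / (lam * 1.0 + (1.0 - lam))
print(f"estimate={est:.3f}, exact={exact:.3f}")
```

Sampling from the Gaussian factor keeps the weights equal to the unnormalized target density, which is convenient but can degenerate when $\lambda$ is small; that degeneracy is one motivation for the sequential and variance-reduced estimators discussed in the text.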
4. Theoretical Guarantees and Non-Asymptotic Error Bounds
Rigorous non-asymptotic convergence results underpin DALMC in both log-concave and more general smooth, possibly multimodal, target settings. Critical elements include:
- Score Error: For learned scores entering DALMC, no uniform ($L^\infty$) or exponential moment (MGF) condition is required. An $L^4$ error control, specifically a bound on the fourth moment of the score error $\mathbb{E}\,\|s_\theta - \nabla \log p\|^4$, is sufficient to guarantee polynomial-time sampling in global log-concave settings (Xun et al., 30 Oct 2025).
- KL and TV Control: The path-space Kullback–Leibler (KL) divergence between the DALMC law and the ideal reference is bounded as a sum of three terms: bias from annealing (governed by the path action $\mathcal{A}$ in the Wasserstein-2 geometry and shrinking as the annealing time grows), discretization error (shrinking as the number of steps $N$ grows), and score approximation error (Cordero-Encinar et al., 13 Feb 2025, Guo et al., 2024, Young et al., 29 Jan 2026).
- Iteration Complexity: Sample complexity is polynomial in the dimension $d$, the action $\mathcal{A}$, the smoothness constant $\beta$, and $1/\varepsilon$ to reach KL accuracy $\varepsilon^2$ under $\beta$-smoothness and finite second moments, with no log-concavity or isoperimetry required (Guo et al., 2024, Cordero-Encinar et al., 13 Feb 2025).
- Posterior Sampling Robustness: DALMC decomposes long mixing paths into a sequence of short hops between nearby intermediate distributions, keeping score error under control. Vanilla Langevin, by contrast, may contract and expand the law off-manifold, which makes it brittle to score estimation error (Xun et al., 30 Oct 2025).
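The fourth-moment score error that appears in these guarantees can be estimated empirically when a reference score is available. The sketch below is a toy check only: the helper `l4_score_error`, the standard Gaussian reference law, and the constant-offset "estimator" are all assumptions introduced for illustration.

```python
import random

def l4_score_error(samples, score_hat, score_true):
    """Monte Carlo estimate of (E |score_hat(x) - score_true(x)|^4)^(1/4)."""
    fourth = sum((score_hat(x) - score_true(x)) ** 4 for x in samples) / len(samples)
    return fourth ** 0.25

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(10000)]
true_score = lambda x: -x          # score of N(0, 1)
noisy_score = lambda x: -x + 0.05  # hypothetical estimator with a constant offset
err = l4_score_error(xs, noisy_score, true_score)
print(err)  # about 0.05, since the error is the constant offset
```

In practice the true score is unknown, so a quantity like this is usually tracked only on synthetic targets or through a surrogate such as a held-out denoising loss.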
5. DALMC in Conditional Inference and Inverse Problems
DALMC is specialized for posterior sampling under models with a noisy linear measurement, $y = Ax + \xi$ with Gaussian noise $\xi$, by annealing the noise variance to traverse from a relaxed likelihood to the full posterior. The algorithm initializes with a sample from the unconditional prior via a diffusion model and then walks through annealed noise levels, each time applying Langevin iterations targeting the conditional law at that level. This hierarchical approach ensures initialization "on the manifold" and robust, polynomial-time convergence in global log-concave regimes when scores satisfy the $L^4$ error condition (Xun et al., 30 Oct 2025).
Key components for practical success include:
- The number of annealing steps (noise levels),
- Per-level mixing times and step sizes, chosen to control both mixing and discretization errors.
On tasks such as inpainting, super-resolution, and deblurring, the method outperforms diffusion posterior sampling benchmarks in per-image reconstruction error and FID after sufficient steps (Xun et al., 30 Oct 2025).
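The noise-annealing idea can be sketched on a scalar linear-Gaussian model where the posterior is known exactly. This is a simplified illustration: it assumes a prior $x \sim \mathcal{N}(0, 1)$ sampled directly (standing in for a diffusion model), a geometric noise ladder, and a fixed measurement $y$ rather than the effective-measurement updates of the cited method. The function name and all default parameters are hypothetical.

```python
import math
import random

def annealed_posterior_langevin(y, a=1.0, sigma=0.5, sigma_max=4.0, n_levels=8,
                                steps_per_level=50, n_particles=300, seed=0):
    """Langevin sampling of p(x | y) for y = a * x + noise with prior x ~ N(0, 1).

    The measurement noise level is annealed geometrically from sigma_max to sigma.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # prior draws
    ratio = (sigma / sigma_max) ** (1.0 / (n_levels - 1))
    for level in range(n_levels):
        s = sigma_max * ratio ** level
        # Step size proportional to the posterior variance at this noise level.
        h = 0.1 / (1.0 + a * a / (s * s))
        for _ in range(steps_per_level):
            # Drift = prior score (-x) + likelihood score a * (y - a * x) / s^2.
            xs = [x + h * (-x + a * (y - a * x) / (s * s))
                  + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0) for x in xs]
    return xs

post = annealed_posterior_langevin(y=2.0)
post_mean = sum(post) / len(post)
post_var = sum((x - post_mean) ** 2 for x in post) / len(post)
# Exact posterior for a=1, sigma=0.5, y=2: mean 1.6, variance 0.2.
print(f"mean={post_mean:.2f}, var={post_var:.2f}")
```

Each level starts from samples that are already close to its target, so only a short Langevin run is needed per level; this mirrors the "short hops" argument used in the robustness analysis.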
6. Extensions to Heavy-Tailed Distributions and Generative Modeling
DALMC supports flexible choices of base laws, enabling sampling under heavy-tailed targets via Student's t convolutions instead of Gaussian paths. The Student-t path is especially effective when the target $\pi$ is heavy-tailed, as Gaussian interpolants cannot control tail behavior robustly. Convergence and complexity results remain valid under analogous moment and smoothness assumptions; action computations and functional inequalities are adapted accordingly (Cordero-Encinar et al., 13 Feb 2025).
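Swapping the base law changes only how the noise term is drawn. The sketch below assumes the heavy-tailed interpolant keeps the same mixing form as the Gaussian path, $X = \sqrt{\lambda}\,Y + \sqrt{1-\lambda}\,T$ with $T$ Student's t; that form, the degrees of freedom, and the function names are assumptions for illustration.

```python
import math
import random

def student_t(rng, dof):
    """Draw a Student's t variate as Z / sqrt(V / dof) with V ~ chi-square(dof)."""
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(0.5 * dof, 2.0)  # chi-square(dof) = Gamma(shape=dof/2, scale=2)
    return z / math.sqrt(v / dof)

def heavy_tailed_interpolant(lam, sample_target, rng, dof=3.0):
    """X = sqrt(lam) * Y + sqrt(1 - lam) * T with Y ~ target and T ~ Student's t."""
    return math.sqrt(lam) * sample_target(rng) + math.sqrt(1.0 - lam) * student_t(rng, dof)

rng = random.Random(0)
n = 20000
t_tail = sum(abs(student_t(rng, 3.0)) > 4.0 for _ in range(n)) / n
g_tail = sum(abs(rng.gauss(0.0, 1.0)) > 4.0 for _ in range(n)) / n
print(f"P(|T|>4) ~ {t_tail:.4f}, P(|Z|>4) ~ {g_tail:.4f}")
```

The Student-t base places far more mass beyond four units than the Gaussian does, which is the property that lets the intermediate laws track a heavy-tailed target.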
In high-dimensional generative modeling, DALMC allows path schedules (e.g., slow growth of the schedule $\lambda_t$ at the endpoints via cosine schedules) to limit intermediate score norms and discretization error. However, the explicit Euler–Maruyama discretization yields less favorable scaling than reverse-SDE diffusion models with exponential integrators, partially limiting DALMC in large-scale applications. Nevertheless, DALMC avoids pathologies of SDEs with singular drifts at endpoints and remains numerically stable (Cordero-Encinar et al., 13 Feb 2025).
7. Empirical Benchmarks and Molecular Dynamics Correspondence
Empirical assessments of DALMC report competitive or superior sample quality (e.g., in Wasserstein distance, KS statistic, and predictive likelihood) compared to annealed importance sampling, SMC, and reverse-diffusion Monte Carlo baselines, with substantial reductions in batched energy evaluations (Young et al., 29 Jan 2026). DALMC has also been interpreted as a learned, data-driven molecular dynamics integrator: one reverse diffusion step with a quadratic "adapter" is exactly an Euler–Maruyama step for overdamped Langevin dynamics, with error decomposing into model (drift) and discretization terms. Practical application to molecular systems indicates that DALMC can produce trajectories with physically meaningful time correlations at computational costs far below conventional MD, given sufficient model capacity and steps (Diamond et al., 21 Nov 2025).
References:
- "Posterior Sampling by Combining Diffusion Models with Annealed Langevin Dynamics" (Xun et al., 30 Oct 2025)
- "Non-asymptotic Analysis of Diffusion Annealed Langevin Monte Carlo for Generative Modelling" (Cordero-Encinar et al., 13 Feb 2025)
- "Diffusion Models are Molecular Dynamics Simulators" (Diamond et al., 21 Nov 2025)
- "Diffusion Path Samplers via Sequential Monte Carlo" (Young et al., 29 Jan 2026)
- "Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling" (Guo et al., 2024)