
Diffusion-Based Posterior Sampling

Updated 22 December 2025
  • Diffusion-based posterior sampling is a framework that uses probabilistic diffusion to connect tractable priors to complex Bayesian posteriors, enabling scalable uncertainty quantification.
  • The PDPS method leverages Monte Carlo sampling and Langevin dynamics to estimate posterior scores, delivering non-asymptotic error bounds and superior performance in image restoration.
  • A three-stage sampling procedure with warm-start initialization mitigates bias and ensures polynomial-time convergence for high-dimensional, multimodal inverse problems.

Diffusion-based posterior sampling encompasses a family of transport and stochastic simulation methods in which a probabilistic diffusion process connects a tractable reference density to a complex Bayesian posterior. In this context, the prior is often modeled by a data-driven score-based diffusion model, and the measurement likelihood is incorporated through Monte Carlo estimators, Langevin dynamics, or optimization-based surrogates for conditional score approximations. This methodology enables scalable uncertainty quantification and generative posterior inference for general nonlinear, noisy Bayesian inverse problems with multimodal, high-dimensional targets. Rigorous non-asymptotic error bounds now exist, together with practical plug-and-play implementation strategies. The framework is exemplified by Provable Diffusion Posterior Sampling (PDPS), which provides polynomial-time convergence and strong empirical performance in image restoration tasks (Chang et al., 8 Dec 2025).

1. Mathematical Formulation: Bayesian Posterior and Diffusion Transport

The diffusion-based posterior sampling framework considers observations $y \in \mathbb{R}^n$ generated via

$$Y = \mathcal{G}(X_0) + n, \quad n \sim \rho(n),$$

with a known forward operator $\mathcal{G}: \mathbb{R}^d \to \mathbb{R}^n$ and noise density $\rho$. The posterior distribution is

$$\pi(x_0 \mid y) \propto \exp(-\ell_y(x_0))\,\pi_0(x_0), \quad \ell_y(x_0) = -\log\rho(y - \mathcal{G}(x_0)),$$

where $\pi_0$ denotes a pretrained, data-driven prior. To construct a transport from a tractable terminal density $\pi_T$ back to the posterior, the process evolves under

$$dX_t = -X_t\,dt + \sqrt{2}\,dB_t, \quad X_0 \sim \pi_0, \quad t \in [0, T],$$

where $X_T \sim \pi_T$. The time-reversal process starting from $\pi_T$ is then

$$d\bar{X}_t = \bigl(\bar{X}_t + 2\nabla_x\log q_{T-t}(\bar{X}_t \mid y)\bigr)\,dt + \sqrt{2}\,dB_t,$$

with initial condition $\bar{X}_0 \sim \pi_T(\cdot \mid y)$ and $\bar{X}_T \sim \pi(\cdot \mid y)$.
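
Both SDEs are typically simulated with a simple Euler-Maruyama discretization. The sketch below is illustrative rather than the cited paper's implementation; it assumes a callable `posterior_score(t, x, y)` approximating $\nabla_x\log q_t(x \mid y)$, and the step size `dt` is a free parameter.

```python
import numpy as np

def forward_ou_step(x, dt, rng):
    """One Euler-Maruyama step of the forward OU process dX_t = -X_t dt + sqrt(2) dB_t."""
    return x - x * dt + np.sqrt(2.0 * dt) * rng.standard_normal(x.shape)

def reverse_step(x_bar, t_rev, dt, T, y, posterior_score, rng):
    """One Euler-Maruyama step of the time reversal
    dXbar_t = (Xbar_t + 2 grad_x log q_{T-t}(Xbar_t | y)) dt + sqrt(2) dB_t."""
    drift = x_bar + 2.0 * posterior_score(T - t_rev, x_bar, y)
    return x_bar + drift * dt + np.sqrt(2.0 * dt) * rng.standard_normal(x_bar.shape)
```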

2. Posterior Score Estimation via Restricted Gaussian Oracle (RGO) and Monte Carlo

A central challenge is the estimation of the time-dependent posterior score

$$s_t(x; y) = \nabla_x\log q_t(x \mid y),$$

where $q_t(x \mid y)$ is the time-$t$ marginal of the forward process conditioned on $Y = y$. Applying a conditional Tweedie identity yields

$$s_t(x; y) = -\frac{x - \mu_t D(t, x, y)}{\sigma_t^2}, \quad D(t, x, y) = \mathbb{E}[X_0 \mid X_t = x,\, Y = y],$$

with $\mu_t = e^{-t}$ and $\sigma_t^2 = 1 - e^{-2t}$. The denoiser expectation $D(t, x, y)$ is defined under the "restricted Gaussian oracle" (RGO) density

$$p_t(x_0 \mid x, y) \propto \exp\left(-\frac{\|x - \mu_t x_0\|^2}{2\sigma_t^2} - \ell_y(x_0)\right)\pi_0(x_0).$$

Practically, $p_t(\cdot \mid x, y)$ is approximated by Monte Carlo sampling, specifically by generating samples from a Langevin SDE:

$$dX_{0,s} = \left(\nabla\log\pi_0(X_{0,s}) + \frac{\mu_t}{\sigma_t^2}\bigl(x - \mu_t X_{0,s}\bigr) - \nabla\ell_y(X_{0,s})\right)ds + \sqrt{2}\,dB_s, \quad s \in [0, S].$$

With the prior score replaced by a pretrained estimator $s_{\mathrm{prior}}$, the denoiser and score are estimated as

$$\widehat{D}_m^S(t, x, y) = \frac{1}{m}\sum_{i=1}^m X_{0,S,i}^{x, y, t}, \quad \widehat{s}_m^S(t, x, y) = -\frac{x - \mu_t\,\widehat{D}_m^S(t, x, y)}{\sigma_t^2}.$$
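
The inner Langevin/Monte Carlo estimator can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes vectorized callables `s_prior` (the pretrained prior score) and `grad_loss_y` (the gradient $\nabla\ell_y$ with the data $y$ baked in), and the chain initialization and step size `ds` are arbitrary choices.

```python
import numpy as np

def rgo_denoiser_and_score(x, t, s_prior, grad_loss_y, m, S, ds, rng):
    """Estimate D(t, x, y) = E[X_0 | X_t = x, Y = y] and the posterior score by
    running m Langevin chains targeting the RGO density p_t(x_0 | x, y)."""
    mu_t = np.exp(-t)
    sigma2_t = 1.0 - np.exp(-2.0 * t)

    # m parallel chains; starting them at the rescaled state x / mu_t is one simple choice.
    x0 = np.repeat((x / mu_t)[None, :], m, axis=0)
    for _ in range(int(S / ds)):
        drift = (s_prior(x0)                                     # approx. grad log pi_0
                 + (mu_t / sigma2_t) * (x[None, :] - mu_t * x0)  # Gaussian tether to x
                 - grad_loss_y(x0))                              # likelihood term, -grad ell_y
        x0 = x0 + drift * ds + np.sqrt(2.0 * ds) * rng.standard_normal(x0.shape)

    d_hat = x0.mean(axis=0)                      # \hat D_m^S(t, x, y)
    s_hat = -(x - mu_t * d_hat) / sigma2_t       # \hat s_m^S(t, x, y) via Tweedie
    return d_hat, s_hat
```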

3. Warm-Start Initialization and Three-Stage Sampling Procedure

Since $\pi_T(\cdot \mid y)$ may deviate from a standard Gaussian for small $T$, PDPS avoids bias by initializing the reverse chain with a "warm start." This is accomplished via an outer Langevin chain using the estimated posterior score:

$$dX_{T,u} = \widehat{s}_m^S(T, X_{T,u}, y)\,du + \sqrt{2}\,dB_u, \quad u \in [0, U],$$

ensuring $\bar{X}_0 \sim \pi_T(\cdot \mid y)$ approximately, given sufficient mixing time $U$. The complete algorithm entails: (i) a warm-start Langevin chain to sample $\bar{X}_0$, (ii) time-reversal diffusion using $\widehat{s}_m^S$ for $t \in [0, T - T_0]$, and (iii) a final scaling/denoising step.
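
The three stages can be strung together as in the following sketch, which assumes an estimated posterior score `score_hat(t, x, y)` such as the estimator of Section 2; the concrete form of the final denoising step here is an assumption based on the conditional Tweedie identity, and all step sizes are illustrative.

```python
import numpy as np

def pdps_sample(y, d, score_hat, T, T0, U, du, dt, rng):
    """Three-stage sketch: (i) warm-start Langevin at time T, (ii) reverse
    diffusion from T down to the early-stopping time T0, (iii) final denoising."""
    # (i) Warm start: outer Langevin chain targeting (approximately) pi_T(.|y).
    x = rng.standard_normal(d)
    for _ in range(int(U / du)):
        x = x + score_hat(T, x, y) * du + np.sqrt(2.0 * du) * rng.standard_normal(d)

    # (ii) Reverse diffusion: Euler-Maruyama on the time-reversed SDE over [0, T - T0].
    t_rev = 0.0
    while t_rev < T - T0:
        drift = x + 2.0 * score_hat(T - t_rev, x, y)
        x = x + drift * dt + np.sqrt(2.0 * dt) * rng.standard_normal(d)
        t_rev += dt

    # (iii) Early-stop correction at the residual time T0: one deterministic step
    # using the Tweedie identity, D = (x + sigma_T0^2 * s) / mu_T0.
    mu_T0 = np.exp(-T0)
    sigma2_T0 = 1.0 - np.exp(-2.0 * T0)
    return (x + sigma2_T0 * score_hat(T0, x, y)) / mu_T0
```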

4. Non-Asymptotic Convergence and Error Bounds

PDPS delivers rigorous non-asymptotic error bounds in the 2-Wasserstein distance:

$$W_2\bigl(\pi(\cdot \mid y),\,\widehat{\pi}\bigr) \leq C_1(T_0) + C_2\int_{T_0}^{T} \mathbb{E}\bigl\|\widehat{s}_m^S(t) - s_t\bigr\|^2\,dt + C_3\,e^{-U/C_{\mathrm{LSI}}},$$

where the error decomposes into early-stopping, score-estimation, and warm-start contributions. Under the following conditions:

  • the posterior $\pi(\cdot \mid y)$ is $\alpha$-semi-log-concave with sub-Gaussian tails (constant $V_{\mathrm{SG}}$),
  • the prior score-matching error is bounded by $\varepsilon_{\mathrm{prior}}$,
  • the conditioning parameter satisfies $\kappa_y < \infty$,

for any $\varepsilon > 0$, selecting

$$T_0 \sim \sqrt{\varepsilon}, \quad U \sim \log(1/\varepsilon), \quad m \sim \varepsilon^{-2}, \quad S \sim \log(1/\varepsilon)$$

yields a final error of

$$W_2\bigl(\pi(\cdot \mid y),\,\widehat{\pi}\bigr) \leq C\,\varepsilon^{1/4}\sqrt{\log(1/\varepsilon)},$$

with a constant $C$ that is polynomial in $(\kappa_y, \varepsilon_{\mathrm{prior}}, \alpha, V_{\mathrm{SG}})$ and dimension-free.
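
As a rough illustration of these rates, the hyperparameters can be scheduled directly from a target accuracy $\varepsilon$; the proportionality constants below are placeholders, not values taken from the paper.

```python
import numpy as np

def pdps_hyperparameters(eps, c_T0=1.0, c_U=1.0, c_m=1.0, c_S=1.0):
    """Schedule matching the stated rates T0 ~ sqrt(eps), U ~ log(1/eps),
    m ~ eps^(-2), S ~ log(1/eps); the constants c_* are illustrative placeholders."""
    T0 = c_T0 * np.sqrt(eps)
    U = c_U * np.log(1.0 / eps)
    m = int(np.ceil(c_m * eps ** -2))
    S = c_S * np.log(1.0 / eps)
    return T0, U, m, S

# Example: eps = 0.01 with unit constants gives T0 = 0.1, U = S ~ 4.6, m = 10000.
```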

5. Numerical Benchmarks: Image Deblurring and Robustness

PDPS was evaluated on FFHQ $64\times64$ images under Gaussian, motion, and nonlinear (GOPRO) blur. In all three scenarios, PDPS exceeded classical total-variation (TV) regularization and the earlier DPS method in both PSNR and SSIM (reported below as PSNR / SSIM):

| Method | Gaussian blur | Motion blur | Nonlinear blur |
|--------|---------------|-------------|----------------|
| TV     | 23.95 / 0.81  | 24.65 / 0.80 | 19.70 / 0.53  |
| DPS    | 24.15 / 0.81  | 26.66 / 0.88 | 20.93 / 0.68  |
| PDPS   | 26.42 / 0.87  | 28.86 / 0.92 | 28.44 / 0.91  |

PDPS produced reconstructions with finer texture and fewer artifacts, together with pixel-wise uncertainty maps. Robustness was confirmed under cross-dataset prior mismatch, with PDPS retaining a decisive PSNR/SSIM advantage.

6. Structural Design Considerations and Practical Implications

Key insights from PDPS highlight the necessity of:

  • Small diffusion time $T$: keeps the RGO target log-concave so that inner Langevin sampling is accurate, while $T$ must remain large enough for the log-Sobolev property that underpins the warm start.
  • Plug-and-play modularity: decoupled prior learning enables generalization to arbitrary likelihoods after the prior has been trained.
  • Three-stage procedure: combining the inner Langevin chain (RGO), the outer Langevin chain (warm start), and reverse diffusion admits non-asymptotic error control even for multimodal posteriors.
  • Computational scaling: the inner Langevin time $S$ and warm-start time $U$ grow only logarithmically with the target precision, while the Monte Carlo sample size $m$ grows polynomially, keeping the sampler practical for high-fidelity posterior inference.

7. Significance: Advances Over Previous Heuristic Methods

PDPS resolves several open issues in diffusion posterior sampling by (i) eliminating heuristic score and likelihood approximations, (ii) providing theoretical guarantees for convergence and uncertainty quantification, and (iii) demonstrating empirical improvements in both accuracy and robustness. The approach generalizes to inverse problems with complex forward models and arbitrary likelihoods, subject only to regularity and semi-log-concavity assumptions, setting a new standard in Bayesian inversion.

8. Outstanding Challenges and Future Directions

Open directions include further optimizing parallelization for large-scale inverse problems, extending multi-modal uncertainty quantification in non-log-concave settings, and integrating advanced score-matching estimators for settings with challenging priors. Analysis of lower bounds and computational barriers for worst-case priors remains salient, informed by hardness results established in cryptographic complexity (Gupta et al., 20 Feb 2024). The extension of PDPS methodology to hierarchical models and simulation-based inference is another promising avenue for future research.


The PDPS framework synthesizes data-driven priors, rigorous Monte Carlo posterior score estimation, and non-asymptotic transport analysis, yielding a theoretically founded, practical, and robust Bayesian inversion sampler (Chang et al., 8 Dec 2025).
