Diffusion Posterior Sampling Overview

Updated 28 June 2026
  • Diffusion posterior sampling is a framework that integrates trained diffusion models with likelihood guidance to perform Bayesian posterior inference in inverse problems.
  • It employs a reverse diffusion process augmented with measurement gradients, yielding sharp reconstructions while managing varying noise levels.
  • The approach is plug-and-play, leveraging pre-trained diffusion priors to flexibly handle different operators and noise regimes without task-specific retraining.

Diffusion posterior sampling (DPS) refers to a spectrum of methodologies that utilize diffusion-based generative models to perform posterior inference in inverse problems, where the target is sampling from $p(x \mid y) \propto p(y \mid x)\, p(x)$. These methods combine a learned diffusion prior with explicit or implicit likelihood guidance, enabling flexible, plug-and-play solutions to imaging and other high-dimensional Bayesian inverse tasks. DPS distinguishes itself from traditional MCMC and purely variational approaches by leveraging the strong data priors encoded in modern diffusion models, and by accommodating a wide variety of degradation operators and noise statistics without task-specific retraining.

1. Core Principles and Mathematical Formulation

The foundational setup in DPS considers an unknown signal $x_0$ observed through a forward operator $H$ and additive noise $n$ as $y = H x_0 + n,\ n \sim \mathcal{N}(0, \sigma^2 I)$. The resulting posterior has the form

$$p(x \mid y) \propto \exp\left(-\frac{1}{2\sigma^2} \|y - H x\|_2^2 \right) p(x),$$

where $p(x)$ is implicitly represented by a pre-trained score-based diffusion model. Sampling from $p(x \mid y)$ thus requires combining the diffusion prior with a likelihood-guided correction.

DPS operates by augmenting the standard reverse-time stochastic differential equation (SDE) or discrete Markov chain of the diffusion model with a measurement-consistency term:

$$x_{t-1} = x_t + \alpha_t s_\theta(x_t, t) + \rho \beta_t \nabla_x \log p(y \mid x_t) + \sqrt{\gamma_t}\, z_t,$$

where $s_\theta$ is the trained score (denoising) network, $\alpha_t$, $\beta_t$, and $\gamma_t$ are time-dependent coefficients, $\rho$ is a "guidance scale" balancing prior and data fidelity, and $z_t$ represents injected noise. For Gaussian noise $n$, the log-likelihood gradient is explicit: $\nabla_x \log p(y \mid x_t) = \frac{1}{\sigma^2} H^\top (y - H x_t)$ (Syarubany, 25 Dec 2025).
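As a sanity check on the Gaussian case, the closed-form gradient $H^\top (y - H x)/\sigma^2$ follows directly from differentiating the exponent of the posterior above. The NumPy sketch below compares it against central finite differences; the operator size, seed, and noise level are arbitrary assumptions chosen only for illustration.

```python
import numpy as np


def gaussian_loglik(x, y, H, sigma):
    """log p(y | x) up to an additive constant, for y = H x + n, n ~ N(0, sigma^2 I)."""
    r = y - H @ x
    return -0.5 * float(r @ r) / sigma**2


def gaussian_loglik_grad(x, y, H, sigma):
    """Gradient of the log-likelihood with respect to x: H^T (y - H x) / sigma^2."""
    return H.T @ (y - H @ x) / sigma**2


def finite_difference_grad(f, x, eps=1e-6):
    """Central finite differences, used only to check the closed form."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g


rng = np.random.default_rng(0)
H = rng.standard_normal((3, 5))   # hypothetical 3x5 forward operator
x = rng.standard_normal(5)        # point at which the gradient is evaluated
sigma = 0.5                       # hypothetical noise standard deviation
y = H @ rng.standard_normal(5) + sigma * rng.standard_normal(3)

analytic = gaussian_loglik_grad(x, y, H, sigma)
numeric = finite_difference_grad(lambda v: gaussian_loglik(v, y, H, sigma), x)
max_abs_diff = float(np.max(np.abs(analytic - numeric)))
```

Because the log-likelihood is quadratic in $x$, the two gradients should agree up to floating-point rounding.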

In general, the reverse SDE or DDPM update is conditioned on both the measurement $y$ and the current iterate $x_t$, often using Tweedie's formula to estimate the posterior mean of the clean signal given $x_t$.
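For reference, Tweedie's formula can be written out under an assumed Gaussian forward kernel. The coefficients $a_t$ and $b_t$ below are notation introduced here for illustration only (they are not the step coefficients of the update above), and the score network $s_\theta$ is assumed to approximate $\nabla_{x_t} \log p_t(x_t)$.

```latex
% Assumed forward kernel: x_t = a_t x_0 + b_t \epsilon, with \epsilon \sim \mathcal{N}(0, I)
\hat{x}_0(x_t) = \mathbb{E}[x_0 \mid x_t]
  = \frac{1}{a_t}\left( x_t + b_t^2 \, \nabla_{x_t} \log p_t(x_t) \right)
  \approx \frac{1}{a_t}\left( x_t + b_t^2 \, s_\theta(x_t, t) \right)
```

To our understanding, a common use of this estimate, as in the construction of Chung et al. (2022), is to approximate the intractable $p(y \mid x_t)$ by $p(y \mid \hat{x}_0(x_t))$, so that the guidance gradient is taken through the denoiser.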

2. Algorithmic Construction and Conditioning Strategies

A typical DPS algorithm proceeds as follows:

  • Initialization: Begin with $x_T$ corresponding to pure noise.
  • Reverse Diffusion with Likelihood Guidance: For each time step $t = T, \dots, 1$:

    1. Compute the prior-driven denoising step.
    2. Calculate the measurement likelihood gradient at the current $x_t$.
    3. Form $x_{t-1}$ by combining the denoising and measurement gradients with stochasticity.
  • Parameter Selection: The guidance scale $\rho$ and the noise standard deviation $\sigma$ are tuned to balance data fidelity and stability. Excessive $\rho$ may induce artifacts, while insufficient $\rho$ leads to under-enforced measurement consistency (Syarubany, 25 Dec 2025).
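The steps above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the cited implementation: the trained network is replaced by the analytic score of a Gaussian prior, the noise schedule is a hypothetical variance-exploding one, the guidance coefficient is a hypothetical constant normalized by $\sigma^2 / \|H\|_2^2$ for stability, and the operator is a toy 2x averaging downsampler.

```python
import numpy as np


def prior_score(x, sigma_t, tau=1.0):
    """Analytic score of the noised prior N(0, (tau^2 + sigma_t^2) I).

    Stands in for the trained network s_theta (toy assumption)."""
    return -x / (tau**2 + sigma_t**2)


def likelihood_grad(x, y, H, sigma):
    """Gaussian log-likelihood gradient H^T (y - H x) / sigma^2."""
    return H.T @ (y - H @ x) / sigma**2


def dps_sample(y, H, sigma, rho=0.95, num_steps=200, seed=0):
    """Run the guided reverse chain from pure noise x_T down to x_0."""
    rng = np.random.default_rng(seed)
    # Hypothetical variance-exploding schedule, sched[0] < ... < sched[num_steps].
    sched = np.geomspace(0.01, 10.0, num_steps + 1)
    # Hypothetical guidance coefficient, normalized so the guidance step is damped.
    beta = sigma**2 / np.linalg.norm(H, 2) ** 2
    x = sched[-1] * rng.standard_normal(H.shape[1])  # initialization: pure noise
    for t in range(num_steps, 0, -1):
        alpha_t = sched[t] ** 2 - sched[t - 1] ** 2  # score step size
        gamma_t = alpha_t                            # injected-noise variance
        z = rng.standard_normal(x.shape)
        x = (x
             + alpha_t * prior_score(x, sched[t])            # 1. prior-driven denoising
             + rho * beta * likelihood_grad(x, y, H, sigma)  # 2. likelihood guidance
             + np.sqrt(gamma_t) * z)                         # 3. stochasticity
    return x


H = np.kron(np.eye(4), np.full((1, 2), 0.5))  # toy 2x averaging downsampler, 8 -> 4
rng = np.random.default_rng(1)
x_true = rng.standard_normal(8)
sigma = 0.01
y = H @ x_true + sigma * rng.standard_normal(4)

x_hat = dps_sample(y, H, sigma)
residual = float(np.linalg.norm(y - H @ x_hat))
```

In this toy setting the measurement residual should end up small, while the components of `x_hat` that the downsampler cannot observe are determined by the prior score and the injected noise.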

Alternative conditioning mechanisms include:

  • Manifold-Constrained Gradient (MCG): Enforces measurement consistency by a hard projection after each step; can amplify high-frequency noise under additive noise models (Syarubany, 25 Dec 2025).
  • Annealed Guidance: Varies $\rho$ over diffusion time to accommodate different regularization strengths at different scales; excessive smoothness in scheduling can under-enforce consistency (Syarubany, 25 Dec 2025).
  • Direct Likelihood Approximations: For non-Gaussian or nonlinear measurements (e.g., Poisson or nonlinear tomography), the measurement gradient is formulated via chain rule using Jacobians of the conditional denoiser (Li et al., 2023).
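Two of these mechanisms are easy to illustrate for a linear operator. The sketch below shows a plain pseudo-inverse projection onto the measurement-consistent set (a simplified stand-in for a hard projection step, not the full MCG algorithm) and a hypothetical linear schedule for the guidance scale; the function names and the schedule endpoints are assumptions made for this example.

```python
import numpy as np


def project_onto_measurements(x, y, H):
    """Hard consistency step: the closest point to x (in L2) satisfying H x = y.

    Assumes H has full row rank, so that H @ pinv(H) is the identity."""
    return x - np.linalg.pinv(H) @ (H @ x - y)


def annealed_scale(t, num_steps, rho_start=0.2, rho_end=0.95):
    """Hypothetical linear schedule: rho_start at t = num_steps, rho_end at t = 0."""
    frac = 1.0 - t / num_steps
    return rho_start + (rho_end - rho_start) * frac
```

Because the projection enforces $Hx = y$ exactly, any noise in $y$ is copied into the estimate through the pseudo-inverse, which is one way to see why hard projection can amplify noise under additive noise models.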

3. Theoretical Properties and Practical Performance

DPS provides a flexible framework in which the pre-trained prior remains untouched for each inverse problem, allowing zero-shot application to new operators and noise statistics. Empirical ablations demonstrate:

  • Optimal performance at moderate guidance scales: On super-resolution with additive Gaussian noise, the best performance is achieved at $\rho = 0.95$ and $\sigma = 0.01$, with a combined metric score of $1.45231$ (Syarubany, 25 Dec 2025).
  • Qualitative Structure Restoration: With optimal parameters, DPS reconstructs sharp edges and coherent mid-frequency details that are suppressed in the downsampled inputs. Alternative methods either introduce oscillatory artifacts or struggle with texture fidelity.
  • Sensitivity to Noise and Guidance Hyperparameters: Larger $\sigma$ attenuates the measurement signal and degrades PSNR/SSIM; too small a $\sigma$ can cause overfitting. Overly large $\rho$ produces instabilities and visual artifacts (Syarubany, 25 Dec 2025).

In comparison to methods involving hard projection or scheduled annealing, DPS with a fixed, moderate guidance scale achieves a better trade-off between high-frequency restoration and overall image quality in standard DDPM settings.

4. Extensions and Application Domains

DPS has demonstrated competitive or superior results in:

  • Single and Multi-measurement Bayesian inverse problems: Including super-resolution, CT/MRI reconstruction, deblurring, phase retrieval, and inpainting (Syarubany, 25 Dec 2025, Chung et al., 2022, Li et al., 2023).
  • Nonlinear and Non-Gaussian Forward Models: DPS seamlessly extends to nonlinear operators and signal-dependent noise regimes (Poisson), using backpropagation through neural decoders and auxiliary chain rules for measurement gradient calculation (Li et al., 2023).
  • Plug-and-Play Inference: The framework requires no retraining of the diffusion model for each measurement operator or noise model; measurement-conditioning is implemented at inference time (Syarubany, 25 Dec 2025, Chung et al., 2022).
  • Algorithmic Variants: DPS serves as a building block for advanced posterior inference techniques such as simulation-based inference in tall-data settings (with score model aggregation), sequential/temporal inverse problems (with transition models), and hybrid Langevin-diffusion samplers for enhanced MCMC mixing (Linhart et al., 2024, Stevens et al., 2024, Zhao, 1 Jun 2025).
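To make the non-Gaussian case concrete, the sketch below works out the measurement-model part of the chain rule for Poisson noise with a linear operator, where $y_i \sim \mathrm{Poisson}((Hx)_i)$ and the gradient is $H^\top (y / (Hx) - 1)$. It covers only the likelihood term; backpropagation through a denoiser or decoder is not shown, and the operator, sizes, and seed are assumptions for illustration.

```python
import numpy as np


def poisson_loglik(x, y, H):
    """log p(y | x) up to a constant, for y_i ~ Poisson((H x)_i) with H x > 0."""
    rate = H @ x
    return float(np.sum(y * np.log(rate) - rate))


def poisson_loglik_grad(x, y, H):
    """Chain rule: gradient in x is H^T applied to d/d(rate) = y / rate - 1."""
    rate = H @ x
    return H.T @ (y / rate - 1.0)


rng = np.random.default_rng(0)
H = rng.uniform(0.1, 1.0, size=(4, 6))   # hypothetical positive operator
x = rng.uniform(1.0, 5.0, size=6)        # positive signal, so all rates are positive
y = rng.poisson(H @ x).astype(float)     # simulated photon counts

eps = 1e-6
numeric = np.array([
    (poisson_loglik(x + eps * e, y, H) - poisson_loglik(x - eps * e, y, H)) / (2 * eps)
    for e in np.eye(6)
])
analytic = poisson_loglik_grad(x, y, H)
```

Unlike the Gaussian case, the gradient weights each residual by the inverse of the predicted rate, which reflects the signal-dependent noise level.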

5. Open Problems, Limitations, and Practical Recommendations

Despite its generality, DPS is subject to several practical and theoretical considerations:

  • Operator and Noise Model Match: Careful tuning of $\sigma$ is required to avoid under- or overfitting to the measured data in mismatched scenarios (Syarubany, 25 Dec 2025).
  • Choice of Guidance Scale $\rho$: Empirical evidence recommends values just below one (e.g., $\rho = 0.95$) to maximize reconstruction fidelity while maintaining stability (Syarubany, 25 Dec 2025).
  • Limited Posterior Diversity: Standard DPS may behave like a MAP estimator, consistently producing sharp, high-quality samples with limited diversity rather than true posterior draws. This effect has been observed and quantified empirically, and algorithms incorporating additional randomness or explicit posterior sampling steps are needed for proper uncertainty quantification (Xu et al., 31 Jan 2025).
  • No Retraining, but Potential Error Accumulation: Blending prior and measurement-gradient guidance avoids the need for specialized training, but may accumulate errors or lead to suboptimal sample diversity, particularly in highly ill-posed or nonidentifiable inverse problems (Syarubany, 25 Dec 2025, Chung et al., 2022).
  • Practical Guidance:
    • Tune hyperparameters to maximize task-specific fidelity metrics.
    • Begin with guidance scale $\rho$ just below one and noise level $\sigma$ set to sensor characteristics.
    • Avoid retraining priors; leverage DPS as an inference-time plug-and-play method with clear diagnostic ablation (Syarubany, 25 Dec 2025).
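One possible diagnostic for the diversity concern is to test a sampler on a problem whose posterior is known in closed form. The sketch below assumes a Gaussian prior $x \sim \mathcal{N}(0, \tau^2 I)$ with a linear Gaussian measurement, for which the posterior is Gaussian, and compares the total variance of a sample set against the analytic value. The helper names, the toy operator, and the choice of total variance as the statistic are assumptions for this example.

```python
import numpy as np


def gaussian_posterior(y, H, sigma, tau=1.0):
    """Exact posterior mean and covariance for x ~ N(0, tau^2 I), y = H x + n."""
    n = H.shape[1]
    precision = np.eye(n) / tau**2 + H.T @ H / sigma**2
    cov = np.linalg.inv(precision)
    cov = 0.5 * (cov + cov.T)  # remove rounding asymmetry
    mean = cov @ H.T @ y / sigma**2
    return mean, cov


def spread_ratio(samples, cov):
    """Empirical total variance divided by the analytic one.

    Values well below 1 suggest the sampler has collapsed toward a point estimate."""
    empirical = np.var(samples, axis=0, ddof=1).sum()
    return float(empirical / np.trace(cov))


H = np.kron(np.eye(4), np.full((1, 2), 0.5))  # toy 2x averaging downsampler
sigma = 0.1
rng = np.random.default_rng(0)
y = H @ rng.standard_normal(8) + sigma * rng.standard_normal(4)

mean, cov = gaussian_posterior(y, H, sigma)
exact = rng.multivariate_normal(mean, cov, size=20000)  # true posterior draws
collapsed = np.tile(mean, (100, 1))                     # a MAP-like degenerate "sampler"

ratio_exact = spread_ratio(exact, cov)
ratio_collapsed = spread_ratio(collapsed, cov)
```

Samples from a candidate sampler can be passed to `spread_ratio` in place of `exact`; a ratio close to one is consistent with correct posterior spread in this toy setting, although it does not by itself certify the sampler.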

6. Representative Quantitative Results

| PS-scale $\rho$ | Noise $\sigma$ | Combined Score (PSNR/40 + SSIM) |
|---|---|---|
| 0.95 | 0.01 | 1.45231 (best) |
| 0.90 | 0.01 | 1.44452 |
| 0.80 | 0.01 | 1.42857 |
| 0.50 | 0.05 | 1.32122 |
| 0.20 | 0.05 | 1.16456 |

Guidance scales just below one and low observation noise yield the best combined metric. Decreasing $\rho$ or increasing $\sigma$ degrades both distortion and perceptual quality (Syarubany, 25 Dec 2025).
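The combined score in the table header, PSNR/40 + SSIM, can be computed as sketched below. The PSNR definition assumes images scaled to a known `data_range`, and the SSIM value is taken as an input rather than computed here, since the source does not specify which SSIM implementation was used.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    mse = float(np.mean(diff**2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range**2 / mse))


def combined_score(psnr_db, ssim):
    """Combined metric from the table header: PSNR / 40 + SSIM."""
    return psnr_db / 40.0 + ssim
```

Under this definition a PSNR of 40 dB contributes 1.0 to the score, so the two terms are on comparable scales for typical reconstruction quality.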

7. Impact and Outlook

Diffusion posterior sampling has established itself as a highly effective paradigm in computational imaging and Bayesian inverse problems, enabling high-quality reconstructions in challenging regimes and broadening the applicability of diffusion models as plug-and-play priors. By balancing denoising prior information and explicit likelihood constraints, DPS offers a principled yet flexible alternative to classic MAP optimization or MCMC approaches, without the need for retraining on each new measurement scenario.

Significant ongoing research aims at improving posterior sample diversity, establishing theoretical error bounds, and generalizing DPS frameworks to settings involving nonlinearities, non-Gaussian noise, and high-dimensional, multimodal posteriors. Recent empirical and theoretical analyses clarify both the strengths and subtle limitations of DPS, especially its tendency toward MAP-like solutions and its sensitivity to guidance scale selection, providing directions for next-generation diffusion-based Bayesian inference (Syarubany, 25 Dec 2025, Xu et al., 31 Jan 2025).
