
Score-Based Diffusion Models

Updated 3 June 2026
  • Score-Based Diffusion Models are generative models that learn the gradient of the log-density of progressively noised data, enabling effective reverse diffusion sampling.
  • They extend methods like DDPMs and annealed Langevin dynamics to achieve state-of-the-art performance in image, audio, inverse imaging, and molecular sampling.
  • The framework leverages SDE/ODE formulations, score matching, and advanced calculus techniques to provide robust theoretical guarantees and scalable algorithms.

Score-Based Diffusion Models (SBDMs) are a framework for generative modeling in which one learns the score—the gradient of the log-density—of progressively noised versions of a data distribution, and then synthesizes new samples by approximately reversing this diffusion process via stochastic (SDE) or deterministic (ODE) dynamics. This approach encompasses and extends denoising diffusion probabilistic models (DDPMs), annealed Langevin dynamics, and recent diffusion normalizing flows. SBDMs have recently achieved state-of-the-art performance across a wide range of tasks, including image, audio, conditional generation, inverse imaging, Bayesian inference, molecular sampling, and high-dimensional function-space modeling (Tang et al., 2024, Song et al., 2021, Mirafzali et al., 21 Mar 2025, Lim et al., 2023, Mirafzali et al., 27 Aug 2025, Hagemann et al., 2023).

1. Mathematical Formulation and Key Components

The SBDM framework is characterized by three central components:

  • Forward Diffusion SDE: A (possibly infinite-dimensional) stochastic differential equation (SDE) that gradually transforms a data sample $x_0 \sim p_{\text{data}}$ into pure noise:

$$d x_t = f(x_t, t)\, dt + g(t)\, d w_t, \quad x_0 \sim p_{\text{data}}$$

where $f$ is a drift (often mean-reverting towards 0), $g$ is a diffusion coefficient, and $w_t$ is standard Brownian motion. For function-valued data or solutions to PDEs, the process can be formulated as a linear SPDE on a Hilbert space, e.g., $du(t) = A u(t)\,dt + Q^{1/2}\,dW_t$ with operator-theoretic diffusion (Mirafzali et al., 27 Aug 2025, Hagemann et al., 2023).

  • Score Function and Score Matching: For each $t$, the score $\nabla_x \log p_t(x)$ of the noise-perturbed density $p_t(x)$ is approximated by a neural network (or operator network in infinite dimensions) $s_\theta(x,t)$, trained to minimize the expected squared difference between $s_\theta(x,t)$ and the true score. Denoising score matching (DSM) leverages access to the conditional transition $p_t(x_t \mid x_0)$, often Gaussian, yielding a tractable, closed-form training loss (Tang et al., 2024).
  • Reverse-Time SDE (and Probability-Flow ODE): The time-reversal of the forward SDE induces a drift involving the (unknown) score:

$$d x_t = \left[ f(x_t, t) - g(t)^2\, \nabla_x \log p_t(x_t) \right] dt + g(t)\, d\bar{w}_t$$

Here $\bar{w}_t$ denotes a Brownian motion running backward in time. The learned score $s_\theta(x,t)$ is substituted for $\nabla_x \log p_t(x)$ for sampling. The probability-flow ODE replaces the SDE by a deterministic flow:

$$\frac{d x_t}{dt} = f(x_t, t) - \tfrac{1}{2}\, g(t)^2\, \nabla_x \log p_t(x_t)$$

Solving the ODE exactly enables likelihood evaluation and a deterministic mapping between latent and data spaces (Tang et al., 2024, Song et al., 2021).
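As a minimal sketch of reverse-time SDE sampling, the following NumPy example runs Euler–Maruyama backward in time on a one-dimensional toy problem. Everything specific here is an assumption made for illustration: the variance-preserving drift $f(x,t) = -\tfrac{1}{2}\beta x$ with constant $g = \sqrt{\beta}$, the Gaussian data distribution, and the closed-form `true_score`, which stands in for a trained network $s_\theta$.

```python
import numpy as np

BETA, T = 1.0, 8.0        # hypothetical constant noise rate and time horizon
M0, S0 = 2.0, 0.5         # toy data distribution N(M0, S0^2)

def true_score(x, t):
    """Exact score of p_t for Gaussian data under dx = -0.5*BETA*x dt + sqrt(BETA) dw."""
    a = np.exp(-0.5 * BETA * t)              # x_t | x_0 ~ N(a * x_0, 1 - a^2)
    var = a**2 * S0**2 + 1.0 - a**2          # variance of the marginal p_t
    return -(x - a * M0) / var

def reverse_sde_sample(n, steps=800, seed=0):
    """Euler-Maruyama discretization of the reverse-time SDE, integrated from T to 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = rng.standard_normal(n)               # start from the prior N(0, 1)
    for i in range(steps):
        t = T - i * dt
        drift = -0.5 * BETA * x - BETA * true_score(x, t)   # f - g^2 * score
        x = x - drift * dt + np.sqrt(BETA * dt) * rng.standard_normal(n)
    return x

samples = reverse_sde_sample(20000)
print(samples.mean(), samples.std())         # should be close to M0 and S0
```

With a learned model, `true_score` would be replaced by the network evaluation; the sampler itself is unchanged.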

2. Score Matching, Likelihood Training, and Sampling

The canonical training procedure minimizes a weighted Fisher divergence (score-matching loss) with a positive weighting function $\lambda(t)$, whose minimizer coincides with the true score function:

$$\mathcal{L}_{\text{SM}}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,T]}\, \lambda(t)\, \mathbb{E}_{x_t \sim p_t} \left[ \left\| s_\theta(x_t, t) - \nabla_{x_t} \log p_t(x_t) \right\|^2 \right]$$

Denoising score matching (DSM) replaces the inaccessible $\nabla_{x_t} \log p_t(x_t)$ with the tractable conditional score $\nabla_{x_t} \log p_t(x_t \mid x_0)$:

$$\mathcal{L}_{\text{DSM}}(\theta) = \mathbb{E}_{t \sim \mathcal{U}[0,T]}\, \lambda(t)\, \mathbb{E}_{x_0 \sim p_{\text{data}}}\, \mathbb{E}_{x_t \sim p_t(\cdot \mid x_0)} \left[ \left\| s_\theta(x_t, t) - \nabla_{x_t} \log p_t(x_t \mid x_0) \right\|^2 \right]$$

For Gaussian forward kernels, this reduces to a simple regression against known targets that depend only on $x_0$ and the added noise.
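To make the Gaussian case concrete: for a kernel $x_t \mid x_0 \sim \mathcal{N}(a\,x_0, \sigma^2)$, the conditional score is $-(x_t - a\,x_0)/\sigma^2 = -\varepsilon/\sigma$, where $\varepsilon$ is the standard normal noise that was added. The sketch below checks numerically that regressing on this target recovers the marginal score. The values of `a` and `sigma`, the Gaussian toy data, and the linear score model are illustrative assumptions; a linear model suffices only because the true score of a Gaussian is linear.

```python
import numpy as np

rng = np.random.default_rng(0)
m0, s0 = 2.0, 0.5          # toy data distribution N(m0, s0^2)
a, sigma = 0.8, 0.6        # hypothetical kernel: x_t | x_0 ~ N(a * x_0, sigma^2)

x0 = m0 + s0 * rng.standard_normal(200_000)
eps = rng.standard_normal(x0.shape)
xt = a * x0 + sigma * eps

# DSM regression target: score of the Gaussian kernel, -(x_t - a*x_0) / sigma^2
target = -eps / sigma

# Fit a linear score model s(x) = w * x + b by least squares (the DSM objective)
design = np.stack([xt, np.ones_like(xt)], axis=1)
(w, b), *_ = np.linalg.lstsq(design, target, rcond=None)

# Exact marginal score of p_t = N(a * m0, a^2 * s0^2 + sigma^2) for comparison
var_t = a**2 * s0**2 + sigma**2
w_true, b_true = -1.0 / var_t, a * m0 / var_t
print(w, w_true, b, b_true)
```

The fitted coefficients approach the marginal score even though the regression target only ever uses the conditional score, which is the property that makes DSM trainable without access to $p_t$.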

Likelihood-based training is possible via the continuous normalizing flow view, using the probability-flow ODE and the instantaneous change-of-variables formula. Choosing the likelihood weighting $\lambda(t) = g(t)^2$ ensures that the SDE-based score-matching objective upper-bounds the negative log-likelihood (NLL) up to a constant independent of $\theta$, yielding density estimators whose likelihoods match those of autoregressive models (Song et al., 2021).
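The instantaneous change-of-variables formula gives $\log p_0(x_0) = \log p_T(x_T) + \int_0^T \nabla \cdot \tilde{f}(x_t, t)\, dt$, where $\tilde{f}$ is the probability-flow drift. The following sketch evaluates this on a one-dimensional toy problem and compares it with the exact log-density. The constant-$\beta$ variance-preserving SDE, the Gaussian data, and the analytic score and divergence are assumptions chosen so the result can be checked; with a neural score the divergence would typically be estimated rather than written in closed form.

```python
import numpy as np

BETA, T = 1.0, 8.0        # hypothetical constant noise rate and time horizon
M0, S0 = 2.0, 0.5         # toy data distribution N(M0, S0^2)

def marginal(t):
    """Mean and variance of p_t for Gaussian data under the VP SDE."""
    a = np.exp(-0.5 * BETA * t)
    return a * M0, a**2 * S0**2 + 1.0 - a**2

def log_normal(x, mean, var):
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def rhs(state, t):
    """Joint dynamics of (x_t, accumulated divergence) along the probability-flow ODE."""
    x, _ = state
    mean, var = marginal(t)
    score = -(x - mean) / var
    dx = -0.5 * BETA * x - 0.5 * BETA * score    # f - 0.5 * g^2 * score
    div = -0.5 * BETA + 0.5 * BETA / var         # derivative of dx with respect to x
    return np.array([dx, div])

def log_likelihood(x0, steps=2000):
    """Integrate forward from t=0 to t=T with classical RK4."""
    state = np.array([x0, 0.0])
    dt = T / steps
    for i in range(steps):
        t = i * dt
        k1 = rhs(state, t)
        k2 = rhs(state + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = rhs(state + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = rhs(state + dt * k3, t + dt)
        state = state + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
    x_T, int_div = state
    return log_normal(x_T, *marginal(T)) + int_div

estimate = log_likelihood(2.5)
exact = log_normal(2.5, M0, S0**2)
print(estimate, exact)
```

Here the terminal density is evaluated with the exact marginal at time $T$; in practice it is replaced by the prior, which adds a small error that shrinks as $T$ grows.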

Sampling employs discretized SDE or ODE solvers (Euler–Maruyama, predictor–corrector, Runge–Kutta). ODE-based sampling, or one-shot “consistency models”, can produce competitive sample quality in fewer steps, but may involve more complex training (Tang et al., 2024, Na et al., 2024).
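As an illustration of deterministic sampling with a small step budget, the sketch below integrates the probability-flow ODE backward in time with Heun's second-order method (one member of the Runge–Kutta family) using 50 steps. The toy setup is assumed for illustration only: a constant-$\beta$ variance-preserving SDE, Gaussian data, and an analytic `true_score` in place of a trained network.

```python
import numpy as np

BETA, T = 1.0, 8.0        # hypothetical constant noise rate and time horizon
M0, S0 = 2.0, 0.5         # toy data distribution N(M0, S0^2)

def true_score(x, t):
    a = np.exp(-0.5 * BETA * t)
    return -(x - a * M0) / (a**2 * S0**2 + 1.0 - a**2)

def ode_drift(x, t):
    # Probability-flow drift: f - 0.5 * g^2 * score
    return -0.5 * BETA * x - 0.5 * BETA * true_score(x, t)

def heun_ode_sample(n, steps=50, seed=0):
    """Heun (predictor plus trapezoidal corrector) integration from T down to 0."""
    rng = np.random.default_rng(seed)
    dt = T / steps
    x = rng.standard_normal(n)               # latent draw from the prior N(0, 1)
    for i in range(steps):
        t = T - i * dt
        k1 = ode_drift(x, t)
        x_pred = x - dt * k1                 # Euler predictor, stepping backward in time
        k2 = ode_drift(x_pred, t - dt)
        x = x - 0.5 * dt * (k1 + k2)         # trapezoidal corrector
    return x

samples = heun_ode_sample(20000)
print(samples.mean(), samples.std())         # roughly M0 and S0
```

Because the map from the latent draw to the sample is deterministic, the only randomness is in the initial noise; the step count trades accuracy against cost.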

3. Extensions: Function Space, Infinite Dimensions, and Operator-Valued Models

Recent work has rigorously extended SBDMs to function spaces and infinite-dimensional Hilbert spaces—essential for scientific computing, inverse problems, and modeling of PDE solutions (Lim et al., 2023, Hagemann et al., 2023, Mirafzali et al., 27 Aug 2025, Baker et al., 28 Jan 2026). Key advances include:

  • Infinite-Dimensional Forward Diffusion: For $u$ taking values in a Hilbert space $H$, define an SPDE $du(t) = A u(t)\,dt + Q^{1/2}\,dW_t$ with $Q$ trace-class, preserving spatially correlated (colored) noise and well-posedness in arbitrary dimensions (Mirafzali et al., 27 Aug 2025).
  • Closed-Form Infinite-Dimensional Score: Via infinite-dimensional Malliavin calculus and Bismut–Elworthy–Li formulas, exact expressions for the Fréchet derivative of the log-density are obtained, avoiding finite-dimensional projections, with explicit formulas for the Malliavin covariance operator (Mirafzali et al., 27 Aug 2025, Mirafzali et al., 21 Mar 2025).

  • Operator-Valued Networks and Multilevel Training: Approximation of the infinite-dimensional score is accomplished via Fourier Neural Operators or multilevel U-Net-style operator networks, structured for mesh-independent generalization (Hagemann et al., 2023). A telescopic training loss ensures convergence and adapts across spatial resolutions.
  • Posterior Conditioning and Guidance: In Bayesian inverse problems, infinite-dimensional extensions of Doob's h-transform enable conditioning SBDMs on observations. The conditional score decomposes as $\nabla_x \log p_t(x \mid y) = \nabla_x \log p_t(x) + \nabla_x \log h_t(x)$, i.e., the unconditional score plus a guidance term, and simulation-free supervised guidance training recovers the guidance term for posterior sampling (Baker et al., 28 Jan 2026).
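To illustrate what a trace-class covariance buys in the forward process, the sketch below perturbs a function through a truncated spectral expansion. All specifics are assumptions for illustration and are not taken from the cited papers: the domain $[0,1]$ with a sine eigenbasis, eigenvalues $q_k = k^{-2}$ for $Q$, the drift operator $A = -\tfrac{1}{2} I$, and the truncation at `K` modes. Under these choices each spectral coefficient is an independent Ornstein–Uhlenbeck process with a closed-form transition.

```python
import numpy as np

K = 64                                     # number of retained modes (truncation)
k = np.arange(1, K + 1)
q = 1.0 / k**2                             # eigenvalues of Q; the sum converges (trace-class)
grid = np.linspace(0.0, 1.0, 129)
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(k, grid))   # orthonormal sine modes on the grid

def perturb(u0_coef, t, rng):
    """Sample coefficients of u(t) given u(0) for du = -0.5 * u dt + Q^{1/2} dW."""
    decay = np.exp(-0.5 * t)
    noise_std = np.sqrt(q * (1.0 - np.exp(-t)))            # per-mode OU standard deviation
    return decay * u0_coef + noise_std * rng.standard_normal(u0_coef.shape)

rng = np.random.default_rng(0)
coef = perturb(np.zeros((20000, K)), t=1.0, rng=rng)
u = coef[0] @ basis                        # one noisy function evaluated on the grid

# Expected squared L2 norm of the noise equals the (scaled) trace of Q, which is finite
mean_sq_norm = np.mean(np.sum(coef**2, axis=1))
expected = q.sum() * (1.0 - np.exp(-1.0))
print(mean_sq_norm, expected)
```

With white noise ($q_k = 1$) the same quantity would grow linearly in `K`, which is why spatially correlated noise is needed for the process to remain well defined as the resolution increases.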

4. Practical Algorithms, Theoretical Guarantees, and Variants

SBDMs admit a spectrum of algorithmic and theoretical refinements:

  • Score Decomposition and Manifold Optimization: Recent models decompose the score into normal (denoising) and tangent (content refinement) directions on reference manifolds, facilitating Pareto-efficient, multi-objective image-to-image translation (Sun et al., 2023).
  • Flexible Forward SDEs: Beyond fixed SDEs, the forward process can be parameterized by a position-dependent Riemannian metric and Hamiltonian/symplectic drift, guaranteeing normalizable stationary laws and allowing for data-adaptive geometries (Du et al., 2022).
  • Dimension-Free Sample Complexity and Variance Reduction: It is possible to learn a single score network across timesteps with nearly dimension-free generalization, proven via martingale error decompositions and variance-minimizing bootstrapped targets (Kumar et al., 14 Feb 2025). When the data lie near a $k$-dimensional manifold in $\mathbb{R}^d$, careful scheduling and coefficient design enable discretization error bounds scaling with $k$ rather than $d$ (Li et al., 2024).
  • Malliavin Calculus for Score Computation: Analytical score formulas via Malliavin calculus coincide with the Fokker–Planck solution for linear SDEs and generalize to nonlinear, state-independent cases, lowering estimator variance in highly nonlinear or multimodal settings (Mirafzali et al., 21 Mar 2025).
  • Reward-Directed and RL-Tuned Diffusion: Treating score selection as a control policy allows reinforcement learning-based fine-tuning for reward maximization under entropy regularization. The optimal stochastic policy is always Gaussian, with closed-form mean and covariance, and practical estimation is achieved via actor-critic q-learning (Gao et al., 2024, Tang et al., 2024).
  • Posterior Inference and Inverse Problems: SBDMs serve as powerful priors for Bayesian image reconstruction and general inverse problems. Inference combines SDE sampling with measurement-gradient conditioning, variational flows (DPI), or projection steps for data consistency (McCann et al., 2023, Feng et al., 2023, Chung et al., 2021).
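The measurement-gradient idea in the last item can be sketched on a linear-Gaussian toy problem, where the posterior is known in closed form. This is a generic illustration and not the specific algorithm of any cited paper: the forward operator `A`, the noise level, the observation `y`, and the standard normal prior (whose score stands in for a learned $s_\theta$) are all assumptions, and the sampler is plain unadjusted Langevin dynamics.

```python
import numpy as np

A = np.array([[1.0, 2.0]])     # hypothetical forward operator, y = A x + noise
SIGMA = 0.5                    # measurement noise standard deviation (assumed known)
y = np.array([1.5])            # hypothetical observation

def prior_score(x):
    return -x                  # score of N(0, I); stand-in for a learned score network

def likelihood_score(x):
    resid = y - x @ A.T        # shape (n, 1)
    return resid @ A / SIGMA**2   # gradient of log p(y | x), shape (n, 2)

def langevin_posterior(n=20000, steps=1000, eta=0.01, seed=0):
    """Unadjusted Langevin dynamics on the posterior score (prior + measurement gradient)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, 2))
    for _ in range(steps):
        score = prior_score(x) + likelihood_score(x)
        x = x + eta * score + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    return x

# Closed-form Gaussian posterior for comparison
post_cov = np.linalg.inv(np.eye(2) + A.T @ A / SIGMA**2)
post_mean = post_cov @ A.T @ y / SIGMA**2

samples = langevin_posterior()
print(samples.mean(axis=0), post_mean)
```

The finite step size `eta` slightly inflates the sample variance in stiff directions, so only the mean is compared against the closed form here.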

5. Sampling Efficiency, Evaluation, and Empirical Results

Efficiency and effectiveness of SBDMs are advanced by several techniques:

  • Score Embedding and PDE-Based Pre-computation: Solving the log-density Fokker–Planck equation numerically in advance and embedding the computed score into training accelerates convergence, reducing the number of epochs and data required for high-fidelity denoising (Na et al., 2024).
  • Importance Sampling for Boltzmann Distributions: Post-training methods such as Variance-Tuned Diffusion Importance Sampling (VT-DIS) overcome bias in learned samplers via trajectory-wise reweighting, yielding unbiased estimates with high effective sample size at negligible test-time overhead (Zhang et al., 27 May 2025).
  • Ensemble Score Filters for SPDEs: In data assimilation for SPDEs, ensemble-based score filters offer real-time, training-free posterior inference, competitive with (or exceeding) particle and Kalman-type filters under sparse and noisy observations (Huynh et al., 9 Aug 2025).
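The reweighting idea behind importance-sampling corrections can be illustrated with ordinary self-normalized importance sampling and the effective sample size (ESS). This sketch is not VT-DIS itself, which reweights whole trajectories; the quadratic energy, the Gaussian proposal standing in for an imperfect learned sampler, and all numerical values are assumptions chosen so the answer is known.

```python
import numpy as np

def energy(x):
    return 0.5 * x**2                  # toy Boltzmann target exp(-U); here a standard normal

MU_Q, STD_Q = 0.3, 1.2                 # hypothetical imperfect sampler with tractable density

def log_q(x):
    return -0.5 * ((x - MU_Q) / STD_Q) ** 2 - np.log(STD_Q * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
x = MU_Q + STD_Q * rng.standard_normal(200_000)

log_w = -energy(x) - log_q(x)          # unnormalized log-weights
w = np.exp(log_w - log_w.max())        # subtract the max for numerical stability
w /= w.sum()                           # self-normalization removes the unknown constant

estimate = np.sum(w * x**2)            # reweighted estimate of E[x^2] under the target (exact: 1)
ess = 1.0 / np.sum(w**2)               # effective sample size
print(estimate, ess / x.size)
```

An ESS close to the number of samples indicates that the sampler is already near the target; a small ESS signals that a few samples dominate the estimate.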

Benchmarks consistently indicate SBDMs achieve:

| Task | Metric (lower is better unless otherwise noted) | SBDM Result | Baseline |
| --- | --- | --- | --- |
| CIFAR-10 Gen. | FID | 2.83–3.13 | 2.90–2.95 |
| ImageNet 32x32 Gen. | NLL (bits/dim) | 3.76 | 3.77–3.86 |
| MNIST SDF Gen. | FID (256x256) | 21.9 | 23.9 (GANO) |
| MRI Recon. (fastMRI) | PSNR (dB, higher is better) | 2–10 dB above TV | U-Net, TV |
| Bayesian Inversion | RMSE / ES / FID / PSNR / SSIM | Best-in-class | TV, RealNVP |

Significant speedups (3–10x) have been reported when using score-embedding and functional operator approaches, especially in high-resolution settings (Na et al., 2024, Hagemann et al., 2023, Lim et al., 2023).

6. Theoretical Insights and Limitations

Theoretical advances underpin SBDMs across function spaces, dimensions, and conditioning.

7. Outlook and Future Directions

SBDM research is rapidly evolving.

Score-Based Diffusion Models therefore constitute a mathematically rigorous, highly flexible, and empirically robust class of generative models, subsuming and advancing traditional score matching, normalizing flows, and denoising diffusions, with ongoing advances in theory, algorithms, and applications.
