Score-Based Diffusion Generative Models

Updated 9 July 2025
  • Score-based diffusion generative models are deep generative methods that learn the gradient of the log-density (the score) of noise-perturbed data distributions whose evolution is governed by SDE dynamics.
  • They leverage Malliavin calculus and stochastic analysis to derive a closed-form score function, enabling efficient reverse-time simulation for data synthesis.
  • Their theoretical rigor facilitates accurate and computationally tractable modeling of complex systems, with applications in physics, finance, and data assimilation.

Score-based diffusion generative models constitute a prominent class of modern deep generative models that synthesize complex data by learning and utilizing the gradient of the log probability density (the "score") associated with intermediate, noise-perturbed distributions. These models employ stochastic differential equations (SDEs) to define forward and reverse dynamics, combining advances in stochastic analysis, variational inference, and numerical methods. Recent theoretical advances include rigorous mathematical characterizations of the score, generalizations to broader classes of stochastic processes, and improved estimation strategies for the score function itself.

1. Theoretical Foundations: Malliavin Calculus and Score Representation

Recent work has developed a mathematically rigorous framework for deriving the exact score function, $\nabla \log p_t(x)$, for solutions of general nonlinear SDEs, using tools from Malliavin calculus and stochastic analysis (2507.05550, 2503.16917). Consider a forward process

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dB_t, \qquad X_0 = x,$$

where $b$ is the drift function, $\sigma$ is the diffusion coefficient, and $B_t$ is a standard Brownian motion.
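
For concreteness, such a forward process can be integrated with a standard Euler–Maruyama scheme. The sketch below is a generic illustration under that assumption; the callables `b` and `sigma` are placeholders rather than functions taken from the cited papers.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, T=1.0, n_steps=1000, rng=None):
    """Simulate one path of dX_t = b(t, X_t) dt + sigma(t, X_t) dB_t."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.asarray(x0, dtype=float)
    path = [x]
    for i in range(n_steps):
        t = i * dt
        dB = np.sqrt(dt) * rng.standard_normal(x.shape)  # Brownian increment
        x = x + b(t, x) * dt + sigma(t, x) * dB
        path.append(x)
    return np.array(path)

# Hypothetical usage with an Ornstein-Uhlenbeck-type forward process:
# path = euler_maruyama(lambda t, x: -x, lambda t, x: 0.5 * np.ones_like(x),
#                       x0=np.array([1.0]))
```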

Using Malliavin calculus, one introduces the Malliavin derivative $D_t X_T$, which measures the sensitivity of the terminal state to perturbations in the driving noise at time $t$. The associated Malliavin covariance matrix is

$$\gamma_{x_T} = \int_0^T \left(D_t X_T\right) \left(D_t X_T\right)^\top \, dt.$$
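
A standard identity from the theory of stochastic flows (stated here for orientation, under the smoothness assumptions already in force) links the Malliavin derivative to the first variation process $Y_t = \partial X_t / \partial x$:

$$D_t X_T = Y_T\, Y_t^{-1}\, \sigma(t, X_t), \qquad 0 \le t \le T,$$

which is what allows abstract Malliavin derivatives to be traded for simulable variation processes in the formulas that follow.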

Under mild regularity (smoothness and invertibility of $\gamma_{x_T}$), one can analytically represent the score function in terms of solution paths and their variations. A key result is that for each coordinate $k$, the score is given by

$$\frac{\partial}{\partial y_k} \log p_T(y) = -\mathbb{E}\left[\delta(u_k) \mid X_T = y\right],$$

where $u_k$ is a specially constructed "covering vector field" involving the Malliavin derivative, and $\delta(\cdot)$ is the Skorokhod (divergence) integral, the adjoint to the Malliavin derivative.
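
A standard fact worth recalling here: for integrands that are adapted to the Brownian filtration, the Skorokhod integral reduces to the ordinary Itô integral,

$$\delta(u) = \int_0^T u_t \, dB_t \quad \text{for adapted } u,$$

which is why stochastic integrals against $dB_t$ appear directly in the explicit score formula of Section 2.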

This approach facilitates the representation of the score as a conditional expectation involving only the first and second variation processes—quantities that can be computed by differentiating the SDE flow with respect to its initial conditions. For linear SDEs, the closed form

$$\nabla \log p(y) = -\gamma_{x_T}^{-1} \left(y - Y_T\, \mathbb{E}[X_0 \mid X_T = y]\right)$$

matches the classical result obtained from the Fokker–Planck equation.
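
As a concrete illustration (a hypothetical toy setting, not an experiment from the cited papers), consider a scalar Ornstein–Uhlenbeck process $dX_t = -\theta X_t\, dt + \sigma\, dB_t$ started from a deterministic $X_0 = x_0$. Here $Y_T = e^{-\theta T}$, $\gamma_{x_T} = \sigma^2 (1 - e^{-2\theta T}) / (2\theta)$, and $\mathbb{E}[X_0 \mid X_T = y] = x_0$, so the closed form above reproduces the familiar Gaussian score. A minimal numerical check:

```python
import numpy as np

# Toy Ornstein-Uhlenbeck example: dX_t = -theta * X_t dt + sigma dB_t, X_0 = x0
theta, sigma, T, x0 = 1.0, 0.5, 1.0, 2.0

Y_T = np.exp(-theta * T)                # first variation of the flow w.r.t. x0
gamma = sigma**2 * (1.0 - np.exp(-2.0 * theta * T)) / (2.0 * theta)  # Malliavin covariance

y = 1.3
# Closed-form linear-SDE score: -gamma^{-1} (y - Y_T * E[X_0 | X_T = y]),
# and E[X_0 | X_T = y] = x0 because the start is deterministic.
score_malliavin = -(y - Y_T * x0) / gamma

# Classical result: X_T ~ N(x0 * exp(-theta * T), gamma), so
score_gaussian = -(y - x0 * np.exp(-theta * T)) / gamma

print(score_malliavin, score_gaussian)  # the two values agree
```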

2. Score Function Expression and Sampling Algorithms

The ability to express the score function exactly in terms of variation processes opens new possibilities for the design of generative diffusion models. The main theorem (2507.05550) provides a decomposition of the score in the general (nonlinear, possibly state-dependent diffusion) case:

$$\text{score}_k(y) = -\mathbb{E}\left[ \int_0^T u_t(x) \cdot dB_t - \int_0^T \sum_j \left[Y_t^{-1} \sigma(t, X_t)\right]_j \cdot \big(A_{jk}(t) - B_{jk}(t) + C_{jk}(t)\big)\, dt \;\Big|\; X_T = y \right],$$

where all terms are constructed from the first and second variation processes $Y_t$ and $Z_t$, the drift and diffusion functions, and associated deterministic integrals.
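
For reference, the variation processes are the pathwise derivatives of the SDE flow with respect to its initial condition. In the scalar case (the multidimensional version is analogous; the precise normalization used in (2507.05550) may differ), they satisfy the linear SDEs obtained by differentiating the forward equation once and twice in $x$:

$$dY_t = \partial_x b(t, X_t)\, Y_t\, dt + \partial_x \sigma(t, X_t)\, Y_t\, dB_t, \qquad Y_0 = 1,$$

$$dZ_t = \big(\partial_x b(t, X_t)\, Z_t + \partial_{xx} b(t, X_t)\, Y_t^2\big)\, dt + \big(\partial_x \sigma(t, X_t)\, Z_t + \partial_{xx} \sigma(t, X_t)\, Y_t^2\big)\, dB_t, \qquad Z_0 = 0.$$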

The critical practical implication is that all abstract Malliavin derivatives are eliminated from the final formula; the computation requires only simulation of the original SDE and its variations. In generative modeling, this representation supports the direct simulation of the reverse-time SDE (or ODE) for data synthesis, bypassing the need for approximating the score via neural network regression alone.
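
To make the sampling step concrete, the following is a minimal sketch of Euler–Maruyama integration of the reverse-time SDE, not the authors' implementation. It assumes a callable `score(t, x)` supplied by the closed-form representation (or any other estimator) and, for simplicity, a state-independent diffusion coefficient; `b`, `sigma`, and the horizon `T` are placeholder choices.

```python
import numpy as np

def reverse_sde_sample(score, b, sigma, T=1.0, n_steps=500, x_T=0.0, rng=None):
    """Euler-Maruyama integration of the reverse-time SDE
        dX = [b(t, X) - sigma(t)^2 * score(t, X)] dt + sigma(t) dB,
    run backward from t = T to t = 0, starting from a prior draw x_T.
    Assumes a state-independent diffusion coefficient sigma(t); a
    state-dependent diffusion adds a divergence correction not included here.
    """
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x = np.asarray(x_T, dtype=float)
    for i in range(n_steps, 0, -1):
        t = i * dt
        drift = b(t, x) - sigma(t) ** 2 * score(t, x)
        x = x - drift * dt + sigma(t) * np.sqrt(dt) * rng.standard_normal(x.shape)
    return x

# Hypothetical usage with a variance-preserving-style forward SDE:
# b = lambda t, x: -0.5 * x
# sigma = lambda t: 1.0
# sample = reverse_sde_sample(score_fn, b, sigma, x_T=np.random.standard_normal())
```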

3. Comparison with Established Score Estimation Methods

Traditionally, diffusion generative models estimate the score using methods such as denoising score matching, sliced score matching, or approaches based on Schrödinger bridges:

  • Denoising score matching: The model is trained to regress the score of a Gaussian noising kernel, pulling noisy data points back toward their clean antecedents; the optimal regressor is a conditional expectation (a minimal sketch appears at the end of this section).
  • Sliced score matching: Reduces computational burden by evaluating the score-matching objective along random one-dimensional projections.
  • Schrödinger bridges: Formulate the diffusion as a stochastic optimal transport problem, using optimal flows between prior and target distributions.

The Malliavin calculus framework provides a closed-form, theoretically grounded alternative, yielding potentially higher accuracy and stability, as the score is computed directly, not estimated via sampled losses.
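
For contrast with the closed-form route, here is a minimal sketch of the denoising score matching objective for a single Gaussian noise level, a generic illustration rather than the training setup of the cited papers; `model` is any callable approximating the score and `x0` is a batch of clean data.

```python
import torch

def dsm_loss(model, x0, sigma_t=0.5):
    """Denoising score matching with a Gaussian perturbation kernel
    q(x_t | x_0) = N(x_0, sigma_t^2 I). The target is the kernel score
    -(x_t - x_0) / sigma_t^2, so the regression below has a conditional
    expectation as its optimum."""
    noise = torch.randn_like(x0)
    x_t = x0 + sigma_t * noise
    target = -(x_t - x0) / sigma_t**2
    pred = model(x_t)
    return ((pred - target) ** 2).sum(dim=-1).mean()

# Hypothetical usage: loss = dsm_loss(score_net, batch); loss.backward()
```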

4. Practical Impact and Application Scenarios

The new approach supports the simulation of generative models for a wider class of stochastic systems, including nonlinear dynamics and complex state-dependent diffusions. In practice, during training, one simulates the forward SDE alongside its first ($Y_t$) and second ($Z_t$) variation processes, computes the score using the closed-form formula, and uses it to drive the reverse-time SDE for sample synthesis.
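
A minimal sketch of that joint forward pass in one dimension is given below, assuming scalar drift and diffusion with available first and second spatial derivatives; `db`, `d2b`, `dsigma`, and `d2sigma` are hypothetical user-supplied callables, and the variation equations follow the standard formulation noted in Section 2.

```python
import numpy as np

def simulate_with_variations(b, db, d2b, sigma, dsigma, d2sigma,
                             x0, T=1.0, n_steps=1000, rng=None):
    """Euler-Maruyama simulation of a scalar forward SDE together with its
    first (Y_t) and second (Z_t) variation processes, i.e. the first and
    second derivatives of the flow with respect to the initial condition."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n_steps
    x, y, z = float(x0), 1.0, 0.0          # X_0 = x0, Y_0 = 1, Z_0 = 0
    for i in range(n_steps):
        t = i * dt
        dB = np.sqrt(dt) * rng.standard_normal()
        # second variation: differentiate the Y_t equation once more in x0
        z_new = z + (db(t, x) * z + d2b(t, x) * y**2) * dt \
                  + (dsigma(t, x) * z + d2sigma(t, x) * y**2) * dB
        # first variation: linearized dynamics along the simulated path
        y_new = y + db(t, x) * y * dt + dsigma(t, x) * y * dB
        # the state itself
        x_new = x + b(t, x) * dt + sigma(t, x) * dB
        x, y, z = x_new, y_new, z_new
    return x, y, z
```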

This scheme enables modelers to bypass training complicated neural score approximators when sufficient smoothness holds and simulating the variation processes is tractable. The method also provides a pathway to error analysis and sensitivity quantification, as the overall approximation quality is tied directly to the numerical accuracy of the forward (and variation) simulations.

However, the applicability is subject to the invertibility of the Malliavin covariance (which may fail in degenerate regions of state space) and the tractability of computing variation processes in high-dimensional systems.

5. Extensions to Broader SDE Classes

Because the derivations are based on general probabilistic and analytical features of SDEs, the framework extends to systems with:

  • State-dependent diffusion coefficients;
  • More complex (including non-Markovian or infinite-dimensional) noise structures;
  • Stochastic partial differential equations in function space.

This generality suggests immediate applicability to scenarios in physics, finance, and complex data domains where classical score matching is intractable or less principled. For more general SDEs (including those driven by fractional Brownian motion, as noted in (2503.16917)), further modifications to the variation process expressions are required, but the underlying integration-by-parts and adjoint machinery remains applicable.

6. Comparative Evaluation and Limitations

Empirical evaluation, including experiments on tasks such as 2D Gaussian mixtures, checkerboard, and Swiss roll datasets, demonstrates that the Malliavin-based approach produces generative samples whose quality is competitive with state-of-the-art methods (2503.16917). On test datasets, key distributional metrics, such as Maximum Mean Discrepancy (MMD) and Wasserstein distance, were comparable or superior to those of traditional DDPM-based methods, despite using modest neural architectures for components such as estimating $\mathbb{E}[X_0 \mid X_T = y]$.
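
For reference, the MMD between generated and held-out samples can be estimated with a generic RBF-kernel V-statistic, as sketched below; the bandwidth and the exact evaluation protocol of the cited papers are not specified here, so this is an illustrative implementation only.

```python
import numpy as np

def mmd_rbf(X, Y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between sample sets X and Y
    (arrays of shape [n, d] and [m, d]) under an RBF kernel."""
    def kernel(A, B):
        sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()

# Hypothetical usage: mmd_rbf(generated_samples, test_samples, bandwidth=0.5)
```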

A notable limitation is the technical requirement for strong smoothness of the SDE coefficients and invertibility of the Malliavin matrix, which may limit direct practicality in very high-dimensional or stiff systems. Additionally, although the representation is closed form, simulation of high-dimensional variation processes can introduce computational overhead and numerical errors.

7. Outlook and Broader Implications

The Malliavin calculus-based approach provides an exact, closed-form route to score computation for a wide class of nonlinear diffusions, obviating the need for learned approximators in many cases. This advance enables the principled design of samplers, enhanced control of the error structure via variance analysis, and transparent extensions to previously inaccessible dynamical systems.

Future research may focus on addressing the challenges of high-dimensional simulation, relaxing smoothness and invertibility requirements, and deploying these methods in domains such as stochastic control, physical simulation, and advanced data assimilation. The flexibility of the theoretical framework offers a path to broadening the reach and accuracy of score-based generative diffusion models.

References (2)

  • arXiv:2507.05550
  • arXiv:2503.16917