
Score-Based Diffusion Models

Updated 11 October 2025
  • Score-based diffusion models are generative frameworks that transform structured data into noise and then reverse the process using an explicitly computed score function.
  • They directly model function-valued data in separable Hilbert spaces through stochastic partial differential equations without relying on finite-dimensional discretization.
  • Techniques from Malliavin calculus and operator theory yield a closed-form score function that preserves the geometry of the data and enhances generative stability and scalability.

Score-based diffusion models are a class of generative models that define a stochastic process to continuously transform structured data into noise and then generate new samples by simulating a learned reversal of this process. In the infinite-dimensional context, these models are constructed and analyzed within separable Hilbert spaces, enabling the modeling of function-valued data (e.g., solutions to partial differential equations, images seen as elements of $L^2$) and circumventing the limitations of finite-dimensional or discretized approaches. The core innovation is the explicit computation of the score function (the Fréchet derivative of the log-density) in infinite-dimensional settings, leveraging tools from Malliavin calculus and operator theory to maintain consistency with the underlying geometry of the data space, accommodate spatial correlations in noise, and connect with modern approaches in functional data analysis (Mirafzali et al., 27 Aug 2025).

1. Infinite-Dimensional Mathematical Framework

Score-based diffusion modeling in infinite dimensions is formulated on a separable Hilbert space $H$, such as $L^2$ over a spatial domain. The forward process is modeled as a linear stochastic partial differential equation (SPDE):

$$du(t) = A u(t)\,dt + Q^{1/2}\,dW_t, \quad u(0) = u_0 \in H,$$

where $A$ is a densely defined, unbounded (typically elliptic) operator generating a strongly continuous semigroup $S(t) = e^{tA}$, $Q^{1/2}$ is a Hilbert–Schmidt operator encoding the spatial covariance of the (possibly colored) noise, and $W_t$ is a cylindrical Wiener process on $H$. The mild solution to this equation is

$$u(t) = S(t) u_0 + \int_0^t S(t-s) Q^{1/2}\,dW_s.$$
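
As a rough illustration of the mild solution, assume (this is not stated in the source) that $A$ and $Q$ share an orthonormal eigenbasis $e_k$, with $A e_k = -\lambda_k e_k$ for $\lambda_k > 0$ and $Q e_k = q_k e_k$. Each coefficient $\langle u(t), e_k \rangle$ is then an independent Ornstein–Uhlenbeck process that can be sampled exactly, without time stepping. The sketch below truncates to finitely many modes purely so that it can run; the function names and the spectra are hypothetical.

```python
import math
import random


def mode_variance(lam, q, t):
    """Variance of one stochastic-convolution coefficient at time t.

    Equals the integral of exp(-2*lam*s) * q over s in [0, t], assuming lam > 0.
    """
    return q * (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)


def sample_mild_solution(u0, lams, qs, t, rng):
    """Draw the first len(u0) coefficients of u(t) = S(t) u0 + stochastic convolution."""
    coeffs = []
    for u0_k, lam, q in zip(u0, lams, qs):
        mean = math.exp(-lam * t) * u0_k           # k-th coefficient of S(t) u0
        std = math.sqrt(mode_variance(lam, q, t))  # exact transition, no time stepping
        coeffs.append(rng.gauss(mean, std))
    return coeffs


# Hypothetical spectra: Laplacian-like decay rates and a summable noise spectrum.
K = 8
lams = [float(k * k) for k in range(1, K + 1)]
qs = [1.0 / (k * k) for k in range(1, K + 1)]
u0 = [1.0] * K

rng = random.Random(0)
u_t = sample_mild_solution(u0, lams, qs, t=0.1, rng=rng)
```

At `t = 0` the variance of every mode is zero, so the sampler returns `u0` unchanged, matching $u(0) = u_0$.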

The Malliavin covariance operator is defined as

$$\gamma_{u(t)} = \int_0^t S(s) Q^{1/2} (Q^{1/2})^* S(s)^*\,ds,$$

which plays a central role in the derivation of the infinite-dimensional score.
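
For intuition, consider the special case (an assumption made here for illustration, not taken from the source) in which $A$ and $Q$ are diagonal in a common orthonormal basis $\{e_k\}$, with $A e_k = -\lambda_k e_k$, $\lambda_k > 0$, and $Q e_k = q_k e_k$. The integral can then be evaluated mode by mode:

```latex
\gamma_{u(t)} e_k
  = \left( \int_0^t e^{-\lambda_k s}\, q_k\, e^{-\lambda_k s}\, ds \right) e_k
  = q_k\, \frac{1 - e^{-2\lambda_k t}}{2\lambda_k}\, e_k .
```

In this setting the covariance operator is diagonal in the same basis, and each eigenvalue grows from $0$ at $t = 0$ toward $q_k / (2\lambda_k)$ as $t \to \infty$.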

2. Rigorous Treatment of the Forward Diffusion Process

The use of a trace-class noise operator ($Q$ is trace class) ensures well-posedness and finite variance of the solution in the infinite-dimensional space $H$. The stochastic convolution

$$\int_0^t S(t-s) Q^{1/2}\,dW_s$$

is then well-defined in $L^2(\Omega; H)$ even in arbitrary spatial dimensions. The noise term generalizes white noise to colored (correlated) noise, accommodating physical systems with spatial structure. Importantly, the framework is constructed without invoking any finite-dimensional projection or discretization, in contrast to standard approaches that would rely on grid-based representations or truncation to $N$-dimensional subspaces.
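
The finite-variance claim can be made concrete with the Itô isometry. The following is a standard computation written out as a sketch; the growth bound $\|S(s)\| \le M e^{\omega s}$ is the usual estimate for a strongly continuous semigroup, and the constants $M$ and $\omega$ are introduced here for illustration:

```latex
\mathbb{E} \left\| \int_0^t S(t-s) Q^{1/2}\, dW_s \right\|_H^2
  = \int_0^t \left\| S(s) Q^{1/2} \right\|_{\mathrm{HS}}^2 ds
  \le M^2 \operatorname{tr}(Q) \int_0^t e^{2\omega s}\, ds
  < \infty .
```

The middle expression is the trace of the operator $\int_0^t S(s) Q^{1/2} (Q^{1/2})^* S(s)^*\,ds$, so a trace-class $Q$ makes the covariance of the stochastic convolution trace class as well.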

3. Malliavin Calculus and Explicit Score Formula

Malliavin calculus provides the machinery to compute the sensitivity of the solution $u(t)$ with respect to perturbations in the driving noise. The explicit form of the Malliavin derivative is

$$D_r u(t) = S(t-r) Q^{1/2}\,\mathbf{1}_{[0,t]}(r),$$

which quantifies the impact of a “noise impulse” at time $r$. The covariance operator $\gamma_{u(t)}$ aggregates this sensitivity over all $r \in [0,t]$. Through an infinite-dimensional generalization of the Bismut–Elworthy–Li formula, the score (the Fréchet derivative of the log-density) is given by

$$\nabla_h \log p_{u(t)}(u) = -\langle \gamma_{u(t)}^\dagger (u - S(t) u_0), h \rangle_H,$$

where $h$ is any Cameron–Martin direction (i.e., in the effective range of $\gamma_{u(t)}^{1/2}$) and $\gamma_{u(t)}^\dagger$ is the Moore–Penrose pseudoinverse of the covariance operator. This analytic, closed-form formula does not require discretization or projection, and is intrinsic to the Hilbert space structure.
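
The score formula can be sanity-checked in a simplified setting. The sketch below assumes (for illustration only) that $A$ and $Q$ are diagonal in a shared eigenbasis with decay rates `lams` and noise levels `qs`, truncates to three modes so the code can run, and compares the closed-form score along $h = e_1$ with a central finite difference of the Gaussian log-density. All names and parameter values are hypothetical.

```python
import math


def mode_variance(lam, q, t):
    """Eigenvalue of the covariance operator for one mode (assumes lam > 0)."""
    return q * (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)


def score_coefficients(u, u0, lams, qs, t):
    """Coefficients of -pinv(gamma) (u - S(t) u0) in the shared eigenbasis.

    Modes with zero variance are mapped to 0, which is how the Moore-Penrose
    pseudoinverse treats the null space of a diagonal operator.
    """
    out = []
    for u_k, u0_k, lam, q in zip(u, u0, lams, qs):
        g = mode_variance(lam, q, t)
        resid = u_k - math.exp(-lam * t) * u0_k
        out.append(-resid / g if g > 0.0 else 0.0)
    return out


def log_density(u, u0, lams, qs, t):
    """Gaussian log-density over the noisy modes, up to an additive constant."""
    total = 0.0
    for u_k, u0_k, lam, q in zip(u, u0, lams, qs):
        g = mode_variance(lam, q, t)
        if g > 0.0:
            resid = u_k - math.exp(-lam * t) * u0_k
            total -= 0.5 * resid * resid / g
    return total


lams = [1.0, 4.0, 9.0]
qs = [1.0, 0.25, 0.0]   # the last mode receives no noise
u0 = [1.0, -1.0, 0.5]
u = [0.3, 0.2, 0.1]
t = 0.5

score = score_coefficients(u, u0, lams, qs, t)

# Directional derivative of the log-density along h = e_1 by central differences.
eps = 1e-6
up = [u[0] + eps, u[1], u[2]]
dn = [u[0] - eps, u[1], u[2]]
fd = (log_density(up, u0, lams, qs, t) - log_density(dn, u0, lams, qs, t)) / (2 * eps)
```

Here `fd` and `score[0]` should agree up to rounding, while the noise-free third mode contributes a zero score coefficient, reflecting that directions outside the range of the covariance are not Cameron–Martin directions.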

4. Operator-Theoretic Analysis and Spatially Correlated Noise

The analysis relies on operator-theoretic constructions:

  • The semigroup $S(t)$, encoding the flow induced by the drift $A$, serves as the first variation process, respecting the geometry of $H$.
  • Bounded operators and their adjoints structure the progression and sensitivity of the stochastic process.
  • The framework accommodates trace-class and thus spatially correlated noise, and is not dependent on the invertibility of $S(t)$.
  • When the semigroup is invertible, the covariance can alternatively be represented, writing $Y_t = S(t)$ for the first variation process, as

$$\gamma_{u(t)} = Y_t C_t Y_t^*, \quad C_t = \int_0^t Y_r^{-1} Q^{1/2} (Q^{1/2})^* (Y_r^{-1})^*\,dr,$$

allowing further flexibility in treating the stochastic structure.
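
A quick consistency check, taking $Y_t = S(t)$ (the semigroup acting as the first variation process, as in the first bullet) and assuming that $S$ extends to a group so that $S(t) S(r)^{-1} = S(t-r)$, shows that this factorization agrees with the integral definition of the Malliavin covariance operator:

```latex
Y_t C_t Y_t^*
  = \int_0^t S(t) S(r)^{-1}\, Q^{1/2} (Q^{1/2})^* \left( S(t) S(r)^{-1} \right)^* dr
  = \int_0^t S(t-r)\, Q^{1/2} (Q^{1/2})^*\, S(t-r)^*\, dr
  = \int_0^t S(s)\, Q^{1/2} (Q^{1/2})^*\, S(s)^*\, ds ,
```

where the last step uses the substitution $s = t - r$.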

5. Connections to Functional Data Analysis and Learning Strategies

Score estimation in this infinite-dimensional setting is naturally linked to methodologies in functional data analysis:

  • The score function is framed as a regression operator acting on the space $H$, enabling the application of kernel methods in reproducing kernel Hilbert spaces and neural operator architectures for efficient function-to-function learning.
  • This operator perspective allows the deployment of advanced computational tools for estimation and sampling in high-dimensional or function-valued data without ad hoc discretization, and aligns with recent developments in neural operator learning.
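
To make the regression view concrete, here is a minimal sketch that fits a score model from samples alone. It swaps in classical (Hyvärinen) score matching with a per-mode affine model $s(x) = a x + b$, which is a deliberately simple stand-in and not the kernel or neural-operator estimators mentioned above. The single-mode setup, the parameter values, and the function name are assumptions for illustration. For an affine model, the score-matching objective $\mathbb{E}[\tfrac{1}{2} s(x)^2 + s'(x)]$ has the closed-form minimizer $a = -1/\mathrm{Var}(x)$, $b = \mathbb{E}[x]/\mathrm{Var}(x)$.

```python
import math
import random


def fit_affine_score(samples):
    """Minimize the empirical score-matching loss mean(0.5*(a*x + b)**2) + a.

    Setting the gradient in (a, b) to zero gives a = -1/var and b = mean/var,
    where var is the (biased) sample variance.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return -1.0 / var, mean / var


# Hypothetical single mode: decay rate lam, noise level q, initial coefficient u0_k.
lam, q, t, u0_k = 2.0, 0.5, 0.3, 1.0
true_mean = math.exp(-lam * t) * u0_k
true_var = q * (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)

rng = random.Random(0)
samples = [rng.gauss(true_mean, math.sqrt(true_var)) for _ in range(20000)]

a_hat, b_hat = fit_affine_score(samples)

# Closed-form score for this mode: -(x - true_mean) / true_var.
a_true, b_true = -1.0 / true_var, true_mean / true_var
```

With enough samples, `a_hat` and `b_hat` approach `a_true` and `b_true`, so the fitted regression recovers the analytic score for this mode. Richer function classes would replace the affine model when the data distribution is not Gaussian.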

6. Implications for Generative Modeling and Applications

The framework enables generative modeling of infinite-dimensional random elements, such as function-valued data and solutions to SPDEs, directly in Hilbert space. Potential impacts include:

  • Rigorous algorithms for infinite-dimensional data synthesis, sampling, and denoising that respect the geometry and statistical properties of the underlying functional space.
  • The analytic form of the score function provides a principled basis for constructing reverse-time SDEs for generative modeling in function space, extending the reach of diffusion models to scientific computing domains, climate and materials simulation, and medical imaging, where the data is intrinsically infinite-dimensional or requires resolution independence.
  • By not requiring discretization, the approach improves theoretical guarantees, scalability, and stability, especially in applications where spatial correlations are significant.
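
As a toy illustration of a reverse-time SDE driven by a closed-form score, the sketch below works with a single mode and assumes, purely for illustration, that the data coefficient is Gaussian with mean zero and variance `sigma0_sq`, so that the marginal score at time $t$ is $-v / \mathrm{var}(t)$. The reverse drift uses the standard time-reversal form (negated forward drift plus noise covariance times score) and is integrated with Euler–Maruyama. Parameter values and names are hypothetical, and the time stepping is a numerical convenience rather than part of the framework.

```python
import math
import random

lam, q = 1.0, 1.0      # hypothetical decay rate and noise level for one mode
sigma0_sq = 4.0        # assumed variance of the data coefficient
T, n_steps, n_samples = 2.0, 400, 2000
dt = T / n_steps


def marginal_var(t):
    """Variance of the forward coefficient at time t for Gaussian data."""
    decay = math.exp(-2.0 * lam * t)
    return decay * sigma0_sq + q * (1.0 - decay) / (2.0 * lam)


def reverse_sample(rng):
    """Integrate the reverse-time SDE from t = T down to t = 0."""
    v = rng.gauss(0.0, math.sqrt(marginal_var(T)))  # start from the noised law
    for i in range(n_steps):
        t = T - i * dt
        score = -v / marginal_var(t)
        drift = lam * v + q * score                 # -(A v) + Q * score, with A v = -lam * v
        v += drift * dt + math.sqrt(q * dt) * rng.gauss(0.0, 1.0)
    return v


rng = random.Random(0)
draws = [reverse_sample(rng) for _ in range(n_samples)]
mean = sum(draws) / n_samples
recovered_var = sum((x - mean) ** 2 for x in draws) / n_samples
```

If the reversal is working, `recovered_var` should land near `sigma0_sq`, up to Monte Carlo and time-stepping error, even though the samples start from the much narrower noised distribution at time `T`.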

7. Outlook and Future Research Directions

This framework lays a rigorous mathematical foundation for score-based diffusion modeling in infinite-dimensional settings, suggesting several avenues for future investigation:

  • Extending the methodology to nonlinear SPDEs and to systems with multiplicative (state-dependent) noise, broadening the class of physical and scientific processes that can be modeled.
  • Integrating advanced regularization and kernel-based methods from functional data analysis for improved score estimation and computational tractability.
  • Applying neural operator and kernel architectures to leverage the operator-theoretic structure, thus enabling efficient learning and inference in large-scale functional datasets.
  • Exploring the impact of the infinite-dimensional framework on the expressivity and generalization capacity of generative models in practical applications.

In summary, the operator-theoretic use of Malliavin calculus yields a closed-form, infinite-dimensional score formula that preserves the intrinsic geometry of Hilbert spaces, accommodates spatially correlated noise, and bridges generative modeling with modern functional analytic and computational techniques (Mirafzali et al., 27 Aug 2025).
