Spacetime Geometry of Denoising in Diffusion Models

Updated 21 August 2025

Spacetime geometry of denoising is a framework that views the continuum of noisy data as a statistical manifold with time as a coordinate and Fisher-Rao metrics defining its structure.
It enables the computation of optimal denoising paths (geodesics) that support smooth interpolations and efficient transition sampling in high-dimensional data, such as images and molecular systems.
Leveraging the exponential family structure, the method achieves tractable metric and geodesic computations, bridging diffusion models with concepts from statistical mechanics and geometric analysis.

The spacetime geometry of denoising investigates the structure and dynamics of noisy-to-clean data transformations in diffusion models through the formalism of information geometry. By interpreting the continuum of noisy data—parametrized by the noise level—as constituting a statistical manifold endowed with a Fisher-Rao metric, this framework provides a natural geometric perspective on denoising, enabling computation of geodesics (i.e., optimal denoising paths) and deepening the connection between statistical mechanics, machine learning, and geometric analysis.

1. Statistical Manifold and Spacetime Structure

In the diffusion paradigm, a clean sample $x_0$ is gradually corrupted by noise to produce $x_t$ via a forward stochastic process. Rather than considering only the initial or final state, the approach treats every noisy state $x_t$, together with its noise (or diffusion) time $t \in (0, T]$, as a coordinate in an extended latent space. For each $(x_t, t)$, the trained diffusion model determines a conditional denoising distribution $p(x_0 \mid x_t)$.

This family $\mathcal{M} = \{p(x_0 \mid x_t) : x_t \in \mathbb{R}^D,\; t \in (0,T]\}$ forms a statistical manifold. With $t$ serving as a temporal coordinate and $x_t$ spanning a $D$-dimensional spatial slice, $\mathcal{M}$ is a $(D+1)$-dimensional manifold, termed spacetime in this context. The geometric viewpoint treats $t$ as an intrinsic "time" coordinate on the manifold, loosely analogous to time in general relativity but with fundamentally statistical underpinnings.
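As a concrete illustration of how a spacetime point $(x_t, t)$ indexes a denoising distribution, the following minimal sketch uses a one-dimensional toy in which everything is available in closed form. The Gaussian prior $x_0 \sim \mathcal{N}(m, s^2)$, the cosine schedule, and all function names are assumptions made for this example; a real diffusion model would supply the posterior through its learned denoiser instead.

```python
import math

def schedule(t):
    """Hypothetical variance-preserving schedule with alpha_t^2 + sigma_t^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def forward_sample(x0, t, eps):
    """Forward corruption x_t = alpha_t * x0 + sigma_t * eps for a given noise draw eps."""
    alpha, sigma = schedule(t)
    return alpha * x0 + sigma * eps

def denoising_posterior(x_t, t, m=0.0, s=1.0):
    """Mean and variance of p(x_0 | x_t) for the toy prior x_0 ~ N(m, s^2).

    Valid for t in (0, 1]; at t = 0 the noise scale vanishes and the
    posterior collapses to a point mass.
    """
    alpha, sigma = schedule(t)
    precision = 1.0 / s**2 + alpha**2 / sigma**2  # prior + likelihood precision
    mean = (m / s**2 + alpha * x_t / sigma**2) / precision
    return mean, 1.0 / precision

# Each spacetime point (x_t, t) selects one distribution over clean samples.
x_t = forward_sample(x0=0.8, t=0.3, eps=-0.5)
print(denoising_posterior(x_t, 0.3))
```

At low noise the posterior concentrates near the observed point, while at $t = 1$ in this toy it falls back to the prior, which is the qualitative behaviour the manifold picture relies on.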

2. Fisher-Rao Metric and Geodesic Computation

The canonical Riemannian structure on the statistical manifold is given by the Fisher-Rao metric:

$\mathcal{I}(\theta) = \mathbb{E}_{x_0 \sim p(\cdot|\theta)}\left[\nabla_\theta \log p(x_0|\theta) \left(\nabla_\theta \log p(x_0|\theta)\right)^\top\right],$

where the parameter is $\theta = (x_t, t)$. This metric quantifies the local sensitivity of the denoising distribution to infinitesimal changes in the noisy input and noise level. Equivalently, it is the Hessian of the Kullback-Leibler divergence between $p(\cdot|\theta)$ and $p(\cdot|\theta')$ with respect to $\theta'$, evaluated at $\theta' = \theta$.
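The relation to the Kullback-Leibler divergence can be written out as a second-order expansion (a standard identity, stated here under the usual regularity conditions):

$\mathrm{KL}\big(p(\cdot|\theta)\,\|\,p(\cdot|\theta + d\theta)\big) = \tfrac{1}{2}\, d\theta^\top \mathcal{I}(\theta)\, d\theta + O(\|d\theta\|^3).$

The zeroth- and first-order terms vanish because the divergence attains its minimum value of zero at $d\theta = 0$, so the metric can be read as a squared distance between neighbouring denoising distributions.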

Given this metric, one can define geodesics $\gamma(s)$, $s \in [0,1]$, as the shortest paths connecting two given points $(x_t, t)$ on the manifold, that is, the curves that minimize the length:

$\ell(\gamma) = \int_0^1 \sqrt{\dot{\gamma}(s)^\top \mathcal{I}(\gamma(s)) \dot{\gamma}(s)}\, ds.$

These geodesics provide principled definitions of interpolation and optimal denoising transitions, generalizing the notion of a straight line in data or latent space to the natural geometry induced by the model's probability structure.
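In practice the length integral is evaluated on a discretized curve. The sketch below is a generic illustration rather than the authors' implementation: `metric_fn` is a hypothetical callable returning the metric matrix at a point, the metric is evaluated at segment midpoints, and the Euclidean metric is used only as a sanity check.

```python
import math

def quad_form(metric, v):
    """Return v^T G v for a metric G given as a nested list."""
    n = len(v)
    return sum(v[i] * metric[i][j] * v[j] for i in range(n) for j in range(n))

def curve_length_and_energy(points, metric_fn):
    """Discretized length and energy of a curve sampled uniformly in s on [0, 1].

    points: list of coordinate lists along the curve.
    metric_fn: hypothetical callable mapping a point to its metric matrix.
    """
    n_seg = len(points) - 1
    length, energy = 0.0, 0.0
    for a, b in zip(points[:-1], points[1:]):
        delta = [bj - aj for aj, bj in zip(a, b)]
        mid = [0.5 * (aj + bj) for aj, bj in zip(a, b)]
        q = quad_form(metric_fn(mid), delta)
        length += math.sqrt(q)
        energy += 0.5 * q * n_seg  # 0.5 * |gamma'|^2 * ds with ds = 1 / n_seg
    return length, energy

def euclidean_metric(point):
    """Identity metric, used here only to check the discretization."""
    return [[1.0, 0.0], [0.0, 1.0]]

# Straight line from (0, 0) to (3, 4): length 5 and energy 12.5 up to rounding.
line = [[3.0 * k / 10, 4.0 * k / 10] for k in range(11)]
length, energy = curve_length_and_energy(line, euclidean_metric)
print(length, energy)
```

Minimizing the energy rather than the length is the usual choice in numerical work, since the energy is smooth in the curve points and its minimizers are constant-speed geodesics.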

Notably, for distributions in the exponential family (see Section 3), the energy of a curve simplifies to

$\mathcal{E}(\gamma) = \frac{1}{2}\int_0^1 \Big(\frac{d}{ds} \eta(\gamma(s))\Big)^\top \Big(\frac{d}{ds} \mu(\gamma(s))\Big)\, ds,$

where $\eta(\gamma(s))$ and $\mu(\gamma(s))$ are the natural and expectation parameters along the curve, allowing computationally tractable geodesic estimation even in high-dimensional spaces.
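For exponential families, the symmetrized Kullback-Leibler divergence between two members equals the inner product of the differences of their natural parameters $\eta$ and expectation parameters $\mu$, so a discretized curve energy can be accumulated from these differences alone. The following sketch checks this on a one-dimensional toy with prior $x_0 \sim \mathcal{N}(0, 1)$ and sufficient statistics $(x_0, x_0^2)$; the schedule, the prior, and all function names are assumptions made for illustration.

```python
import math

def schedule(t):
    # Hypothetical variance-preserving schedule (an assumption of this sketch).
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def eta(x_t, t):
    """Natural parameters paired with T(x_0) = (x_0, x_0^2)."""
    alpha, sigma = schedule(t)
    return (alpha * x_t / sigma**2, -0.5 * alpha**2 / sigma**2)

def posterior(x_t, t):
    """Posterior mean and variance for the toy prior x_0 ~ N(0, 1)."""
    alpha, sigma = schedule(t)
    var = 1.0 / (1.0 + alpha**2 / sigma**2)
    return var * alpha * x_t / sigma**2, var

def mu(x_t, t):
    """Expectation parameters E[T(x_0) | x_t] = (mean, var + mean^2)."""
    mean, var = posterior(x_t, t)
    return (mean, var + mean**2)

def curve_energy(points):
    """0.5 * (N - 1) * sum of <delta eta, delta mu> over consecutive points."""
    total = 0.0
    for a, b in zip(points[:-1], points[1:]):
        d_eta = [q - p for p, q in zip(eta(*a), eta(*b))]
        d_mu = [q - p for p, q in zip(mu(*a), mu(*b))]
        total += sum(u * v for u, v in zip(d_eta, d_mu))
    return 0.5 * (len(points) - 1) * total

def gaussian_kl(p, q):
    """KL(N(m1, v1) || N(m2, v2)) for (mean, variance) pairs."""
    (m1, v1), (m2, v2) = p, q
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

a, b = (-1.0, 0.2), (1.0, 0.6)
sym_kl = gaussian_kl(posterior(*a), posterior(*b)) + gaussian_kl(posterior(*b), posterior(*a))
inner = sum((q - p) * (v - u) for p, q, u, v in zip(eta(*a), eta(*b), mu(*a), mu(*b)))
print(sym_kl, inner)  # the two values agree up to rounding

straight = [(a[0] + (b[0] - a[0]) * k / 20, a[1] + (b[1] - a[1]) * k / 20) for k in range(21)]
print(curve_energy(straight))
```

Because only $\eta$ and $\mu$ at the curve points enter, no explicit Fisher matrix has to be built to evaluate the energy.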

3. Exponential Family Structure of Denoising Distributions

The denoising posteriors in standard diffusion models admit an explicit exponential family form:

$p(x_0 \mid x_t) = h(x_0) \exp\big( \eta(x_t, t)^\top T(x_0) - \psi(x_t, t) \big),$

where $T(x_0)$ are sufficient statistics (e.g., $x_0$ and $\|x_0\|^2$ for Gaussian noise), $\eta(x_t, t)$ encodes the natural parameters (determined in closed form by $x_t$ and the prescribed diffusion schedule, e.g. $\alpha_t, \sigma_t$), and $\psi(x_t, t)$ is the log-partition function.
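For the common Gaussian forward kernel $x_t = \alpha_t x_0 + \sigma_t \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$ (assumed here for illustration, with $q$ denoting the data distribution), Bayes' rule yields this form directly:

$p(x_0 \mid x_t) \propto q(x_0) \exp\Big(-\frac{\|x_t - \alpha_t x_0\|^2}{2\sigma_t^2}\Big) \propto q(x_0) \exp\Big(\frac{\alpha_t}{\sigma_t^2}\, x_t^\top x_0 - \frac{\alpha_t^2}{2\sigma_t^2}\, \|x_0\|^2\Big),$

so that $h(x_0) = q(x_0)$, $T(x_0) = (x_0, \|x_0\|^2)$, and $\eta(x_t, t) = \big(\tfrac{\alpha_t}{\sigma_t^2}\, x_t,\; -\tfrac{\alpha_t^2}{2\sigma_t^2}\big)$. The term $\|x_t\|^2 / (2\sigma_t^2)$ does not depend on $x_0$ and is absorbed into $\psi(x_t, t)$.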

This algebraic structure is crucial: it allows for closed-form or efficiently computable expressions for the Fisher-Rao metric and geodesics using only the derivatives of the mean and covariance, and supports scalable $O(D)$ complexity via automatic differentiation tools. The Fisher information matrix then reduces to

$\mathcal{I}(x_t, t) = \Big(\frac{\partial \eta}{\partial (x_t, t)}\Big)^\top \Big(\frac{\partial \mu}{\partial (x_t, t)}\Big)$

with $\mu(x_t, t) = \mathbb{E}_{x_0 \sim p(\cdot|x_t)}[T(x_0)]$ the expectation parameters, which further simplifies practical computations.
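For an exponential family whose natural parameters depend on $\theta = (x_t, t)$, the Fisher information can be written as the product of the transposed Jacobian of $\eta$ with the Jacobian of $\mu$. The sketch below checks this numerically on a one-dimensional Gaussian toy against the textbook Fisher matrix of a Gaussian with parameter-dependent mean and variance. The prior, the schedule, the finite-difference Jacobians, and the function names are all assumptions of this example; a practical implementation would use automatic differentiation instead.

```python
import math

def schedule(t):
    # Hypothetical variance-preserving schedule (an assumption of this sketch).
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def posterior(theta, m=0.3, s=0.7):
    """Mean and variance of p(x_0 | x_t) for the toy prior x_0 ~ N(m, s^2)."""
    x_t, t = theta
    alpha, sigma = schedule(t)
    var = 1.0 / (1.0 / s**2 + alpha**2 / sigma**2)
    return [var * (m / s**2 + alpha * x_t / sigma**2), var]

def eta(theta):
    """Natural parameters paired with T(x_0) = (x_0, x_0^2)."""
    x_t, t = theta
    alpha, sigma = schedule(t)
    return [alpha * x_t / sigma**2, -0.5 * alpha**2 / sigma**2]

def mu(theta):
    """Expectation parameters (mean, var + mean^2)."""
    mean, var = posterior(theta)
    return [mean, var + mean**2]

def jacobian(f, theta, h=1e-5):
    """Central finite-difference Jacobian with J[k][i] = d f_k / d theta_i."""
    cols = []
    for i in range(len(theta)):
        up, dn = list(theta), list(theta)
        up[i] += h
        dn[i] -= h
        cols.append([(u - d) / (2 * h) for u, d in zip(f(up), f(dn))])
    return [[cols[i][k] for i in range(len(theta))] for k in range(len(cols[0]))]

def fisher_exp_family(theta):
    """Fisher matrix as J_eta^T J_mu."""
    je, jm = jacobian(eta, theta), jacobian(mu, theta)
    n = len(theta)
    return [[sum(je[k][i] * jm[k][j] for k in range(len(je))) for j in range(n)]
            for i in range(n)]

def fisher_gaussian(theta):
    """Textbook Fisher matrix of N(mean(theta), var(theta))."""
    d_mean, d_var = jacobian(posterior, theta)
    var = posterior(theta)[1]
    n = len(theta)
    return [[d_mean[i] * d_mean[j] / var + d_var[i] * d_var[j] / (2 * var**2)
             for j in range(n)] for i in range(n)]

theta = [0.7, 0.4]
fisher_a = fisher_exp_family(theta)
fisher_b = fisher_gaussian(theta)
print(fisher_a)
print(fisher_b)
```

The two matrices agree up to finite-difference error, and the first route needs only derivatives of $\eta$ and $\mu$ rather than expectations of score outer products.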

4. Algorithmic and Practical Implications

The geometric formalism enables algorithmic advances for path-based queries in diffusion models:

Image/Signal Interpolation: Geodesics on the spacetime manifold between noisy encodings of two clean samples yield continuous, semantically smooth interpolation sequences that, after decoding (solving the probability flow ODE), traverse meaningful image or data trajectories. Such interpolations lose less information than standard methods, which detour through the highest noise levels.
Transition Path Sampling in Molecular Systems: For molecular data, where $p(x_0)$ may be a Boltzmann distribution describing metastable structures, spacetime geodesics correspond to sequences of intermediate Boltzmann distributions. Sampling along these geodesics produces physically plausible, low-energy transition pathways for molecular rearrangements.
Efficient Geodesic Optimization: By re-expressing the geodesic problem in terms of expectation parameter evolution, one avoids costly repeated decoder or SDE evaluations. The implementation relies on a small number of neural network denoiser invocations and Jacobian-vector products (combined with the Hutchinson trace estimator), supporting scalable application to high-dimensional problems without retraining.
Constraint Handling: The geometric approach supports optimization under additional constraints, such as variance penalization or respecting forbidden regions, by modifying the energy functional accordingly.
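The optimization described in this list can be illustrated on a toy problem. The following is a minimal sketch under strong simplifying assumptions, not the released implementation: it uses a one-dimensional prior $x_0 \sim \mathcal{N}(0, 1)$ with closed-form parameters in place of a neural denoiser, finite-difference gradients in place of automatic differentiation, and a hard bound on $t$ as a crude example of a constraint. All names are hypothetical.

```python
import math

def schedule(t):
    # Hypothetical variance-preserving schedule (an assumption of this sketch).
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def eta_mu(x_t, t):
    """Natural and expectation parameters for the toy prior x_0 ~ N(0, 1)."""
    alpha, sigma = schedule(t)
    var = 1.0 / (1.0 + alpha**2 / sigma**2)
    mean = var * alpha * x_t / sigma**2
    return (alpha * x_t / sigma**2, -0.5 * alpha**2 / sigma**2), (mean, var + mean**2)

def energy(curve, t_min=0.05, t_max=0.95):
    """Discretized curve energy; curves leaving the allowed t-range cost infinity."""
    if any(not (t_min <= t <= t_max) for _, t in curve):
        return math.inf
    params = [eta_mu(x, t) for x, t in curve]
    total = 0.0
    for (e0, m0), (e1, m1) in zip(params[:-1], params[1:]):
        total += sum((p - q) * (u - v) for p, q, u, v in zip(e1, e0, m1, m0))
    return 0.5 * (len(curve) - 1) * total

def gradient(curve, h=1e-6):
    """Central finite-difference gradient w.r.t. interior points (endpoints fixed)."""
    grad = [[0.0, 0.0] for _ in curve]
    for i in range(1, len(curve) - 1):
        for j in range(2):
            up = [list(p) for p in curve]
            dn = [list(p) for p in curve]
            up[i][j] += h
            dn[i][j] -= h
            grad[i][j] = (energy(up) - energy(dn)) / (2 * h)
    return grad

def optimize(curve, steps=200, lr=1e-2):
    """Gradient descent with backtracking: a step is kept only if the energy drops."""
    curve = [list(p) for p in curve]
    best = energy(curve)
    for _ in range(steps):
        grad = gradient(curve)
        trial = [[p[j] - lr * g[j] for j in range(2)] for p, g in zip(curve, grad)]
        trial_energy = energy(trial)
        if trial_energy < best:
            curve, best = trial, trial_energy
            lr *= 1.2
        else:
            lr *= 0.5
    return curve, best

# Two noisy points at the same low noise level, joined by a straight initial curve.
n = 12
init = [[-1.0 + 2.0 * k / (n - 1), 0.2] for k in range(n)]
init_energy = energy(init)
geodesic, final_energy = optimize(init)
print(init_energy, final_energy, max(t for _, t in geodesic))
```

In this toy the optimized curve leaves the constant-noise line and passes through higher noise levels between the endpoints, which lowers the energy relative to the straight initial curve.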

5. Conceptual Significance and Connection to Broader Theories

This information-geometric perspective unifies disparate concepts:

By interpreting noise level as an intrinsic "temporal" coordinate, the induced spacetime geometry echoes ideas from physics where the manifold of probability distributions replaces conventional configuration space, and geodesics correspond to optimal inference or denoising paths.
The approach is compatible with perspectives from quantum information geometry, where Fisher information plays a role analogous to curvature or gravitational dynamics (cf. emergent AdS geometry from entanglement spectra (Matsueda, 2014), and statistical metrics in geometrodynamics (Caticha, 2019)).
The exponential family structure of the underlying distributions is leveraged not only computationally but conceptually, linking to large deviation theory, thermodynamic geometry, and renormalization group flows in the study of emergent spacetime (Rastgoo et al., 2016).
The methodology aligns with the broader trend of leveraging geometry and optimal transport in generative models, but the explicit statistical manifold structure and tractable geodesic solutions constitute a distinguishing feature.

6. Available Software and Reproducibility

The methodology is supported by open-source code (https://github.com/Aalto-QuML/diffusion-spacetime-geometry) implementing these principles. The toolkit:

Directly computes Fisher-Rao energies and geodesics for standard diffusion models.
Provides routines for path sampling (image, molecular), Jacobian approximations, and optimized interpolation.
Requires only a trained denoiser; no decoder retraining or architectural modification is needed.
Scales effectively to high-dimensional data settings, e.g., modern image or molecular generative models.

7. Outlook and Theoretical Relevance

The spacetime geometry of denoising opens several avenues:

Theoretical Understanding: Offers a rigorous, physically motivated, and computationally tractable lens to analyze and design diffusion models via their induced geometry, potentially informing advances in robustness, interpretability, and controllability of generative systems.
Extensions and Generalizations: The geometric approach generalizes to other classes of noise processes (e.g., non-Gaussian, continuous-time) and can inform the design of new generative procedures via specification of target manifold geometries.
Interdisciplinary Links: The geometric paradigm links statistical inference, machine learning, and fundamental physics, suggesting further connections to metric geometry, optimal transport, and quantum information theory.

This synthesis situates the spacetime geometry of denoising as a fundamental, unifying framework that concretely bridges diffusion generative modeling, information geometry, and the mathematical structure of statistical manifolds, offering both algorithmic benefits and conceptual depth (Karczewski et al., 23 May 2025).