Ambient-dimension independence of score-based diffusion models under the manifold hypothesis

Determine whether score-based generative models (denoising diffusion probabilistic models) can, under the manifold hypothesis, yield estimators whose Wasserstein distance to the target distribution is independent of the ambient dimension D; that is, whether the dimension-independent performance of optimal manifold-based estimators is also attainable for diffusion models.

Background

The manifold hypothesis posits that high-dimensional data often lie on low-dimensional manifolds within the ambient space. Prior work (Divol, 2022) showed that, under this hypothesis, one can construct estimators of the data distribution whose Wasserstein distance to the true distribution is independent of the ambient dimension D.
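
The property in question can be written out as a short sketch. The notation here is assumed for illustration and is not taken from the source: $\mu$ is the data distribution, supported on a $d$-dimensional manifold $M \subset \mathbb{R}^D$ with $d \ll D$; $\hat{\mu}_n$ is an estimator built from $n$ samples; $W_p$ is the $p$-Wasserstein distance; and $\varepsilon$ is a placeholder for a rate, with no specific rate claimed.

$$
W_p(\mu, \nu) = \Bigl( \inf_{\pi \in \Pi(\mu, \nu)} \int \|x - y\|^p \, d\pi(x, y) \Bigr)^{1/p},
\qquad
\mathbb{E}\, W_p(\hat{\mu}_n, \mu) \le \varepsilon(n, d, M),
$$

where $\Pi(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$. Dimension independence means that the bound $\varepsilon$ may depend on the sample size $n$, the intrinsic dimension $d$, and the regularity of $M$ and $\mu$, but not on $D$. Stating the bound in expectation is one possible formalization; a high-probability version would serve the same purpose.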

In contrast, diffusion models (denoising diffusion probabilistic models) are defined in the ambient space, and the theoretical bounds available for them depended strongly on D. The paper (Azangulov et al., 2024) notes that, before its contribution, it remained an open question whether diffusion models could achieve the same dimension-independent behavior under the manifold hypothesis.

References

The manifold hypothesis is particularly relevant as a way to understand the behavior of statistical or learning algorithms in high dimensions; see for instance \textcite{divol2022measure} who showed that under the manifold hypothesis, it is possible to construct estimators of $\mu$ whose Wasserstein distance to $\mu$ is independent of the ambient dimension $D$. Whether this is feasible in the context of score-matching diffusion models has been an open question so far.

Convergence of Diffusion Models Under the Manifold Hypothesis in High-Dimensions (2409.18804 - Azangulov et al., 27 Sep 2024) in Subsection "Manifold Hypothesis" (Section 2), labeled \label{sec:Statistical_Model_for_Manifolds}