Noise in the reverse process improves the approximation capabilities of diffusion models (2312.07851v2)

Published 13 Dec 2023 in cs.LG, cs.SY, eess.SY, and math.OC

Abstract: In Score-based Generative Models (SGMs), the state of the art in generative modeling, stochastic reverse processes are known to perform better than their deterministic counterparts. This paper delves into the heart of this phenomenon, comparing neural ordinary differential equations (ODEs) and neural stochastic differential equations (SDEs) as reverse processes. We take a control-theoretic perspective, posing the approximation of the reverse process as a trajectory-tracking problem. We analyze the ability of neural SDEs to approximate trajectories of the Fokker-Planck equation, revealing the advantages of stochasticity. First, neural SDEs exhibit a powerful regularizing effect, enabling $L^2$-norm trajectory approximation that surpasses the Wasserstein-metric approximation achieved by neural ODEs under similar conditions, even when the reference vector field or score function is not Lipschitz. Applying this result, we establish the class of distributions that can be sampled using score matching in SGMs, relaxing the Lipschitz requirement on the gradient of the data distribution imposed in existing literature. Second, we show that this approximation property is preserved when the network width is limited to the input dimension of the network. In this limited-width case, the weights act as control inputs, framing our analysis as a controllability problem for neural SDEs in probability density space. This sheds light on how noise helps steer the system towards the desired solution and illuminates the empirical success of stochasticity in generative modeling.
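For context, the deterministic and stochastic reverse processes contrasted in the abstract are usually written as follows. This is the standard score-SDE formulation rather than notation taken from this paper, and the symbols $f$, $g$, $p_t$, $s_\theta$ are generic placeholders. The forward (noising) process

$$\mathrm{d}x = f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w$$

can be reversed either stochastically, via the reverse-time SDE

$$\mathrm{d}x = \big[f(x,t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{w},$$

or deterministically, via the probability flow ODE

$$\frac{\mathrm{d}x}{\mathrm{d}t} = f(x,t) - \tfrac{1}{2}\,g(t)^2\,\nabla_x \log p_t(x).$$

With the exact score $\nabla_x \log p_t$, both produce the same marginals $p_t$; the paper's comparison concerns what happens when the score is replaced by a learned approximation $s_\theta(x,t)$, and its claim is that the noise term in the SDE then regularizes the trajectory-approximation error.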

