
Closing the ODE-SDE gap in score-based diffusion models through the Fokker-Planck equation (2311.15996v1)

Published 27 Nov 2023 in cs.LG, cs.NA, math.NA, and stat.ML

Abstract: Score-based diffusion models have emerged as one of the most promising frameworks for deep generative modelling, due to their state-of-the-art performance in many generation tasks while relying on mathematical foundations such as stochastic differential equations (SDEs) and ordinary differential equations (ODEs). Empirically, it has been reported that ODE-based samples are inferior to SDE-based samples. In this paper we rigorously describe the range of dynamics and approximations that arise when training score-based diffusion models, including the true SDE dynamics, the neural approximations, the various approximate particle dynamics that result, as well as their associated Fokker-Planck equations and the neural network approximations of these Fokker-Planck equations. We systematically analyse the difference between the ODE and SDE dynamics of score-based diffusion models and link it to an associated Fokker-Planck equation. We derive a theoretical upper bound on the Wasserstein-2 distance between the ODE- and SDE-induced distributions in terms of a Fokker-Planck residual. We also show numerically that conventional score-based diffusion models can exhibit significant differences between ODE- and SDE-induced distributions, which we demonstrate using explicit comparisons. Moreover, we show numerically that reducing the Fokker-Planck residual by adding it as an additional regularisation term closes the gap between ODE- and SDE-induced distributions. Our experiments suggest that this regularisation can improve the distribution generated by the ODE, though this can come at the cost of degraded SDE sample quality.
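
For orientation on the two samplers the abstract compares: in the standard score-based diffusion setup (Song et al., 2021), a forward noising SDE is paired with a reverse-time SDE and a deterministic probability flow ODE that share the same marginals when the exact score is used. The LaTeX below summarises this standard background; it is not taken verbatim from the paper.

```latex
% Standard score-based diffusion dynamics (Song et al., 2021).
% With the exact score \nabla_x \log p_t, the reverse SDE and the
% probability flow ODE induce identical marginals p_t.
\begin{align}
  \mathrm{d}x &= f(x,t)\,\mathrm{d}t + g(t)\,\mathrm{d}w
      && \text{(forward SDE)} \\
  \mathrm{d}x &= \bigl[f(x,t) - g(t)^{2}\,\nabla_x \log p_t(x)\bigr]\,\mathrm{d}t
      + g(t)\,\mathrm{d}\bar{w}
      && \text{(reverse SDE)} \\
  \frac{\mathrm{d}x}{\mathrm{d}t} &= f(x,t)
      - \tfrac{1}{2}\,g(t)^{2}\,\nabla_x \log p_t(x)
      && \text{(probability flow ODE)}
\end{align}
% The marginals solve the Fokker-Planck equation
%   \partial_t p_t = -\nabla \cdot (f\,p_t) + \tfrac{1}{2}\,g^{2}\,\Delta p_t,
% and the paper's Fokker-Planck residual quantifies how far a learned
% score is from satisfying this PDE; the paper bounds the Wasserstein-2
% distance between ODE- and SDE-induced distributions by this residual.
```

As a rough illustration of the regularisation idea in the abstract, the sketch below computes a Fokker-Planck residual for a learned score, to be added (squared) to a score-matching loss. It is a minimal sketch under simplifying assumptions (a driftless forward SDE, f = 0), not the paper's implementation: `score_model`, `g`, and the weight `lam` are hypothetical names, and the paper's exact residual and weighting differ in detail.

```python
import torch
from torch.func import grad, jacrev

def fp_residual(score_model, x, t, g):
    """Fokker-Planck residual of a learned score s_theta(x, t) for a
    driftless forward SDE dx = g(t) dw.  The exact score satisfies
        d_t s = grad_x[ (g(t)^2 / 2) * (div s + ||s||^2) ],
    so a nonzero residual signals an ODE-SDE mismatch.
    x: sample of shape (d,); t: scalar tensor; g: callable t -> scale.
    """
    s = lambda x_, t_: score_model(x_, t_)      # score field in R^d

    ds_dt = jacrev(s, argnums=1)(x, t)          # time derivative, shape (d,)

    def h(x_):                                  # scalar (g^2/2)(div s + ||s||^2)
        J = jacrev(s, argnums=0)(x_, t)         # (d, d) spatial Jacobian
        return 0.5 * g(t) ** 2 * (torch.trace(J) + (s(x_, t) ** 2).sum())

    # Exact Jacobian traces are only viable in low dimension; large-scale
    # training would use a stochastic (Hutchinson-style) trace estimator.
    return ds_dt - grad(h)(x)                   # residual vector in R^d

# Illustrative regularised objective (weight `lam` is hypothetical):
#   loss = dsm_loss + lam * fp_residual(score_model, x, t, g).pow(2).sum()
```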

Authors (5)
  1. Teo Deveney (6 papers)
  2. Jan Stanczuk (8 papers)
  3. Lisa Maria Kreusser (22 papers)
  4. Chris Budd (10 papers)
  5. Carola-Bibiane Schönlieb (276 papers)
Citations (3)
