Generative Diffusion From An Action Principle (2310.04490v1)

Published 6 Oct 2023 in cs.LG and physics.class-ph

Abstract: Generative diffusion models synthesize new samples by reversing a diffusive process that converts a given data set to generic noise. This is accomplished by training a neural network to match the gradient of the log of the probability distribution of a given data set, also called the score. By casting reverse diffusion as an optimal control problem, we show that score matching can be derived from an action principle, like the ones commonly used in physics. We use this insight to demonstrate the connection between different classes of diffusion models.
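The score-matching objective mentioned in the abstract can be illustrated with a toy denoising score matching setup. The sketch below is an illustration under simplifying assumptions, not the paper's method: the data are one-dimensional standard-normal samples, the "model" is a single linear slope rather than a neural network, and the regression target is the score of the Gaussian perturbation kernel, -(x_t - x)/sigma^2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: samples from N(0, 1), so the noised distribution
# N(0, 1 + sigma^2) has a known score we can check against.
x = rng.standard_normal(100_000)
sigma = 0.5

# Forward (noising) step: x_t = x + sigma * eps
eps = rng.standard_normal(x.shape)
x_t = x + sigma * eps

def dsm_loss(slope):
    """Denoising score matching loss for a linear score model
    s(x_t) = slope * x_t. The regression target is the score of
    the Gaussian perturbation kernel, -(x_t - x) / sigma**2."""
    target = -(x_t - x) / sigma**2
    return np.mean((slope * x_t - target) ** 2)

# Scan candidate slopes; the minimizer should sit near the slope of
# the true score of N(0, 1 + sigma^2), i.e. -1 / (1 + sigma^2) = -0.8.
slopes = np.linspace(-2.0, 0.0, 201)
best = slopes[np.argmin([dsm_loss(s) for s in slopes])]
print(best)  # close to -0.8
```

The point of the toy example is the one the abstract relies on: minimizing the denoising objective recovers the score of the noised data distribution, which is exactly what reverse diffusion needs to run backwards.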

