Theoretical research on generative diffusion models: an overview (2404.09016v1)

Published 13 Apr 2024 in cs.LG, cs.AI, and cs.CV

Abstract: Generative diffusion models have achieved strong results in many fields and rest on a solid theoretical foundation. They gradually convert the data distribution into noise and then reverse this process, removing the noise to recover a distribution close to the original. Many existing reviews focus on specific application areas rather than on research into the algorithm itself. In contrast, this survey investigates the theoretical developments of generative diffusion models, which fall mainly into two groups: training-based and sampling-based approaches. Recognizing this division yields a clear and understandable categorization for researchers pursuing new developments in the future.
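
As a reading aid, here is a minimal sketch of the forward/reverse process the abstract describes, written against the standard DDPM formulation (Ho et al., 2020) rather than taken from this paper; the linear beta schedule and the placeholder `eps_model` are illustrative assumptions, since a real noise predictor would be a trained network.

```python
import numpy as np

# Standard DDPM quantities (assumed, not from the paper): a linear noise
# schedule beta_t and its cumulative products alpha_bar_t.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def forward_sample(x0, t, rng):
    """Forward process: sample x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

def eps_model(x_t, t):
    """Placeholder noise predictor; in practice a trained neural network."""
    return np.zeros_like(x_t)

def reverse_sample(shape, rng):
    """Reverse process: ancestral sampling from pure noise back toward data."""
    x = rng.standard_normal(shape)
    for t in reversed(range(T)):
        eps_hat = eps_model(x, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise
    return x

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8,))            # toy "data" vector
xt = forward_sample(x0, t=500, rng=rng)   # data -> noise
x_gen = reverse_sample((8,), rng=rng)     # noise -> sample
```

Roughly, in the survey's categorization, training-based methods change how the noise predictor is learned, while sampling-based methods change the reverse loop (e.g., fewer steps or different solvers) around a fixed trained model.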
