
A Survey on Generative Diffusion Model (2209.02646v10)

Published 6 Sep 2022 in cs.AI

Abstract: Deep generative models have unlocked another profound realm of human creativity. By capturing and generalizing patterns within data, we have entered the epoch of all-encompassing Artificial Intelligence for General Creativity (AIGC). Notably, diffusion models, recognized as one of the paramount generative models, materialize human ideation into tangible instances across diverse domains, encompassing imagery, text, speech, biology, and healthcare. To provide advanced and comprehensive insights into diffusion, this survey comprehensively elucidates its developmental trajectory and future directions from three distinct angles: the fundamental formulation of diffusion, algorithmic enhancements, and the manifold applications of diffusion. Each layer is meticulously explored to offer a profound comprehension of its evolution. Structured and summarized approaches are presented in https://github.com/chq1155/A-Survey-on-Generative-Diffusion-Model.

Overview of Generative Diffusion Models

The paper provides a comprehensive survey on generative diffusion models, exploring their fundamental formulations, algorithmic improvements, and diverse applications across several domains. Diffusion models have emerged as a significant class of deep generative models, contributing to areas such as imagery, text, speech, biology, and healthcare.

Fundamental Formulations

Diffusion models revolve around a stochastic process that gradually transforms the data distribution into a simple prior, typically Gaussian, and reverses this corruption during sampling. Three foundational formulations underlie these processes:

  1. Denoising Diffusion Probabilistic Models (DDPM): DDPM employs a discrete-time forward process that injects Gaussian noise according to a predefined variance schedule until the data is reduced to approximately pure Gaussian noise. The reverse process denoises samples step by step using a learned neural network.
  2. Score SDE Formulation: Extends the discrete-time methods to a continuous-time stochastic differential equation framework, where the reverse process is driven by the score (the gradient of the log data density). The associated probability-flow ODE enables deterministic sampling and improves flexibility.
  3. Conditional Diffusion Probabilistic Models: These models incorporate conditions such as text or class labels, employing classifier-based or classifier-free guidance to steer generation toward controllable outputs.
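The DDPM forward process above has a convenient closed form: a noisy sample at any timestep can be drawn in one shot, and the network is trained to predict the injected noise. A minimal NumPy sketch, assuming an illustrative linear noise schedule (the schedule values, `T`, and the toy data are placeholders, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear variance schedule over T discrete steps.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative product alpha_bar_t

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps  # the network's training target is to recover eps from (xt, t)

x0 = rng.standard_normal(8)          # toy "data" vector
xt, eps = forward_noise(x0, t=T - 1)
print(alpha_bars[-1] < 1e-2)         # True: at t = T the signal is almost gone
```

With `alpha_bar_T` near zero, `x_T` is close to a standard Gaussian, which is why sampling can start from pure noise drawn from the prior.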

Algorithm Improvements

The paper delineates four primary areas of advancement aimed at improving diffusion models:

  1. Sampling Acceleration: Sampling in diffusion models inherently requires numerous iterations. Techniques such as knowledge distillation, training-free samplers, and hybrid models that combine diffusion with GANs or VAEs have been pursued to expedite sampling.
  2. Diffusion Process Design: Innovations have been made to improve the forward diffusion processes, including operating in latent spaces and on non-Euclidean spaces, enhancing the ease of reverse processes and broadening the scope of applicable domains.
  3. Likelihood Optimization: These strategies focus on optimizing the models' likelihood, improving the overall generative quality and learning efficiency.
  4. Bridging Distributions: Techniques have been developed to bridge arbitrary distributions, which is particularly useful for tasks like image-to-image translation.
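One widely used training-free acceleration, in the spirit of DDIM, replaces the full denoising chain with a short deterministic sub-sequence of timesteps. A sketch under stated assumptions: the schedule is illustrative and `eps_model` is a dummy stand-in for the learned noise predictor, just to make the update runnable:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def eps_model(xt, t):
    # Placeholder for the trained noise-prediction network.
    return 0.1 * xt

def ddim_step(xt, t, t_prev):
    """Deterministic update: estimate x0 from the noise prediction, then
    re-noise it to the earlier timestep t_prev (no fresh randomness)."""
    ab_t, ab_prev = alpha_bars[t], alpha_bars[t_prev]
    eps_hat = eps_model(xt, t)
    x0_hat = (xt - np.sqrt(1.0 - ab_t) * eps_hat) / np.sqrt(ab_t)
    return np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_hat

# 10 sampling steps instead of 1000:
timesteps = np.linspace(T - 1, 0, 11).astype(int)
x = np.random.default_rng(0).standard_normal(8)  # start from the Gaussian prior
for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
    x = ddim_step(x, t, t_prev)
print(x.shape)  # (8,)
```

Because the update is deterministic given the noise prediction, skipping timesteps trades a small amount of sample quality for a large reduction in network evaluations.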

Applications

Generative diffusion models find applications across multiple domains:

  • Image Generation: Models excel in generating high-fidelity images both conditionally (e.g., text-to-image synthesis) and unconditionally.
  • 3D and Video Generation: Extends diffusion to 3D object synthesis (e.g., point clouds) and to temporally coherent video frames.
  • Medical Imaging: Used for super-resolution, denoising, and reconstruction, aiding diagnosis and treatment planning.
  • Text Generation: Generates text under given conditions, often via non-autoregressive (parallel) decoding strategies.
  • Time Series and Audio Generation: Facilitates the synthesis of coherent sequences of data, aiding in prediction and transformation tasks.
  • Molecule and Graph Generation: Applied in science to model and predict molecular structures and interactions, significant for drug development.
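Across these conditional applications (text-to-image most prominently), control is commonly implemented with classifier-free guidance: the model's conditional and unconditional noise predictions are combined at each step. A minimal sketch; the toy vectors and the guidance-weight convention shown here are illustrative:

```python
import numpy as np

def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: extrapolate the conditional prediction
    away from the unconditional one. w = 0 recovers plain conditional
    sampling; larger w trades diversity for condition fidelity."""
    return (1.0 + w) * eps_cond - w * eps_uncond

eps_c = np.array([1.0, 0.5])   # toy conditional noise prediction
eps_u = np.array([0.2, 0.1])   # toy unconditional noise prediction
print(guided_eps(eps_c, eps_u, w=0.0))  # equals eps_c when w = 0
```

In practice the unconditional prediction comes from the same network with the condition dropped, so guidance costs one extra forward pass per step but needs no separate classifier.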

Implications and Future Directions

The survey points to diffusion models as pivotal in generative modeling, offering robust frameworks for capturing complex data distributions. Future work is likely to focus on accelerating sampling methods, exploring new diffusion processes, and integrating with various machine learning paradigms to overcome limitations posed by large-scale and high-dimensional data. Furthermore, exploring more efficient methods for bridging distribution gaps could enhance their applicability across diverse fields like AI-driven scientific research and biomedical advancements.

This comprehensive survey underlines diffusion models' versatility and transformative potential, establishing them as prominent contributors to the generative modeling landscape, with ample room for future exploration and development.

  188. X. Wang, J.-K. Yan, J.-Y. Cai, J.-H. Deng, Q. Qin, Q. Wang, H. Xiao, Y. Cheng, and P.-F. Ye, “Superresolution reconstruction of single image for latent features,” arXiv preprint arXiv:2211.12845, 2022.
  189. V. Voleti, C. Pal, and A. Oberman, “Score-based denoising diffusion with non-isotropic gaussian noise models,” arXiv preprint arXiv:2210.12254, 2022.
  190. T. Dockhorn, A. Vahdat, and K. Kreis, “Genie: Higher-order denoising diffusion solvers,” arXiv preprint arXiv:2210.05475, 2022.
  191. Y. Nikankin, N. Haim, and M. Irani, “Sinfusion: Training diffusion models on a single image or video,” arXiv preprint arXiv:2211.11743, 2022.
  192. G. Kwon and J. C. Ye, “Diffusion-based image translation using disentangled style and content representation,” arXiv preprint arXiv:2209.15264, 2022.
  193. B. Kolbeinsson and K. Mikolajczyk, “Multi-class segmentation from aerial views using recursive noise diffusion,” arXiv preprint arXiv:2212.00787, 2022.
  194. Z. Gu, H. Chen, Z. Xu, J. Lan, C. Meng, and W. Wang, “Diffusioninst: Diffusion model for instance segmentation,” arXiv preprint arXiv:2212.02773, 2022.
  195. J. Wolleb, R. Sandkühler, F. Bieder, P. Valmaggia, and P. C. Cattin, “Diffusion models for implicit image segmentation ensembles,” arXiv preprint arXiv:2112.03145, 2021.
  196. C. Saharia, W. Chan, S. Saxena, L. Li, J. Whang, E. Denton, S. K. S. Ghasemipour, B. K. Ayan, S. S. Mahdavi, R. G. Lopes et al., “Photorealistic text-to-image diffusion models with deep language understanding,” arXiv preprint arXiv:2205.11487, 2022.
  197. G. Kim and J. C. Ye, “Diffusionclip: Text-guided image manipulation using diffusion models,” 2021.
  198. G. Li, H. Zheng, C. Wang, C. Li, C. Zheng, and D. Tao, “3ddesigner: Towards photorealistic 3d object generation and editing with text-guided diffusion models,” arXiv preprint arXiv:2211.14108, 2022.
  199. X. Zeng, A. Vahdat, F. Williams, Z. Gojcic, O. Litany, S. Fidler, and K. Kreis, “Lion: Latent point diffusion models for 3d shape generation,” arXiv preprint arXiv:2210.06978, 2022.
  200. G. Metzer, E. Richardson, O. Patashnik, R. Giryes, and D. Cohen-Or, “Latent-nerf for shape-guided generation of 3d shapes and textures,” arXiv preprint arXiv:2211.07600, 2022.
  201. G. Nam, M. Khlifi, A. Rodriguez, A. Tono, L. Zhou, and P. Guerrero, “3d-ldm: Neural implicit 3d shape generation with latent diffusion models,” arXiv preprint arXiv:2212.00842, 2022.
  202. J. R. Shue, E. R. Chan, R. Po, Z. Ankner, J. Wu, and G. Wetzstein, “3d neural field generation using triplane diffusion,” arXiv preprint arXiv:2211.16677, 2022.
  203. A.-C. Cheng, X. Li, S. Liu, M. Sun, and M.-H. Yang, “Autoregressive 3d shape generation via canonical mapping,” arXiv preprint arXiv:2204.01955, 2022.
  204. G. Shim, M. Lee, and J. Choo, “Refu: Refine and fuse the unobserved view for detail-preserving single-image 3d human reconstruction,” in Proceedings of the 30th ACM International Conference on Multimedia, 2022, pp. 6850–6859.
  205. D. Wei, H. Sun, B. Li, J. Lu, W. Li, X. Sun, and S. Hu, “Human joint kinematics diffusion-refinement for stochastic motion prediction,” arXiv preprint arXiv:2210.05976, 2022.
  206. Y. Yuan, J. Song, U. Iqbal, A. Vahdat, and J. Kautz, “Physdiff: Physics-guided human motion diffusion model,” arXiv preprint arXiv:2212.02500, 2022.
  207. J. Gong, L. G. Foo, Z. Fan, Q. Ke, H. Rahmani, and J. Liu, “Diffpose: Toward more reliable 3d pose estimation,” arXiv preprint arXiv:2211.16940, 2022.
  208. K. Holmquist and B. Wandt, “Diffpose: Multi-hypothesis human pose estimation using diffusion models,” arXiv preprint arXiv:2211.16487, 2022.
  209. G. Tevet, S. Raab, B. Gordon, Y. Shafir, D. Cohen-Or, and A. H. Bermano, “Human motion diffusion model,” arXiv preprint arXiv:2209.14916, 2022.
  210. Y.-C. Cheng, H.-Y. Lee, S. Tulyakov, A. Schwing, and L. Gui, “Sdfusion: Multimodal 3d shape completion, reconstruction, and generation,” arXiv preprint arXiv:2212.04493, 2022.
  211. K. Mei and V. M. Patel, “Vidm: Video implicit diffusion models,” arXiv preprint arXiv:2212.00235, 2022.
  212. U. Singer, A. Polyak, T. Hayes, X. Yin, J. An, S. Zhang, Q. Hu, H. Yang, O. Ashual, O. Gafni et al., “Make-a-video: Text-to-video generation without text-video data,” arXiv preprint arXiv:2209.14792, 2022.
  213. G. Kim, H. Shim, H. Kim, Y. Choi, J. Kim, and E. Yang, “Diffusion video autoencoders: Toward temporally consistent face video editing via disentangled video encoding,” arXiv preprint arXiv:2212.02802, 2022.
  214. Z.-X. Cui, C. Cao, S. Liu, Q. Zhu, J. Cheng, H. Wang, Y. Zhu, and D. Liang, “Self-score: Self-supervised learning on score-based models for mri reconstruction,” arXiv preprint arXiv:2209.00835, 2022.
  215. A. Jalal, M. Arvinte, G. Daras, E. Price, A. G. Dimakis, and J. Tamir, “Robust compressed sensing mri with deep generative priors,” Advances in Neural Information Processing Systems, vol. 34, pp. 14 938–14 954, 2021.
  216. P. Rouzrokh, B. Khosravi, S. Faghani, M. Moassefi, S. Vahdati, and B. J. Erickson, “Multitask brain tumor inpainting with diffusion models: A methodological report,” arXiv preprint arXiv:2210.12113, 2022.
  217. J. Wyatt, A. Leach, S. M. Schmon, and C. G. Willcocks, “Anoddpm: Anomaly detection with denoising diffusion probabilistic models using simplex noise,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 650–656.
  218. J. Wu, H. Fang, Y. Zhang, Y. Yang, and Y. Xu, “Medsegdiff: Medical image segmentation with diffusion probabilistic model,” arXiv preprint arXiv:2211.00611, 2022.
  219. Z. He, T. Sun, K. Wang, X. Huang, and X. Qiu, “Diffusionbert: Improving generative masked language models with diffusion models,” arXiv preprint arXiv:2211.15029, 2022.
  220. S. W. Park, K. Lee, and J. Kwon, “Neural markov controlled sde: Stochastic optimization for continuous-time data,” in ICLR, 2021.
  221. K. Rasul, C. Seward, I. Schuster, and R. Vollgraf, “Autoregressive denoising diffusion models for multivariate probabilistic time series forecasting,” in International Conference on Machine Learning.   PMLR, 2021, pp. 8857–8868.
  222. J. Lee and S. Han, “Nu-wave: A diffusion probabilistic model for neural audio upsampling,” arXiv preprint arXiv:2104.02321, 2021.
  223. Z. Qiu, M. Fu, Y. Yu, L. Yin, F. Sun, and H. Huang, “Srtnet: Time domain speech enhancement via stochastic refinement,” arXiv preprint arXiv:2210.16805, 2022.
  224. D.-Y. Wu, W.-Y. Hsiao, F.-R. Yang, O. Friedman, W. Jackson, S. Bruzenak, Y.-W. Liu, and Y.-H. Yang, “Ddsp-based singing vocoders: A new subtractive-based synthesizer and a comprehensive evaluation,” arXiv preprint arXiv:2208.04756, 2022.
  225. D. Yang, J. Yu, H. Wang, W. Wang, C. Weng, Y. Zou, and D. Yu, “Diffsound: Discrete diffusion model for text-to-sound generation,” arXiv preprint arXiv:2207.09983, 2022.
  226. Y. Leng, Z. Chen, J. Guo, H. Liu, J. Chen, X. Tan, D. Mandic, L. He, X.-Y. Li, T. Qin et al., “Binauralgrad: A two-stage conditional diffusion probabilistic model for binaural audio synthesis,” arXiv preprint arXiv:2205.14807, 2022.
  227. R. Scheibler, Y. Ji, S.-W. Chung, J. Byun, S. Choe, and M.-S. Choi, “Diffusion-based generative speech source separation,” arXiv preprint arXiv:2210.17327, 2022.
  228. S. Han, H. Ihm, D. Ahn, and W. Lim, “Instrument separation of symbolic music by explicitly guided diffusion model,” arXiv preprint arXiv:2209.02696, 2022.
  229. S. Liu, Y. Cao, D. Su, and H. Meng, “Diffsvc: A diffusion probabilistic model for singing voice conversion,” in IEEE ASRU, 2021.
  230. V. Popov, I. Vovk, V. Gogoryan, T. Sadekova, M. S. Kudinov, and J. Wei, “Diffusion-based voice conversion with fast maximum likelihood sampling scheme,” in ICLR, 2021.
  231. M. Jeong, H. Kim, S. J. Cheon, B. J. Choi, and N. S. Kim, “Diff-TTS: A Denoising Diffusion Model for Text-to-Speech,” in Proc. Interspeech 2021, 2021, pp. 3605–3609.
  232. H. Kim, S. Kim, and S. Yoon, “Guided-tts: A diffusion model for text-to-speech via classifier guidance,” in ICML.   PMLR, 2022, pp. 11 119–11 133.
  233. S. Kim, H. Kim, and S. Yoon, “Guided-tts 2: A diffusion model for high-quality adaptive text-to-speech with untranscribed data,” arXiv preprint arXiv:2205.15370, 2022.
  234. A. Levkovitch, E. Nachmani, and L. Wolf, “Zero-shot voice conditioning for denoising diffusion tts models,” arXiv preprint arXiv:2206.02246, 2022.
  235. Y. Koizumi, H. Zen, K. Yatabe, N. Chen, and M. Bacchiani, “Specgrad: Diffusion probabilistic model based neural vocoder with adaptive noise spectral shaping,” arXiv preprint arXiv:2203.16749, 2022.
  236. S. Wu and Z. Shi, “Itôtts and itôwave: Linear stochastic differential equation is all you need for audio generation,” arXiv e-prints, pp. arXiv–2105, 2021.
  237. Z. Chen, X. Tan, K. Wang, S. Pan, D. Mandic, L. He, and S. Zhao, “Infergrad: Improving diffusion models for vocoder by considering inference in training,” in ICASSP.   IEEE, 2022, pp. 8432–8436.
  238. C. Shi, S. Luo, M. Xu, and J. Tang, “Learning gradient fields for molecular conformation generation,” in ICML.   PMLR, 2021, pp. 9558–9568.
  239. E. Hoogeboom, V. G. Satorras, C. Vignac, and M. Welling, “Equivariant diffusion for molecule generation in 3d,” in ICML.   PMLR, 2022, pp. 8867–8887.
  240. T. Xie, X. Fu, O.-E. Ganea, R. Barzilay, and T. S. Jaakkola, “Crystal diffusion variational autoencoder for periodic material generation,” in ICLR, 2021.
  241. J. S. Lee and P. M. Kim, “Proteinsgm: Score-based generative modeling for de novo protein design,” bioRxiv, 2022.
  242. G. Corso, H. Stärk, B. Jing, R. Barzilay, and T. Jaakkola, “Diffdock: Diffusion steps, twists, and turns for molecular docking,” arXiv preprint arXiv:2210.01776, 2022.
  243. A. Schneuing, Y. Du, C. Harris, A. Jamasb, I. Igashov, W. Du, T. Blundell, P. Lió, C. Gomes, M. Welling et al., “Structure-based drug design with equivariant diffusion models,” arXiv preprint arXiv:2210.13695, 2022.
  244. C. Shi, C. Wang, J. Lu, B. Zhong, and J. Tang, “Protein sequence and structure co-design with equivariant translation,” arXiv preprint arXiv:2210.08761, 2022.
  245. I. Igashov, H. Stärk, C. Vignac, V. G. Satorras, P. Frossard, M. Welling, M. Bronstein, and B. Correia, “Equivariant 3d-conditional diffusion models for molecular linker design,” arXiv preprint arXiv:2210.05274, 2022.
  246. K. E. Wu, K. K. Yang, R. v. d. Berg, J. Y. Zou, A. X. Lu, and A. P. Amini, “Protein structure generation via folding diffusion,” arXiv preprint arXiv:2209.15611, 2022.
  247. B. L. Trippe, J. Yim, D. Tischer, T. Broderick, D. Baker, R. Barzilay, and T. Jaakkola, “Diffusion probabilistic modeling of protein backbones in 3d for the motif-scaffolding problem,” arXiv preprint arXiv:2206.04119, 2022.
  248. W. Jin, J. Wohlwend, R. Barzilay, and T. S. Jaakkola, “Iterative refinement graph neural network for antibody sequence-structure co-design,” in ICLR, 2021.
  249. T. Fu and J. Sun, “Antibody complementarity determining regions (cdrs) design using constrained energy model,” in SIGKDD, 2022, pp. 389–399.
  250. W. Jin, R. Barzilay, and T. Jaakkola, “Antibody-antigen docking and design via hierarchical structure refinement,” in ICML.   PMLR, 2022, pp. 10 217–10 227.
  251. A. Borji, “Pros and cons of gan evaluation measures: New developments,” Computer Vision and Image Understanding, vol. 215, p. 103329, 2022.
  252. T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen, “Improved techniques for training gans,” NIPS, vol. 29, 2016.
  253. M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter, “Gans trained by a two time-scale update rule converge to a local nash equilibrium,” in NIPS.   Curran Associates, Inc.
  254. A. Razavi, A. Van den Oord, and O. Vinyals, “Generating diverse high-fidelity images with vq-vae-2,” NIPS, vol. 32, 2019.
  255. T. M. Nguyen, A. Garg, R. G. Baraniuk, and A. Anandkumar, “Infocnf: Efficient conditional continuous normalizing flow using adaptive solvers,” 2019.
  256. Z. Ziegler and A. Rush, “Latent normalizing flows for discrete sequences,” in ICML.   PMLR, 2019, pp. 7673–7682.
  257. J. Tomczak and M. Welling, “Vae with a vampprior,” in AISTATS.   PMLR, 2018, pp. 1214–1223.
  258. O. Rybkin, K. Daniilidis, and S. Levine, “Simple and effective vae training with calibrated decoders,” in ICML.   PMLR, 2021, pp. 9179–9189.
  259. A. Krizhevsky, G. Hinton et al., “Learning multiple layers of features from tiny images,” 2009.
  260. A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” Communications of the ACM, vol. 60, no. 6, pp. 84–90, 2017.
  261. Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep learning face attributes in the wild,” in ICCV, December 2015.
  262. F. Yu, A. Seff, Y. Zhang, S. Song, T. Funkhouser, and J. Xiao, “Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop,” arXiv preprint arXiv:1506.03365, 2015.
  263. T. Karras, S. Laine, and T. Aila, “A style-based generator architecture for generative adversarial networks,” in CVPR, June 2019.
  264. Y. LeCun and C. Cortes, “MNIST handwritten digit database.”
  265. H. Chung, B. Sim, D. Ryu, and J. C. Ye, “Improving diffusion models for inverse problems using manifold constraints,” arXiv.
  266. Y. Song, S. Garg, J. Shi, and S. Ermon, “Sliced score matching: A scalable approach to density and score estimation,” in Uncertainty in Artificial Intelligence, 2020.
  267. Y. Song and S. Ermon, “Improved techniques for training score-based generative models,” NIPS, 2020.
  268. Q. Zhang and Y. Chen, “Diffusion normalizing flow,” NIPS.
  269. R. Gao, Y. Song, B. Poole, Y. N. Wu, and D. P. Kingma, “Learning energy-based models by diffusion recovery likelihood,” arXiv preprint arXiv:2012.08125, 2020.
  270. Y. Song and D. P. Kingma, “How to train your energy-based models,” arXiv preprint arXiv:2101.03288, 2021.
  271. V. De Bortoli, A. Doucet, J. Heng, and J. Thornton, “Simulating diffusion bridges with score matching,” arXiv preprint arXiv:2111.07243, 2021.
  272. L. Zhou, Y. Du, and J. Wu, “3d shape generation and completion through point-voxel diffusion,” in ICCV, 2021.
Authors (7)
  1. Hanqun Cao
  2. Cheng Tan
  3. Zhangyang Gao
  4. Yilun Xu
  5. Guangyong Chen
  6. Pheng-Ann Heng
  7. Stan Z. Li
Citations (122)