
Diffusion Models: A Comprehensive Survey of Methods and Applications (2209.00796v14)

Published 2 Sep 2022 in cs.LG, cs.AI, and cs.CV

Abstract: Diffusion models have emerged as a powerful new family of deep generative models with record-breaking performance in many applications, including image synthesis, video generation, and molecule design. In this survey, we provide an overview of the rapidly expanding body of work on diffusion models, categorizing the research into three key areas: efficient sampling, improved likelihood estimation, and handling data with special structures. We also discuss the potential for combining diffusion models with other generative models for enhanced results. We further review the wide-ranging applications of diffusion models in fields spanning from computer vision, natural language generation, temporal data modeling, to interdisciplinary applications in other scientific disciplines. This survey aims to provide a contextualized, in-depth look at the state of diffusion models, identifying the key areas of focus and pointing to potential areas for further exploration. Github: https://github.com/YangLing0818/Diffusion-Models-Papers-Survey-Taxonomy.


Diffusion models have established themselves as a significant advance in deep generative modeling, rivaling the previously dominant Generative Adversarial Networks (GANs) in tasks such as image synthesis, video generation, and molecule design. The paper "Diffusion Models: A Comprehensive Survey of Methods and Applications" surveys this rapidly expanding body of work, categorizing the research into key areas and reviewing the wide range of diffusion model applications.

Foundational Framework

The paper begins with a structured introduction to the foundations of diffusion models. It details three principal formulations: Denoising Diffusion Probabilistic Models (DDPMs), Score-Based Generative Models (SGMs), and Stochastic Differential Equations (Score SDEs). Each formulation specifies a mechanism for progressively transforming data into noise and for learning to reverse that corruption to generate new samples.

  1. DDPMs: These models use a Markov chain that progressively perturbs data with Gaussian noise, together with a learnable reverse process that gradually denoises the corrupted samples to generate new data (a minimal training sketch follows this list).
  2. SGMs: Central to these models is the score function, defined as the gradient of the log probability density. SGMs perturb data with Gaussian noise at multiple scales and learn to estimate the score function at each noise level.
  3. Score SDEs: Generalizing DDPMs and SGMs from finitely many time steps to continuous time, Score SDEs use stochastic differential equations to define the forward diffusion and its learned reverse process.
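
To ground the DDPM formulation, the following is a minimal PyTorch sketch of the forward noising process and the standard noise-prediction training loss. It assumes a hypothetical noise-prediction network model(x_t, t); the linear schedule endpoints are illustrative defaults, not settings prescribed by the survey.

```python
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)       # linear noise schedule (illustrative)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)   # \bar{alpha}_t = prod_{s<=t} alpha_s

def ddpm_loss(model, x0):
    """Simple noise-prediction loss: sample a timestep, corrupt x0
    according to q(x_t | x_0), and regress the model onto the added noise."""
    b = x0.shape[0]
    t = torch.randint(0, T, (b,))
    noise = torch.randn_like(x0)
    ab = alpha_bars[t].view(b, *([1] * (x0.dim() - 1)))
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise
    return F.mse_loss(model(x_t, t), noise)
```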

Efficient Sampling

A major practical obstacle to deploying diffusion models is the computational cost of their iterative sampling process. Recent advances aim to improve sampling efficiency without compromising sample quality.

  1. Learning-Free Sampling: This includes improved discretization schemes for the reverse SDEs and ODEs, such as Heun's method and predictor-corrector strategies, which balance sampling speed against accuracy (a deterministic-sampler sketch follows this list).
  2. Learning-Based Sampling: Techniques such as optimized discretization of time steps, truncated diffusion processes, and knowledge distillation reduce the number of sampling steps while maintaining or even improving sample quality.
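
As one hedged illustration of learning-free acceleration, the sketch below implements a deterministic DDIM-style sampler that jumps along a coarse grid of timesteps. It reuses the alpha_bars schedule and the hypothetical noise predictor model from the earlier sketch.

```python
import torch

@torch.no_grad()
def ddim_sample(model, shape, alpha_bars, steps=50):
    """Deterministic DDIM-style sampling (the eta = 0 variant): predict x_0
    from the current noisy sample, then jump directly to the next timestep
    on the coarse grid instead of stepping through all T timesteps."""
    T = alpha_bars.shape[0]
    ts = torch.linspace(T - 1, 0, steps).long()
    x = torch.randn(shape)
    for i in range(len(ts) - 1):
        t, t_prev = ts[i], ts[i + 1]
        t_batch = torch.full((shape[0],), int(t))
        eps = model(x, t_batch)
        ab_t, ab_prev = alpha_bars[t], alpha_bars[t_prev]
        x0_pred = (x - (1.0 - ab_t).sqrt() * eps) / ab_t.sqrt()       # estimate x_0
        x = ab_prev.sqrt() * x0_pred + (1.0 - ab_prev).sqrt() * eps   # jump to t_prev
    return x
```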

Improved Likelihood Estimation

Diffusion models traditionally rely on a variational lower bound (VLB) for likelihood estimation, and tightening this bound is key to improving likelihood-based performance.

  1. Noise Schedule Optimization: Optimizing the noise schedule of the forward process tightens the VLB, leading to higher log-likelihood values (a concrete schedule appears in the sketch after this list).
  2. Reverse Variance Learning: Learning the variances of the reverse process, rather than fixing them, yields more accurate density estimates.
  3. Exact Likelihood Computation: Score SDEs admit an associated probability-flow ODE; integrating it with accurate numerical solvers enables exact computation, and hence direct maximization, of the data likelihood.
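
As a concrete instance of noise-schedule design, the sketch below transcribes the cosine schedule of Nichol and Dhariwal (2021), which is associated with improved log-likelihoods; treat it as a non-authoritative rendering of one point in the design space the survey discusses.

```python
import math
import torch

def cosine_schedule(T=1000, s=0.008):
    """Cosine noise schedule: \bar{alpha}_t follows a squared cosine,
    which corrupts data more gently near t = 0 than a linear schedule."""
    t = torch.arange(T + 1) / T
    f = torch.cos((t + s) / (1 + s) * math.pi / 2) ** 2
    alpha_bars = f[1:] / f[0]
    betas = (1 - f[1:] / f[:-1]).clamp(max=0.999)  # avoid degenerate steps
    return alpha_bars, betas
```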

Handling Special Structures

Given the varied nature of data, diffusion models have been adapted to address data with specific structures, including discrete data, invariant properties, and manifold structures.

  1. Discrete Data: Techniques such as random walk transition kernels for discrete spaces and generalizations of score functions extend diffusion models to handle discrete datasets efficiently.
  2. Invariant Structures: Models like GDSS exploit permutation invariance for graph data (a toy illustration follows this list), while others guarantee translation and rotation invariance for molecular data.
  3. Manifold Structures: Extending diffusion models to Riemannian manifolds and employing autoencoders to learn latent manifolds are key to making diffusion models applicable to a broader range of data modalities.
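
As a toy illustration of the invariance idea (an assumption for exposition, not the GDSS algorithm itself), noising a symmetric adjacency matrix with symmetrized Gaussian noise makes the forward process commute with node relabelings.

```python
import torch

def perturb_graph(A, sigma):
    """Add symmetric Gaussian noise to a symmetric adjacency matrix A.
    Since entries are noised i.i.d. and then symmetrized, relabeling the
    nodes (P A P^T) commutes with the perturbation in distribution, so the
    forward process is permutation-equivariant."""
    E = torch.randn_like(A)
    E = (E + E.transpose(-1, -2)) / 2  # symmetrize so noise respects undirected edges
    return A + sigma * E
```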

Connections with Other Generative Models

Diffusion models have shown potential for integration with other generative models, enhancing their application scope and performance.

  1. VAEs: Pairing diffusion models with VAEs, for example by diffusing in a learned latent space, improves representation learning and sampling efficiency (see the sketch after this list).
  2. GANs: Diffusion processes can stabilize GAN training and improve sample quality by injecting scheduled noise into the training procedure.
  3. Normalizing Flows: Combining flows with diffusion processes enables modeling of complex data distributions with fewer diffusion steps.
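
A hedged sketch of one such combination, in the spirit of latent-space diffusion (e.g., LSGM): a VAE-style encoder/decoder pair maps data into a latent space where the usual noise-prediction loss is applied. All module names and the simple additive loss are illustrative placeholders, not the method of any specific paper.

```python
import torch
import torch.nn.functional as F

def latent_diffusion_loss(encoder, decoder, denoiser, x0, alpha_bars):
    """Train a denoiser on VAE latents: encode, noise the latent with the
    DDPM forward process, regress onto the noise, and keep a reconstruction
    term so the decoder can map generated latents back to data space."""
    z0 = encoder(x0)                                   # data -> latent
    b = z0.shape[0]
    T = alpha_bars.shape[0]
    t = torch.randint(0, T, (b,))
    eps = torch.randn_like(z0)
    ab = alpha_bars[t].view(b, *([1] * (z0.dim() - 1)))
    z_t = ab.sqrt() * z0 + (1.0 - ab).sqrt() * eps
    diffusion_term = F.mse_loss(denoiser(z_t, t), eps)
    recon_term = F.mse_loss(decoder(z0), x0)           # VAE-style reconstruction
    return diffusion_term + recon_term
```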

Applications Across Domains

The versatility of diffusion models is highlighted through their applications in various domains:

  1. Computer Vision: Tasks such as image super-resolution, inpainting, and image-to-image translation benefit from diffusion models' ability to generate high-quality images.
  2. Natural Language Processing: Text generation and conditional text synthesis are areas where diffusion models have shown significant promise.
  3. Temporal Data Modeling: Imputation and forecasting of time series data have seen enhanced accuracy with diffusion-based approaches.
  4. Multi-Modal Learning: Applications such as text-to-image and text-to-video generation leverage the flexibility of diffusion models for creating complex, conditionally generated content.
  5. Robust Learning: Diffusion models support robust learning algorithms, for example by purifying adversarially perturbed inputs before they reach a classifier.
  6. Interdisciplinary Applications: In fields such as computational chemistry and medical imaging, diffusion models facilitate tasks like molecule design and image reconstruction with high fidelity.

Future Directions

The paper concludes by outlining potential research directions, including revisiting and analyzing common assumptions of diffusion models, deepening theoretical understanding, and exploiting latent representations more effectively. Diffusion-based foundation models and their applications in Artificial Intelligence Generated Content (AIGC) are likewise highlighted as promising areas for future exploration.

In summary, diffusion models are a dynamic and rapidly evolving area of deep generative modeling, promising high-quality, diverse, and controllable data generation across many domains. The surveyed methods and applications give a comprehensive picture of current advances and of promising directions for future research.

Definition Search Book Streamline Icon: https://streamlinehq.com
Authors (9)
  1. Ling Yang
  2. Zhilong Zhang
  3. Yang Song
  4. Shenda Hong
  5. Runsheng Xu
  6. Yue Zhao
  7. Wentao Zhang
  8. Bin Cui
  9. Ming-Hsuan Yang
Citations (1,020)