Semi-Implicit Denoising Diffusion Models (SIDDMs) (2306.12511v3)

Published 21 Jun 2023 in cs.LG and cs.CV

Abstract: Despite the proliferation of generative models, achieving fast sampling during inference without compromising sample diversity and quality remains challenging. Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps. The Denoising Diffusion Generative Adversarial Network (DDGAN) attempted to circumvent this limitation by integrating a GAN model for larger jumps in the diffusion process. However, DDGAN encountered scalability limitations when applied to large datasets. To address these limitations, we introduce a novel approach that tackles the problem by matching implicit and explicit factors. More specifically, we use an implicit model to match the marginal distributions of noisy data, together with the explicit conditional distribution of the forward diffusion. This combination allows us to effectively match the joint denoising distributions. Unlike DDPM but similar to DDGAN, we do not enforce a parametric distribution for the reverse step, enabling us to take large steps during inference. Similar to DDPM but unlike DDGAN, we take advantage of the exact form of the diffusion process. We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.

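To make the abstract's matching argument concrete, the sketch below writes out the factorization it relies on, in standard DDPM notation (the symbols x_t, \beta_t, and p_\theta are assumed for illustration and are not taken from the paper's own equations):

% Explicit forward diffusion kernel (standard DDPM form):
q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\, x_{t-1},\ \beta_t I\right)

% Both joints over a noisy pair (x_{t-1}, x_t) factor into a marginal and a conditional:
q(x_{t-1}, x_t) = q(x_{t-1})\, q(x_t \mid x_{t-1})
p_\theta(x_{t-1}, x_t) = p_\theta(x_{t-1})\, p_\theta(x_t \mid x_{t-1})

% So matching the marginal implicitly (adversarially),
%     p_\theta(x_{t-1}) \approx q(x_{t-1}),
% and matching the conditional explicitly against the known Gaussian kernel,
%     p_\theta(x_t \mid x_{t-1}) \approx q(x_t \mid x_{t-1}),
% together matches the joint denoising distribution, without imposing a
% parametric form on the reverse step p_\theta(x_{t-1} \mid x_t).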
