
The Missing U for Efficient Diffusion Models (2310.20092v4)

Published 31 Oct 2023 in cs.LG and cs.CV

Abstract: Diffusion Probabilistic Models stand as a critical tool in generative modelling, enabling the generation of complex data distributions. This family of generative models yields record-breaking performance in tasks such as image synthesis, video generation, and molecule design. Despite their capabilities, their efficiency, especially in the reverse process, remains a challenge due to slow convergence rates and high computational costs. In this paper, we introduce an approach that leverages continuous dynamical systems to design a novel denoising network for diffusion models that is more parameter-efficient, exhibits faster convergence, and demonstrates increased noise robustness. Experimenting with Denoising Diffusion Probabilistic Models (DDPMs), our framework operates with approximately a quarter of the parameters and ~30% of the Floating Point Operations (FLOPs) compared to standard U-Nets in DDPMs. Furthermore, our model is notably faster in inference than the baseline when measured in fair and equal conditions. We also provide a mathematical intuition as to why our proposed reverse process is faster as well as a mathematical discussion of the empirical tradeoffs in the denoising downstream task. Finally, we argue that our method is compatible with existing performance enhancement techniques, enabling further improvements in efficiency, quality, and speed.
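The "continuous dynamical systems" framing suggests a neural-ODE-style denoiser: instead of stacking many discrete convolutional layers, features evolve under a learned vector field dz/dt = f(z, t), so depth is traded for solver steps (and parameters for function re-evaluations). The sketch below is a minimal illustration of that general idea only, not the paper's actual architecture; the module names, the time-as-channel conditioning, and the fixed-step Euler solver are all assumptions of this example.

```python
# Hypothetical sketch of a continuous (neural-ODE-style) convolutional block,
# illustrating the general technique the abstract alludes to. This is NOT the
# paper's architecture: the design choices here (time injected as an extra
# channel, fixed-step Euler integration over t in [0, 1]) are assumptions.
import torch
import torch.nn as nn


class ODEFunc(nn.Module):
    """Learned vector field f(z, t), parameterised by two convolutions."""

    def __init__(self, channels: int):
        super().__init__()
        # +1 input channel to carry the scalar time t at every position.
        self.conv1 = nn.Conv2d(channels + 1, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels + 1, channels, 3, padding=1)
        self.act = nn.SiLU()

    def forward(self, z: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Broadcast the scalar time to a (B, 1, H, W) channel so the
        # dynamics can depend on t (a common neural-ODE conditioning trick).
        tt = t.expand(z.size(0), 1, z.size(2), z.size(3))
        h = self.act(self.conv1(torch.cat([z, tt], dim=1)))
        return self.conv2(torch.cat([h, tt], dim=1))


class ContinuousBlock(nn.Module):
    """Integrates dz/dt = f(z, t) from t=0 to t=1 with fixed-step Euler.

    One ODEFunc's parameters are reused at every solver step, which is
    where the parameter savings over a deep discrete stack come from.
    """

    def __init__(self, channels: int, steps: int = 8):
        super().__init__()
        self.func = ODEFunc(channels)
        self.steps = steps

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        dt = 1.0 / self.steps
        for i in range(self.steps):
            t = torch.full((1, 1, 1, 1), i * dt, device=z.device)
            z = z + dt * self.func(z, t)  # explicit Euler update
        return z


if __name__ == "__main__":
    block = ContinuousBlock(channels=32)
    x = torch.randn(4, 32, 16, 16)   # a batch of feature maps
    print(block(x).shape)            # torch.Size([4, 32, 16, 16])
```

In a full continuous U-Net, blocks like this would replace the discrete residual stacks at each resolution of the DDPM denoiser; an adaptive solver (e.g. via the torchdiffeq library) could substitute for the fixed-step loop shown here.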
