Not All Steps are Equal: Efficient Generation with Progressive Diffusion Models (2312.13307v2)

Published 20 Dec 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Diffusion models have demonstrated remarkable efficacy in various generative tasks through the predictive prowess of their denoising models. Currently, these models employ a uniform denoising approach across all timesteps. However, the inherent variations in noisy latents at each timestep lead to conflicts during training, constraining the potential of diffusion models. To address this challenge, we propose a novel two-stage training strategy termed Step-Adaptive Training. In the initial stage, a base denoising model is trained to cover all timesteps. Subsequently, we partition the timesteps into distinct groups and fine-tune the model within each group to achieve specialized denoising capabilities. Recognizing that the difficulty of noise prediction varies across timesteps, we also allow the model size to vary: the size used for each timestep is adjusted by estimating task difficulty from its signal-to-noise ratio before fine-tuning. This adjustment is facilitated by a proxy-based structural importance assessment mechanism, enabling precise and efficient pruning of the base denoising model. Our experiments validate the effectiveness of the proposed training strategy, demonstrating an improvement of more than 0.3 in FID on CIFAR-10 while using only 80% of the computational resources. This approach not only enhances model performance but also significantly reduces computational cost, opening new avenues for the development and application of diffusion models.
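
The abstract describes the method only at a high level; the sketch below shows one way the two stages could be wired together. Everything in it is an illustrative assumption rather than the authors' implementation: the toy Denoiser network, the contiguous timestep grouping, the difficulty-to-size mapping, and the use of PyTorch's built-in magnitude pruning as a stand-in for the paper's proxy-based structural importance assessment.

```python
# Illustrative sketch of Step-Adaptive Training as described in the abstract.
# The model, grouping rule, and pruning criterion are all assumptions; the
# paper's proxy-based structural-importance pruning is replaced here by
# PyTorch's built-in L1 magnitude pruning as a stand-in.
import copy

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


class Denoiser(nn.Module):
    """Toy epsilon-prediction network standing in for the base denoising model."""

    def __init__(self, dim: int = 64, width: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, width), nn.SiLU(), nn.Linear(width, dim)
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on the (normalized) timestep and predict the added noise.
        t_feat = t.float().unsqueeze(-1) / 1000.0
        return self.net(torch.cat([x_t, t_feat], dim=-1))


def mean_snr(timesteps: torch.Tensor, alphas_cumprod: torch.Tensor) -> float:
    """Average signal-to-noise ratio of the noisy latents in a timestep group."""
    a = alphas_cumprod[timesteps]
    return (a / (1.0 - a)).mean().item()


# Stage 1: train one base denoiser over all timesteps (training loop omitted).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)
base_model = Denoiser()

# Stage 2: partition the timesteps into groups (contiguous chunks are a
# simplifying assumption; SNR is monotone in t, so each chunk spans similar
# noise levels), prune a copy of the base model per group, then fine-tune each
# copy only on its own timesteps.
num_groups = 4
groups = list(torch.chunk(torch.arange(T), num_groups))

specialists = []
for timesteps in groups:
    snr = mean_snr(timesteps, alphas_cumprod)
    # Illustrative difficulty-to-size mapping: groups whose task is assumed
    # easier are pruned more aggressively. The exact mapping is not specified
    # in the abstract.
    prune_amount = min(0.5, 0.1 * (1.0 + torch.log1p(torch.tensor(snr)).item()))
    specialist = copy.deepcopy(base_model)
    for module in specialist.modules():
        if isinstance(module, nn.Linear):
            prune.l1_unstructured(module, name="weight", amount=prune_amount)
            prune.remove(module, "weight")  # bake the sparsity mask into the weights
    specialists.append((timesteps, specialist))
    # ...fine-tune `specialist` on noise levels sampled only from `timesteps`...
```

At sampling time, the idea would be to route each denoising step to the specialist whose timestep group contains it, so the cheaper pruned models handle the steps estimated to be easier.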
