
SparseDM: Toward Sparse Efficient Diffusion Models (2404.10445v4)

Published 16 Apr 2024 in cs.LG and cs.AI

Abstract: Diffusion models represent a powerful family of generative models widely used for image and video generation. However, their time-consuming deployment, long inference time, and large memory requirements hinder their application on resource-constrained devices. In this paper, we propose a method based on an improved Straight-Through Estimator to improve the deployment efficiency of diffusion models. Specifically, we add sparse masks to the Convolution and Linear layers of a pre-trained diffusion model, transfer-learn the sparse model during the fine-tuning stage, and turn on the sparse masks during inference. Experimental results on Transformer- and UNet-based diffusion models demonstrate that our method reduces MACs by 50% while maintaining FID. Sparse models are accelerated by approximately 1.2x on the GPU. Under other MACs conditions, the FID is also lower than 1 compared to other methods.
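
The mask-plus-STE mechanism described in the abstract can be illustrated with a short sketch. The snippet below is an illustrative assumption of how an N:M sparse mask might be applied to a Linear layer with a Straight-Through Estimator in PyTorch; the class and function names are hypothetical and do not come from the paper's code.

```python
# Minimal sketch (assumed, not the authors' implementation): an N:M sparse mask
# applied to a Linear layer's weights in the forward pass, with a Straight-Through
# Estimator so that gradients still reach the dense (unmasked) weights.
import torch
import torch.nn as nn


def nm_mask(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-magnitude weights in every group of m along the input dim."""
    out_f, in_f = weight.shape
    groups = weight.abs().reshape(out_f, in_f // m, m)
    topk = groups.topk(n, dim=-1).indices          # indices of the kept weights
    mask = torch.zeros_like(groups)
    mask.scatter_(-1, topk, 1.0)                   # 1 for kept, 0 for pruned
    return mask.reshape(out_f, in_f)


class SparseSTELinear(nn.Module):
    """Hypothetical Linear layer: forward uses the masked (sparse) weights,
    backward treats the mask as identity (straight-through)."""

    def __init__(self, in_features: int, out_features: int, n: int = 2, m: int = 4):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.n, self.m = n, m

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.linear.weight
        mask = nm_mask(w, self.n, self.m).detach()
        # STE trick: value equals w * mask, but the gradient w.r.t. w is identity.
        w_sparse = w + (w * mask - w).detach()
        return nn.functional.linear(x, w_sparse, self.linear.bias)


if __name__ == "__main__":
    layer = SparseSTELinear(8, 4)
    out = layer(torch.randn(2, 8)).sum()
    out.backward()
    # Dense weights receive gradients everywhere, so pruned entries can re-enter the mask.
    print(layer.linear.weight.grad.count_nonzero())
```

Because the backward pass bypasses the mask, pruned weights keep receiving gradients during fine-tuning and can re-enter the mask as their magnitudes change; this is the usual motivation for STE-style sparse training and matches the abstract's description of fine-tuning with the masks applied before inference.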
