LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models (2404.11098v4)

Published 17 Apr 2024 in cs.CV

Abstract: In the era of AIGC, demand has emerged for low-budget and even on-device applications of diffusion models. Several approaches have been proposed for compressing the Stable Diffusion models (SDMs); most rely on handcrafted layer removal to obtain smaller U-Nets, combined with knowledge distillation to recover network performance. However, such handcrafted layer removal is inefficient and lacks scalability and generalization, and the feature distillation employed during retraining suffers from an imbalance issue in which a few numerically large feature loss terms dominate the others throughout the retraining process. To this end, we propose LAPTOP-Diff, a layer pruning and normalized distillation method for compressing diffusion models. We 1) introduce a layer pruning method that compresses the SDM U-Net automatically, with an effective one-shot pruning criterion whose one-shot performance is guaranteed by its good additivity property, surpassing other layer pruning and handcrafted layer removal methods, and 2) propose normalized feature distillation for retraining, which alleviates the imbalance issue. Using the proposed LAPTOP-Diff, we compressed the U-Nets of SDXL and SDM-v1.5 to achieve state-of-the-art performance, with a minimal 4.0% decline in PickScore at a pruning ratio of 50%, whereas the smallest PickScore decline among comparative methods is 8.2%.
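The two ideas sketched in the abstract can be illustrated informally. First, a hypothetical illustration of how an additive one-shot pruning criterion could drive layer selection: each removable layer is scored once (for example, by the output distortion its removal causes), and approximate additivity allows candidate sets of layers to be ranked by the sum of their individual scores instead of re-evaluating every combination. The greedy selection rule and the distortion-per-parameter ordering below are assumptions for illustration, not the paper's exact criterion.

```python
# Hypothetical sketch (not the paper's exact criterion): one-shot, additive
# layer-pruning selection. Each layer is scored once; additivity is assumed,
# so candidate removal sets can be ranked by summed per-layer scores.
def greedy_layer_selection(layer_scores, layer_params, target_ratio):
    """layer_scores[i]: distortion estimated once for removing layer i alone.
    layer_params[i]:   parameter count of layer i.
    target_ratio:      fraction of total parameters to prune (e.g., 0.5)."""
    total_params = sum(layer_params)
    # Prefer layers with the least distortion per pruned parameter (assumed rule).
    order = sorted(range(len(layer_scores)),
                   key=lambda i: layer_scores[i] / max(layer_params[i], 1))
    removed, pruned = [], 0
    for i in order:
        if pruned >= target_ratio * total_params:
            break
        removed.append(i)
        pruned += layer_params[i]
    return removed
```

Second, a minimal sketch of normalized feature distillation, assuming the normalization divides each per-layer feature loss by the magnitude of the corresponding teacher feature map so that no single numerically large term dominates the total objective; the paper defines its own normalization scheme, which may differ.

```python
# Minimal sketch of normalized feature distillation (assumed normalization:
# divide each per-layer feature loss by the teacher feature's mean squared
# magnitude so every layer contributes on a comparable scale).
import torch.nn.functional as F

def normalized_feature_distillation_loss(student_feats, teacher_feats, eps=1e-8):
    """student_feats / teacher_feats: matching lists of feature tensors taken
    from the pruned (student) and original (teacher) U-Nets."""
    total = 0.0
    for fs, ft in zip(student_feats, teacher_feats):
        layer_loss = F.mse_loss(fs, ft)          # raw per-layer feature loss
        scale = ft.pow(2).mean().detach() + eps  # normalizer (assumption)
        total = total + layer_loss / scale
    return total / max(len(student_feats), 1)
```

In an actual retraining loop, such a loss would be driven by features captured from corresponding U-Net blocks (e.g., via forward hooks) and combined with the task and output-level distillation losses.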
