Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 169 tok/s
Gemini 2.5 Pro 53 tok/s Pro
GPT-5 Medium 31 tok/s Pro
GPT-5 High 38 tok/s Pro
GPT-4o 104 tok/s Pro
Kimi K2 191 tok/s Pro
GPT OSS 120B 433 tok/s Pro
Claude Sonnet 4.5 37 tok/s Pro
2000 character limit reached

Efficiency Meets Fidelity: A Novel Quantization Framework for Stable Diffusion (2412.06661v2)

Published 9 Dec 2024 in cs.CV

Abstract: Text-to-image generation via Stable Diffusion models (SDM) have demonstrated remarkable capabilities. However, their computational intensity, particularly in the iterative denoising process, hinders real-time deployment in latency-sensitive applications. While Recent studies have explored post-training quantization (PTQ) and quantization-aware training (QAT) methods to compress Diffusion models, existing methods often overlook the consistency between results generated by quantized models and those from floating-point models. This consistency is paramount for professional applications where both efficiency and output reliability are essential. To ensure that quantized SDM generates high-quality and consistent images, we propose an efficient quantization framework for SDM. Our framework introduces a Serial-to-Parallel pipeline that simultaneously maintains training-inference consistency and ensures optimization stability. Building upon this foundation, we further develop several techniques including multi-timestep activation quantization, time information precalculation, inter-layer distillation, and selective freezing, to achieve high-fidelity generation in comparison to floating-point models while maintaining quantization efficiency. Through comprehensive evaluation across multiple Stable Diffusion variants (v1-4, v2-1, XL 1.0, and v3), our method demonstrates superior performance over state-of-the-art approaches with shorter training times. Under W4A8 quantization settings, we achieve significant improvements in both distribution similarity and visual fidelity, while preserving a high image quality.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.