
Temporal Dynamic Quantization for Diffusion Models (2306.02316v2)

Published 4 Jun 2023 in cs.CV

Abstract: Diffusion models have gained popularity in vision applications due to their remarkable generative performance and versatility. However, their high storage and computation demands, arising from large model size and iterative generation, hinder use on mobile devices. Existing quantization techniques struggle to maintain performance even at 8-bit precision because of a property unique to diffusion models: temporal variation in activations. We introduce a novel quantization method that dynamically adjusts the quantization interval based on time-step information, significantly improving output quality. Unlike conventional dynamic quantization techniques, our approach incurs no computational overhead during inference and is compatible with both post-training quantization (PTQ) and quantization-aware training (QAT). Extensive experiments demonstrate substantial improvements in the output quality of quantized diffusion models across various datasets.
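
The key idea lends itself to a short illustration. Below is a minimal PyTorch sketch of a per-timestep dynamic activation quantizer, assuming (consistent with the abstract) that the quantization interval is generated from the timestep alone: a small MLP over a sinusoidal timestep embedding produces the interval, and since the interval depends only on t, it can be precomputed into a lookup table so inference incurs no extra cost. All names here (`TDQQuantizer`, `timestep_embedding`, the MLP architecture) are illustrative assumptions, not the paper's exact implementation.

```python
import math
import torch
import torch.nn as nn


def timestep_embedding(t: torch.Tensor, dim: int = 64) -> torch.Tensor:
    """Standard sinusoidal encoding of the diffusion timestep."""
    half = dim // 2
    freqs = torch.exp(-math.log(10000.0) * torch.arange(half).float() / half)
    args = t.float()[:, None] * freqs[None, :]
    return torch.cat([torch.cos(args), torch.sin(args)], dim=-1)


def ste_round(x: torch.Tensor) -> torch.Tensor:
    """Round with a straight-through estimator so gradients pass unchanged."""
    return (torch.round(x) - x).detach() + x


class TDQQuantizer(nn.Module):
    """Hypothetical sketch of a per-timestep activation quantizer.

    A small MLP maps the timestep embedding to a positive quantization
    interval (scale). Because the interval depends only on t, it can be
    tabulated for every diffusion step after training, adding no runtime
    overhead at inference.
    """

    def __init__(self, n_bits: int = 8, emb_dim: int = 64):
        super().__init__()
        self.qmax = 2 ** (n_bits - 1) - 1
        self.emb_dim = emb_dim
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Softplus(),  # keep the interval strictly positive
        )

    def interval(self, t: torch.Tensor) -> torch.Tensor:
        return self.mlp(timestep_embedding(t, self.emb_dim))  # shape (B, 1)

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        s = self.interval(t).view(-1, *([1] * (x.dim() - 1)))  # broadcastable
        q = torch.clamp(ste_round(x / s), -self.qmax - 1, self.qmax)
        return q * s  # fake-quantized activations; the MLP trains via the STE


# After PTQ calibration or QAT, freeze one interval per diffusion step so
# the MLP never runs at inference time.
quant = TDQQuantizer()
x = torch.randn(4, 8, 16, 16)      # dummy activations
t = torch.randint(0, 1000, (4,))   # one timestep per sample
y = quant(x, t)
with torch.no_grad():
    scale_table = quant.interval(torch.arange(1000))  # (1000, 1) lookup table
```

Because the interval MLP is differentiable, this formulation works under QAT (train end to end) and under PTQ (fit only the MLP on calibration data), matching the compatibility claim in the abstract.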

Authors (5)
  1. Junhyuk So (6 papers)
  2. Jungwon Lee (53 papers)
  3. Daehyun Ahn (4 papers)
  4. Hyungjun Kim (18 papers)
  5. Eunhyeok Park (28 papers)
Citations (32)