- The paper presents PassionSR, which simplifies the one-step diffusion model by removing redundant branches to reduce parameters by 27.13% and operations by 6.25%.
- The paper introduces learnable quantization techniques (LBQ and LET) that optimize boundary parameters and activation scales, effectively narrowing the performance gap of quantized models.
- The paper employs a distributed quantization calibration strategy to stabilize training, achieving near full-precision performance across 6-bit and 8-bit settings with competitive metric scores.
Analysis of PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion-Based Image Super-Resolution
The paper "PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion Based Image Super-Resolution" introduces a novel approach to address the implications of computational costs and storage demands in diffusion-based image super-resolution models. The authors present PassionSR, a methodology focusing on the one-step diffusion (OSD) model, which is defined as a more efficient alternative to conventional multi-step diffusion models due to its reduced inference latency.
Key Contributions
The authors highlight three main components introduced in PassionSR:
- Model Simplification: The researchers streamline the original OSD model by removing the DAPE and CLIPEncoder branches, pruning the model down to its two primary components: the UNet and the VAE. This simplification aims to preserve the model's performance while significantly shrinking it, cutting parameters by 27.13% and operations by 6.25% (a pipeline sketch follows this list).
- Learnable Quantization Techniques: The main innovation lies in the Learnable Boundary Quantizer (LBQ) and the Learnable Equivalent Transformation (LET). LBQ narrows the performance gap traditionally observed in quantized models by making the quantizer's clipping boundaries trainable. LET reshapes activation distributions via learnable scale parameters applied as a mathematically equivalent rescaling, so it improves quantizability without incurring additional inference-time overhead (see the quantizer sketch after this list).
- Distributed Quantization Calibration Strategy: PassionSR employs Distributed Quantization Calibration (DQC), which stabilizes the training of the quantization parameters and accelerates convergence. DQC splits the calibration of those parameters into stages, avoiding the complexity and instability that can arise from adjusting all of them simultaneously (a staged-calibration sketch is also given below).
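To make the simplification concrete, here is a minimal sketch of the pruned pipeline. The submodule names are placeholders rather than the paper's code; the point is only that, once the DAPE and CLIPEncoder branches are gone, a forward pass reduces to VAE-encode, a single UNet step, and VAE-decode.

```python
import torch.nn as nn

class SimplifiedOSD(nn.Module):
    """Pruned one-step SR pipeline (illustrative placeholders, not the
    paper's code): with DAPE and CLIPEncoder removed, a forward pass is
    VAE-encode -> one UNet denoising step -> VAE-decode."""

    def __init__(self, vae_encoder: nn.Module, unet: nn.Module,
                 vae_decoder: nn.Module):
        super().__init__()
        self.vae_encoder = vae_encoder
        self.unet = unet
        self.vae_decoder = vae_decoder

    def forward(self, lr_image):
        z = self.vae_encoder(lr_image)  # low-res image -> latent
        z = self.unet(z)                # single denoising step in latent space
        return self.vae_decoder(z)      # latent -> super-resolved image
```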
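The sketch below illustrates the two quantization ideas under stated assumptions: a uniform fake-quantizer whose clipping boundaries are trainable parameters (the LBQ idea), and a linear layer whose input is divided by a learnable channel-wise scale that is folded back into the weight so the full-precision output is unchanged (the LET idea). All names, initializations, and shapes are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def round_ste(x: torch.Tensor) -> torch.Tensor:
    """Round to the nearest integer, passing gradients straight through."""
    return (x.round() - x).detach() + x

class LearnableBoundaryQuantizer(nn.Module):
    """Uniform asymmetric fake-quantizer with trainable clipping boundaries,
    a minimal sketch of the LBQ idea (init values are placeholders that
    would normally come from calibration statistics)."""
    def __init__(self, n_bits: int = 8, lo: float = -1.0, hi: float = 1.0):
        super().__init__()
        self.n_levels = 2 ** n_bits - 1
        # Learnable boundaries let calibration fit the range to the data.
        self.lo = nn.Parameter(torch.tensor(lo))
        self.hi = nn.Parameter(torch.tensor(hi))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = (self.hi - self.lo) / self.n_levels
        zero = round_ste(-self.lo / scale)
        q = torch.clamp(round_ste(x / scale) + zero, 0, self.n_levels)
        return (q - zero) * scale  # dequantized ("fake-quantized") output

class EquivalentlyScaledLinear(nn.Module):
    """LET-style equivalent transformation around a linear layer: activations
    are divided by a channel-wise scale s and the weight is multiplied by s,
    so the full-precision output is unchanged while the activation range is
    flattened for the quantizer. Sketch only, not the paper's code."""
    def __init__(self, linear: nn.Linear, n_bits: int = 8):
        super().__init__()
        self.linear = linear
        self.s = nn.Parameter(torch.ones(linear.in_features))
        self.act_quant = LearnableBoundaryQuantizer(n_bits)
        self.w_quant = LearnableBoundaryQuantizer(n_bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_hat = self.act_quant(x / self.s)                  # flattened activation
        w_hat = self.w_quant(self.linear.weight * self.s)   # scale folded into W
        return F.linear(x_hat, w_hat, self.linear.bias)
```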
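Finally, a hedged sketch of staged calibration in the spirit of DQC: each stage unfreezes one group of quantization parameters (for instance, the LET scales first, then the LBQ boundaries) and tunes it to match the full-precision model's outputs. The grouping, loss, and schedule here are assumptions for illustration, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def calibrate_in_stages(q_model, fp_model, calib_batches, param_groups,
                        lr: float = 1e-3, steps_per_stage: int = 200):
    """Tune quantization parameters one group at a time (one stage per
    group), minimizing output error against the full-precision teacher."""
    fp_model.eval()
    for group in param_groups:                   # one calibration stage per group
        for p in q_model.parameters():
            p.requires_grad_(False)              # freeze everything...
        for p in group:
            p.requires_grad_(True)               # ...except this stage's params
        opt = torch.optim.Adam(group, lr=lr)
        for step in range(steps_per_stage):
            lr_img = calib_batches[step % len(calib_batches)]
            with torch.no_grad():
                target = fp_model(lr_img)        # full-precision reference
            loss = F.mse_loss(q_model(lr_img), target)
            opt.zero_grad()
            loss.backward()
            opt.step()
```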
Experimental Evaluation
The evaluations show that PassionSR, under both 6-bit and 8-bit precision settings, achieves performance close to that of the full-precision model. Comprehensive experiments on multiple datasets yield competitive PSNR, SSIM, and LPIPS scores, demonstrating an effective balance between computational efficiency and perceptual quality.
Specifically, the 8-bit setting of PassionSR reduces parameter storage by 81.77% and operations by 76.56%. Even under the more challenging 6-bit setting, the method maintains strong performance, indicating robustness across quantization levels.
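As a back-of-the-envelope check, the reported totals are consistent with compounding the simplification step's reductions with the 4x shrinkage of going from FP32 to 8-bit, under the assumption that storage and operation cost scale with the bit-width ratio:

```python
# Assumes an FP32 baseline and that 8-bit costs 8/32 = 1/4 of full precision.
prune_params, prune_ops = 0.2713, 0.0625   # reductions from model simplification
keep = 8 / 32                              # fraction kept after 8-bit quantization
print(f"params: {1 - (1 - prune_params) * keep:.2%}")  # -> 81.78%, ~ reported 81.77%
print(f"ops:    {1 - (1 - prune_ops) * keep:.2%}")     # -> 76.56%, as reported
```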
Implications and Future Directions
The PassionSR paper has substantial implications for deploying diffusion-based image super-resolution models in resource-constrained environments, such as mobile devices. By significantly reducing computational and storage requirements while maintaining model performance, PassionSR can enable broader and more practical applications of advanced SR models.
The research opens opportunities for further exploration of post-training quantization techniques, particularly in the context of one-step diffusion models. Future work may investigate integrating PassionSR with other model compression techniques, such as pruning or structured sparsity, to further improve computational efficiency, or combining it with other emerging efficiency methods to push real-time image processing applications further.
In summary, PassionSR contributes a viable solution to one-step diffusion-based SR model quantization by addressing traditional limitations associated with model deployment in constrained settings. The blend of model simplification, innovative quantization techniques, and effective calibration strategies underscores its potential as a new standard for efficient diffusion-based image processing.