Unifying Generation and Compression: Ultra-low bitrate Image Coding Via Multi-stage Transformer (2403.03736v1)

Published 6 Mar 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Recent progress in generative compression technology has significantly improved the perceptual quality of compressed data. However, these advancements primarily focus on producing high-frequency details, often overlooking the ability of generative models to capture the prior distribution of image content, thus impeding further bitrate reduction in extreme compression scenarios (<0.05 bpp). Motivated by the capabilities of predictive LLMs for lossless compression, this paper introduces a novel Unified Image Generation-Compression (UIGC) paradigm, merging the processes of generation and compression. A key feature of the UIGC framework is the adoption of vector-quantized (VQ) image models for tokenization, alongside a multi-stage transformer designed to exploit spatial contextual information for modeling the prior distribution. As such, the dual-purpose framework effectively utilizes the learned prior for entropy estimation and assists in the regeneration of lost tokens. Extensive experiments demonstrate the superiority of the proposed UIGC framework over existing codecs in perceptual quality and human perception, particularly in ultra-low bitrate scenarios (<=0.03 bpp), pioneering a new direction in generative compression.
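As a rough illustration of the entropy-estimation idea in the abstract, the sketch below shows how a learned prior over VQ tokens translates into a bitrate: an ideal entropy coder spends about -log2 p bits on a token to which the prior assigns probability p. This is a minimal sketch, not the paper's implementation. The function names, the 256x256 image size, the 16x16 token grid, the 1024-entry codebook, and the probability values are all assumptions chosen for the arithmetic.

```python
import math

def estimated_bits(token_probs):
    """Ideal code length in bits: sum of -log2 p over the coded tokens.

    token_probs holds the probability the prior assigned to each token
    that actually occurred (hypothetical values, not from the paper).
    """
    return sum(-math.log2(p) for p in token_probs)

def bits_per_pixel(token_probs, height, width):
    """Convert the estimated code length into bits per pixel (bpp)."""
    return estimated_bits(token_probs) / (height * width)

# Assumed setup: a 256x256 image tokenized into a 16x16 grid (256 tokens).
num_tokens = 16 * 16

# Baseline: no learned prior, uniform over an assumed 1024-entry codebook,
# so every token costs log2(1024) = 10 bits.
uniform_probs = [1 / 1024] * num_tokens

# With a context model: assume the prior assigns probability 0.25 to each
# token that actually occurs, so every token costs 2 bits.
learned_probs = [0.25] * num_tokens

uniform_bpp = bits_per_pixel(uniform_probs, 256, 256)
learned_bpp = bits_per_pixel(learned_probs, 256, 256)
print(f"uniform prior: {uniform_bpp:.7f} bpp")
print(f"learned prior: {learned_bpp:.7f} bpp")
```

Under these assumed numbers the uniform baseline costs 2560 bits (about 0.039 bpp) and the sharper prior costs 512 bits (about 0.0078 bpp), which shows why a better prior over tokens lowers the bitrate directly.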

Authors (5)

Naifu Xue
Qi Mao
Zijian Wang
Yuan Zhang
Siwei Ma
