Efficient Deep Image Compression: A Unified Framework
The paper in focus introduces a framework for deep image compression, termed Efficient Deep Image Compression (EDIC). Recognizing that current state-of-the-art learned methods come at a high computational cost, the authors propose a model that integrates several complementary techniques aimed at improving compression performance while also improving computational efficiency. The work addresses pressing issues of speed and accuracy in image compression and lays the groundwork for further applications in video compression.
Theoretical Contributions
The EDIC framework is constructed upon three core components: a channel attention module, a Gaussian mixture model (GMM), and a decoder-side enhancement module. These components combine to form a comprehensive and optimized pipeline for image compression.
- Channel Attention Module: The framework incorporates a channel attention mechanism, building on the observation that neural networks can exploit inter-channel dependencies to refine latent representations. By reweighting feature channels according to their estimated importance, the module concentrates representational capacity on the most informative parts of the latent code, improving compression efficiency through a better internal representation of the data at modest computational cost (see the channel-attention sketch after this list).
- Gaussian Mixture Model for Entropy Estimation: Instead of the conventional single-Gaussian entropy model, the authors adopt a Gaussian mixture model (GMM) to characterize the distribution of the quantized latent representation more faithfully, particularly in regions with high spatial variability. Because a mixture can capture complex, multi-modal distributions, the framework obtains a more precise probability estimate for each symbol, and hence a tighter bit allocation, without drastically increasing model complexity (see the rate-estimation sketch after this list).
- Decoder-side Enhancement Module: To address compression artifacts, the decoder-side enhancement module refines the reconstructed image and recovers quality lost to quantization. Built from residual blocks, it predicts the high-frequency detail discarded during compression and adds it back to the reconstruction, yielding a clear improvement in final image quality (see the enhancement sketch after this list).
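The channel attention idea can be illustrated with a squeeze-and-excitation style block applied to the latent feature map. The sketch below is not the paper's exact architecture; the class name, reduction ratio, and layer sizes are illustrative assumptions, intended only to make the channel-reweighting mechanism concrete.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention over a latent feature map
    (a minimal sketch; sizes are illustrative, not the paper's exact design)."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: latent feature map of shape (N, C, H, W)
        n, c, _, _ = y.shape
        # Squeeze: global average pooling summarizes each channel
        w = y.mean(dim=(2, 3))               # (N, C)
        # Excitation: learn per-channel importance weights in [0, 1]
        w = self.fc(w).view(n, c, 1, 1)      # (N, C, 1, 1)
        # Recalibrate the latent representation channel by channel
        return y * w
```

In a compression network, a block like this would typically sit inside the encoder and decoder transforms, letting the model emphasize the channels that matter most for reconstruction.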
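The role of the Gaussian mixture entropy model can be sketched as follows: each quantized latent symbol is assigned the probability mass of its unit-width quantization bin under a K-component mixture, and the estimated bitrate is the negative base-2 log-likelihood. The function name, tensor shapes, and parameterization below are assumptions for illustration, not the paper's exact formulation.

```python
import torch
from torch.distributions import Normal

def gmm_rate(y_hat, weights, means, scales, eps=1e-9):
    """Estimated bits for quantized latents under a K-component Gaussian
    mixture entropy model (a sketch under assumed shapes).

    y_hat:   (N, C, H, W) quantized latents
    weights: (N, K, C, H, W) mixture weights, assumed to sum to 1 over K
    means:   (N, K, C, H, W) per-component means
    scales:  (N, K, C, H, W) per-component standard deviations (> 0)
    """
    y = y_hat.unsqueeze(1)                           # broadcast over the K components
    comp = Normal(means, scales)
    # Probability mass of the unit-width bin centered on each quantized symbol
    p_bins = comp.cdf(y + 0.5) - comp.cdf(y - 0.5)   # (N, K, C, H, W)
    p = (weights * p_bins).sum(dim=1).clamp_min(eps)
    # Estimated rate: negative log-likelihood in bits, summed over the batch
    return (-torch.log2(p)).sum()
```

During training, such a rate estimate would be combined with a distortion term (e.g., MSE) in a rate-distortion loss; the conventional single-Gaussian model is the special case K = 1.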
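Finally, the decoder-side enhancement module can be viewed as a lightweight post-processing network with a global skip connection: a few residual blocks predict a correction that is added to the initial reconstruction. The class names, channel width, and block count below are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        # Local skip connection inside each residual block
        return x + self.body(x)

class DecoderSideEnhancement(nn.Module):
    """Post-processing on the decoder output: residual blocks predict the
    high-frequency detail lost to quantization (a sketch; sizes illustrative)."""

    def __init__(self, channels: int = 64, num_blocks: int = 3):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x_hat: torch.Tensor) -> torch.Tensor:
        # x_hat: initial reconstruction from the decoder, shape (N, 3, H, W)
        residual = self.tail(self.blocks(self.head(x_hat)))
        # Global skip connection: the network only models the correction
        return x_hat + residual
```

Because the module only has to learn a residual correction, it can stay small, adding little decoding cost while visibly reducing compression artifacts.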
Practical Implications
One of the most notable results reported in the paper is the large increase in decoding speed: EDIC decodes up to 150 times faster than the method of Minnen et al. while yielding comparable image quality. These findings suggest that EDIC could be integrated into real-time applications where bandwidth conservation and low latency are crucial, such as streaming platforms and mobile devices.
Moreover, the authors demonstrate the versatility of EDIC by adapting it for deep video compression, incorporating its techniques into the DVC framework. This adaptation highlights the potential for EDIC to advance video compression systems, promising improvements in storage and transmission efficiency across diverse multimedia contexts.
Future Directions
The introduction of EDIC marks an important step toward more efficient and accurate learning-based image compression. The research provides a basis for building more scalable compression systems and motivates further exploration of hybrid models that combine traditional codecs with deep learning methods. Integrating the proposed modules into existing frameworks could catalyze advances in both image and video compression. Looking ahead, future work will likely focus on refining these models, improving their adaptability across variable bitrates, and extending them to complex, dynamic image sequences such as real-time video streams.
In summary, this paper offers significant advancements in deep image compression, laying the groundwork for future exploration in both theoretical and practical domains. By addressing computational efficiency head-on and providing a robust solution for high-quality compression, the EDIC framework stands as a testament to the evolving capabilities of neural networks in image processing.