Enhanced Invertible Encoding for Learned Image Compression (2108.03690v1)

Published 8 Aug 2021 in eess.IV and cs.CV

Abstract: Although deep learning based image compression methods have achieved promising progress these days, the performance of these methods still cannot match the latest compression standard Versatile Video Coding (VVC). Most of the recent developments focus on designing a more accurate and flexible entropy model that can better parameterize the distributions of the latent features. However, few efforts are devoted to structuring a better transformation between the image space and the latent feature space. In this paper, instead of employing previous autoencoder style networks to build this transformation, we propose an enhanced Invertible Encoding Network with invertible neural networks (INNs) to largely mitigate the information loss problem for better compression. Experimental results on the Kodak, CLIC, and Tecnick datasets show that our method outperforms the existing learned image compression methods and compression standards, including VVC (VTM 12.1), especially for high-resolution images. Our source code is available at https://github.com/xyq7/InvCompress.

This paper proposes an approach to learned image compression built on an Enhanced Invertible Encoding Network, targeting a limitation of existing learned methods: information lost in the transformation between image space and latent feature space. Traditional lossy codecs such as JPEG, JPEG2000, and the more recent Versatile Video Coding (VVC) rely on handcrafted transforms. These codecs are effective, but their transforms are fixed by design, whereas deep learning methods can learn representations adapted to the compression task from data.

The core novelty of this work lies in applying invertible neural networks (INNs) to image compression. Because an INN is invertible by construction, the transformation itself does not discard information, which largely mitigates the information loss inherent in autoencoder-based compression frameworks. The authors propose an Enhanced Invertible Encoding Network that combines INNs with two additional components: an attentive channel squeeze layer and a feature enhancement module.

Methodology

INNs for Compression: The paper advocates invertible networks because they define a bijective mapping, so the transformation between image space and latent feature space can be undone exactly in either direction. Standard autoencoder architectures, by contrast, lose information during encoding, and the decoder can only approximate what was discarded.
Attentive Channel Squeeze: An invertible network's output must have the same dimensionality as its input, which would leave the latent as large as the image. The paper therefore introduces an attentive channel squeeze layer that reduces the channel dimension while emphasizing the most contributive features, which stabilizes training and keeps the bit rate efficient.
Feature Enhancement Module: Because the non-linear transformation capacity of INNs can be limited, the authors add a feature enhancement module based on dense connections to help the system capture complex image features. The module operates in a residual manner to increase expressiveness without sacrificing invertibility.
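
The bijectivity described above can be illustrated with a generic affine coupling layer, one common INN building block. This is a minimal sketch in plain Python, not the paper's architecture: the class name AffineCoupling, the tiny tanh sub-network, and the random weights are all assumptions made for illustration. It shows two properties the section relies on: the inverse recovers the input up to floating-point rounding, and the output has the same length as the input.

```python
import math
import random


def _random_matrix(rng, n):
    """Return an n x n list of weights drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]


def _subnet(values, weights, bias):
    """Arbitrary sub-network (tanh of a linear map). It need not be invertible."""
    return [
        math.tanh(sum(w * v for w, v in zip(row, values)) + b)
        for row, b in zip(weights, bias)
    ]


class AffineCoupling:
    """Hypothetical affine coupling layer over a flat vector of length 2 * half."""

    def __init__(self, half, seed=0):
        rng = random.Random(seed)
        self.half = half
        self.w_scale = _random_matrix(rng, half)
        self.w_shift = _random_matrix(rng, half)
        self.b_scale = [rng.uniform(-1.0, 1.0) for _ in range(half)]
        self.b_shift = [rng.uniform(-1.0, 1.0) for _ in range(half)]

    def _scale_shift(self, first_half):
        # Scale and shift depend only on the half that passes through unchanged,
        # so the inverse can recompute them exactly.
        s = _subnet(first_half, self.w_scale, self.b_scale)
        t = _subnet(first_half, self.w_shift, self.b_shift)
        return s, t

    def forward(self, x):
        x1, x2 = x[:self.half], x[self.half:]
        s, t = self._scale_shift(x1)
        y2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
        return x1 + y2  # same length as the input

    def inverse(self, y):
        y1, y2 = y[:self.half], y[self.half:]
        s, t = self._scale_shift(y1)
        x2 = [(v - ti) * math.exp(-si) for v, si, ti in zip(y2, s, t)]
        return y1 + x2


layer = AffineCoupling(half=2, seed=0)
x = [0.5, -1.0, 2.0, 0.25]
y = layer.forward(x)
x_rec = layer.inverse(y)
print("latent length equals input length:", len(y) == len(x))
print("max reconstruction error:", max(abs(a - b) for a, b in zip(x, x_rec)))
```

Note that the layer cannot shrink its input, which is the dimensionality constraint that motivates a separate channel squeeze step before entropy coding.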

Results and Implications

The experimental evaluation shows that the proposed method outperforms existing learned image compression methods and traditional standards, including VVC (VTM 12.1), on the Kodak, CLIC, and Tecnick benchmark datasets. The gains are most pronounced on high-resolution images, which suggests the method suits settings where detail preservation matters. Results are reported with quantitative metrics (PSNR and MS-SSIM) and with qualitative visual comparisons that show better detail retention and fewer artifacts.
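
For readers unfamiliar with the first metric, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below computes it over flat lists of pixel values; the function name psnr and the 8-bit default peak of 255 are assumptions for illustration, and this is not the paper's evaluation code.

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by one intensity level, so the MSE is 1.
print(psnr([10, 20, 30], [11, 21, 31]))
```

Higher values indicate less distortion. MS-SSIM, the second metric, compares local structure at several scales and is not shown here.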

Future Prospects

Integrating INNs into learned image compression suggests several directions for further research:

Scalability and Efficiency: The method performs well on high-resolution images, so a natural extension is to test whether it scales to real-time applications and resource-constrained environments.
Hybrid Models: Combining invertible networks with non-invertible components or other neural architectures could yield hybrid models that improve both compression performance and computational efficiency.
Broader Applications: Beyond image compression, the same principles could be adapted to other domains such as audio and video compression, where an invertible transform could likewise limit information loss.

In summary, this work advances learned image compression by combining invertible network structures with targeted architectural enhancements. Further work along these lines could narrow the gap between deep learning methods and practical compression requirements.

Authors

Yueqi Xie
Ka Leong Cheng
Qifeng Chen

GitHub

xyq7/InvCompress: [ACMMM 2021 Oral] Enhanced Invertible Encoding for Learned Image Compression