Checkerboard Context Model for Efficient Learned Image Compression (2103.15306v2)

Published 29 Mar 2021 in eess.IV and cs.CV

Abstract: For learned image compression, the autoregressive context model has proven effective in improving rate-distortion (RD) performance because it helps remove spatial redundancies among latent representations. However, the decoding process must follow a strict scan order, which breaks parallelization. We propose a parallelizable checkerboard context model (CCM) to solve this problem. Our two-pass checkerboard context calculation eliminates such limitations on spatial locations by reorganizing the decoding order. Speeding up the decoding process more than 40 times in our experiments, it achieves significantly improved computational efficiency with almost the same rate-distortion performance. To the best of our knowledge, this is the first exploration of a parallelization-friendly spatial context model for learned image compression.

Authors (5)
  1. Dailan He (25 papers)
  2. Yaoyan Zheng (3 papers)
  3. Baocheng Sun (7 papers)
  4. Yan Wang (733 papers)
  5. Hongwei Qin (38 papers)
Citations (227)

Summary

  • The paper introduces a parallelizable checkerboard context model that restructures decoding to overcome the limitations of sequential autoregressive methods.
  • It employs a two-pass checkerboard approach, categorizing latent variables into anchors and non-anchors to capture spatial dependencies in parallel.
  • Experimental results reveal over 40x improvement in decoding speed with competitive rate-distortion performance on standard image datasets.

Analyzing "Checkerboard Context Model for Efficient Learned Image Compression"

The paper "Checkerboard Context Model for Efficient Learned Image Compression" presents a novel approach aimed at addressing the limitations found in existing methods for learned image compression. The focus is a parallelizable context model, designed to enhance computational efficiency without compromising rate-distortion (RD) performance. This model employs a checkerboard pattern to facilitate parallel processing, resolving the bottleneck associated with existing autoregressive context models.

Advances in Image Compression Techniques

In image compression, reducing spatial and statistical redundancies is paramount. Conventional codecs, such as JPEG and BPG, combine transform coding with quantization and entropy coding to achieve this. Recent advances have introduced learned image compression methods built on deep learning frameworks, such as convolutional autoencoders and generative adversarial networks. These methods have demonstrated superior performance compared to traditional algorithms, including noteworthy results in peak signal-to-noise ratio (PSNR) and multi-scale structural similarity (MS-SSIM).

The learned models often use entropy minimization strategies, wherein latent representations are transformed through non-linear encoding. A notable improvement in this field has been the introduction of a hyperprior, providing a side-information layer that aids in the precise approximation of latent distributions.
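As a minimal illustration (our own sketch, not the paper's implementation), the rate of an integer-quantized latent under a Gaussian entropy model, whose per-element mean and scale would be predicted from the hyperprior, can be estimated as follows; the helper names here are hypothetical:

```python
import math

import numpy as np

def gaussian_cdf(x, mu, sigma):
    # Gaussian CDF evaluated via the error function.
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def estimated_bits(y_hat, mu, sigma):
    """Bits for integer-quantized latents under a Gaussian entropy model:
    p(y) = CDF(y + 0.5) - CDF(y - 0.5), rate = -log2 p(y), summed over
    all elements."""
    total = 0.0
    for y, m, s in zip(y_hat.ravel(), mu.ravel(), sigma.ravel()):
        p = gaussian_cdf(y + 0.5, m, s) - gaussian_cdf(y - 0.5, m, s)
        total += -math.log2(max(p, 1e-12))  # clamp to avoid log(0)
    return total

# Toy example: the hyperprior predicts per-element (mu, sigma).
y_hat = np.array([0.0, 1.0, -1.0, 2.0])
mu    = np.array([0.0, 0.5, -0.5, 0.0])
sigma = np.array([1.0, 1.0, 1.0, 1.0])
bits = estimated_bits(y_hat, mu, sigma)
```

The closer the predicted mean tracks the actual latent value, the larger p(y) and the fewer bits that element costs, which is exactly why better context modeling lowers the rate.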

The Autoregressive Context Model Challenge

While autoregressive context models have improved RD performance by leveraging spatial dependencies during decoding, their requirement for sequential processing severely limits parallelization. This paper identifies the potential for significant computational speedup through an innovative re-structuring of the decoding order without damaging compression performance.
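To make the bottleneck concrete, here is a sketch (our own illustration, not the paper's code) of the raster-scan schedule a serial context model imposes: every position can be entropy-decoded only after all causally earlier positions, giving h * w strictly sequential steps:

```python
import numpy as np

def serial_decode(h, w):
    """Raster-scan autoregressive decoding sketch: position (i, j) may
    only read already-decoded neighbours, so the h * w entropy-decoding
    steps cannot run in parallel. Returns the sequential step count."""
    decoded = np.zeros((h, w), dtype=bool)
    steps = 0
    for i in range(h):
        for j in range(w):
            # Context for (i, j): the causal half of a 3x3 window.
            for di, dj in [(-1, -1), (-1, 0), (-1, 1), (0, -1)]:
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    assert decoded[ni, nj]  # causality must hold
            decoded[i, j] = True
            steps += 1  # one entropy-decoding step per latent
    return steps
```

For a typical latent grid this means thousands of dependent steps, which is the serialization the checkerboard model removes.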

Introduction of the Checkerboard Context Model

The authors propose a Checkerboard Context Model (CCM) that supports parallel operation by redesigning the context calculation and the decoding order. The two-pass checkerboard context calculation still exploits spatial redundancies while removing the strict dependence on scan-order position. This design speeds up decoding by more than a factor of 40 in the reported experiments, with little compromise on RD objectives.
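A minimal sketch of the checkerboard partition itself (which parity forms the anchor set is our illustrative assumption, not specified here):

```python
import numpy as np

def checkerboard_masks(h, w):
    """Split latent positions into anchors and non-anchors in a
    checkerboard pattern. Within each set no position depends on
    another, so each set can be processed in a single parallel pass."""
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    anchor = (ii + jj) % 2 == 0  # anchor parity: an illustrative choice
    return anchor, ~anchor

anchor, non_anchor = checkerboard_masks(4, 4)
# Every in-bounds 4-neighbour of a non-anchor lies in the anchor set,
# so decoded anchors provide context for all non-anchors at once.
```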

Methodological Breakdown

  1. Context Modeling and Decoding Order: The checkerboard context model restructures how context is computed over the latent representations. Latents are partitioned into 'anchors' and 'non-anchors': anchors are decoded using the hyperprior alone, while non-anchors additionally use context computed in parallel from the already-decoded anchors.
  2. Efficiency in Practical Implementations: Experimental results demonstrate that the checkerboard model substantially improves decoding speed without an evident degradation in RD performance. This parallelization makes learned image codecs considerably more practical for real-world deployment.
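The two steps above can be sketched as a two-pass schedule (our own illustration, with dependency checks standing in for actual entropy decoding):

```python
import numpy as np

def two_pass_decode(h, w):
    """Two-pass checkerboard decoding sketch: pass 1 decodes all anchors
    from the hyperprior alone; pass 2 decodes all non-anchors, each
    reading context only from already-decoded anchor neighbours.
    Sequential depth drops from h * w steps to 2 passes."""
    ii, jj = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    anchor = (ii + jj) % 2 == 0
    decoded = np.zeros((h, w), dtype=bool)

    # Pass 1: anchors depend only on side information (hyperprior),
    # so all of them can be entropy-decoded simultaneously.
    decoded[anchor] = True

    # Pass 2: every non-anchor reads only anchor neighbours, which are
    # already decoded, so this pass is also fully parallel.
    for i in range(h):
        for j in range(w):
            if not anchor[i, j]:
                for di, dj in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w:
                        assert decoded[ni, nj]  # context is available
    decoded[~anchor] = True
    return 2  # number of sequential passes
```

Compared with the raster-scan schedule, the sequential depth is constant in the grid size, which is the source of the reported decoding speedup.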

Experimental Results and Comparative Analysis

The paper provides rigorous experiments comparing the checkerboard context model to both traditional serial context models and recent learned models without any spatial context. The checkerboard model exhibits RD performance comparable to leading serial models while offering significantly faster decoding. Results are reported on the standard Kodak and Tecnick benchmark datasets.

Implications and Future Prospects

By addressing the computational inefficiencies inherent to serial context models, the checkerboard context model positions itself as a viable path forward for the practical deployment of learned image compression methods. This advancement hints at potential explorations into multi-dimensional context modeling techniques that promise further improvements in efficiency without conceding compression quality.

Conclusion

This paper solidifies its contribution by proposing a method that effectively balances computational efficiency and compression efficacy. As deep learning models continue to evolve, the principles established here suggest that leveraging parallel architectures will be critical in future advancements. The checkerboard context model represents a substantial step toward practical, high-performance learned image compression systems.