Compression Artifacts Reduction by a Deep Convolutional Network: A Summary
The paper "Compression Artifacts Reduction by a Deep Convolutional Network" by Chao Dong, Yubin Deng, Chen Change Loy, and Xiaoou Tang addresses the issue of artifacts introduced by lossy compression algorithms like JPEG, WebP, and HEVC-MSP. The authors present a novel deep convolutional neural network (DCNN), termed the Artifacts Reduction Convolutional Neural Network (AR-CNN), designed specifically to tackle the prevalent visual degradations such as blocking artifacts, ringing effects, and blurring caused by these compression schemes.
Problem and Motivation
Lossy compression algorithms are critical in reducing data size for storage and transmission; however, they inevitably introduce artifacts that degrade image quality. These artifacts not only diminish visual perception but also adversely impact downstream image processing tasks such as super-resolution and edge detection. Existing methods either target one specific artifact type, which can introduce other degradations, or fail to address the compounded nature of these artifacts jointly and therefore produce less satisfactory results.
Contributions
The contributions of this paper are threefold:
- Novel Network Architecture: The AR-CNN incorporates a new architecture consisting of four convolutional layers that jointly optimize feature extraction, feature enhancement, mapping, and reconstruction. This layered design allows for a focused enhancement of extracted features, facilitating cleaner and sharper image reconstructions.
- Transfer Learning in Low-Level Vision Tasks: The paper explores transfer learning techniques to ease the training of deeper networks in low-level vision tasks. Specifically, it explores transferring features from shallow to deeper models and from high-quality to low-quality compression settings, showing significant improvements in convergence rates and final performance.
- Practical Applications: The AR-CNN demonstrates superior performance compared to state-of-the-art methods and is shown to be effective in real-world use cases, including as a preprocessing step to enhance the performance of other image processing routines when dealing with compressed images.
Methodology
The AR-CNN framework consists of four key layers:
- Feature Extraction Layer: Initially extracts features from the input compressed image.
- Feature Enhancement Layer: Refines the extracted features to suppress noise.
- Mapping Layer: Nonlinearly maps the enhanced features to the representation used for reconstruction.
- Reconstruction Layer: Aggregates and reconstructs the final high-quality image.
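The four-stage pipeline above can be sketched as a stack of convolutions. The sketch below is a minimal NumPy mock-up: the 9-7-1-5 filter sizes and 64-32-16 feature-map counts follow the base configuration reported in the paper, but the weights here are random placeholders rather than trained parameters, and the naive loop-based convolution stands in for an optimized implementation.

```python
import numpy as np

def conv2d_valid(x, weights, bias):
    """Naive 'valid' 2-D convolution: x is (C_in, H, W),
    weights is (C_out, C_in, f, f), bias is (C_out,)."""
    c_out, _, f, _ = weights.shape
    _, h, w = x.shape
    out = np.zeros((c_out, h - f + 1, w - f + 1))
    for o in range(c_out):
        for i in range(h - f + 1):
            for j in range(w - f + 1):
                out[o, i, j] = np.sum(weights[o] * x[:, i:i + f, j:j + f]) + bias[o]
    return out

def relu(x):
    return np.maximum(x, 0.0)

def ar_cnn_forward(y, params):
    """Four-layer forward pass: extraction -> enhancement -> mapping
    -> reconstruction (no nonlinearity after the last layer)."""
    h = y
    for k, (w, b) in enumerate(params):
        h = conv2d_valid(h, w, b)
        if k < len(params) - 1:
            h = relu(h)
    return h

# Random placeholder weights in the paper's base configuration.
rng = np.random.default_rng(0)
shapes = [(64, 1, 9, 9), (32, 64, 7, 7), (16, 32, 1, 1), (1, 16, 5, 5)]
params = [(rng.normal(0.0, 1e-3, s), np.zeros(s[0])) for s in shapes]

patch = rng.random((1, 32, 32))   # a grayscale 32x32 compressed patch
restored = ar_cnn_forward(patch, params)
print(restored.shape)             # 'valid' convolutions shrink 32 -> 14
```

Because every layer uses a "valid" convolution, each spatial dimension shrinks by (filter size - 1), so a 32x32 input yields a 14x14 output; in practice the restored patch is compared against the correspondingly cropped ground truth.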
The network is trained end-to-end by minimizing the Mean Squared Error (MSE) between the restored and ground-truth images, using stochastic gradient descent with standard backpropagation.
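To make the training objective concrete, here is a toy stand-in: a single scalar "filter" w is fitted by gradient descent so that w * x approximates the clean signal. The real network learns convolutional weights the same way, with gradients supplied by backpropagation; the signal model and learning rate below are illustrative choices, not values from the paper.

```python
import numpy as np

def mse(pred, target):
    """Mean squared error, the paper's training loss."""
    return np.mean((pred - target) ** 2)

rng = np.random.default_rng(1)
clean = rng.random(256)                                  # ground-truth signal
compressed = 0.5 * clean + 0.05 * rng.normal(size=256)   # degraded input

w, lr = 0.0, 0.5
for _ in range(200):
    pred = w * compressed
    grad = np.mean(2.0 * (pred - clean) * compressed)    # d(MSE)/dw
    w -= lr * grad                                       # gradient step

print(round(mse(w * compressed, clean), 4))
```

After a few hundred steps the loss drops well below its starting value, which is all the optimization loop needs to demonstrate; a real run would instead iterate over mini-batches of image patches.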
Numerical Results and Experiments
The network's performance is benchmarked against state-of-the-art methods such as SA-DCT and RTF, as well as a baseline implementation of SRCNN. The AR-CNN consistently outperforms these methods across multiple metrics (PSNR, SSIM, PSNR-B) on standard datasets such as LIVE1 and BSDS500, yielding consistent PSNR gains over the strongest baselines while visibly reducing blockiness and restoring edge sharpness.
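Of the three metrics, PSNR is simple enough to show directly (SSIM and PSNR-B, which adds a blocking-effect penalty, are more involved and omitted here). A minimal implementation for 8-bit images:

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two 8-bit images."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    err = np.mean(diff ** 2)
    if err == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10(peak ** 2 / err)

ref = np.full((8, 8), 128, dtype=np.uint8)
deg = ref - 16                       # uniform error of 16 grey levels
print(round(psnr(ref, deg), 2))      # 10*log10(255^2 / 16^2) ~ 24.05 dB
```

Higher is better; gains of even a fraction of a dB on these benchmarks are considered meaningful for restoration methods.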
In addition to these comparisons, the paper investigates several transfer learning settings:
- Shallow to Deeper Models: Initializing a deeper network with the weights of a trained shallow network; the same deeper network struggles to converge under conventional random initialization.
- High to Low Quality Compression: Utilizing features learned from high-quality compression tasks to initialize training for lower quality, more complex tasks, resulting in faster convergence.
- Standard to Real Use Case: Transferring learned features from standard compression schemes to practical, real-use cases like the compression artifacts seen in images uploaded to Twitter.
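The shallow-to-deep transfer in the first setting amounts to copying trained early-layer weights into a deeper model before fine-tuning. The sketch below illustrates the idea; the layer names and the exact shapes of the shallow and deeper variants are illustrative assumptions, not the paper's precise configurations.

```python
import numpy as np

rng = np.random.default_rng(2)

def init_layer(shape, scale=1e-3):
    """Random Gaussian initialization (stand-in for trained weights)."""
    return rng.normal(0.0, scale, shape)

# A shallower, already-trained network (hypothetical 3-layer variant).
shallow = {
    "extract": init_layer((64, 1, 9, 9)),
    "map":     init_layer((16, 64, 1, 1)),
    "recon":   init_layer((1, 16, 5, 5)),
}

# Deeper network: reuse the shallow net's first layer as initialization,
# draw the new layer randomly, then fine-tune the whole stack.
deeper = {
    "extract": shallow["extract"].copy(),    # transferred features
    "enhance": init_layer((32, 64, 7, 7)),   # new layer, random init
    "map":     init_layer((16, 32, 1, 1)),
    "recon":   init_layer((1, 16, 5, 5)),
}

print(sorted(deeper))
```

The high-to-low-quality and standard-to-real transfers follow the same pattern: copy whichever trained layers remain shape-compatible, then continue training on the harder target data.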
Implications and Future Work
The AR-CNN's success in reducing a variety of compression artifacts has significant theoretical and practical implications. Theoretically, it demonstrates the potential of deep learning models in complex low-level vision tasks, a domain that has traditionally been challenging for such models. Practically, it provides a robust solution for improving image quality in numerous applications, potentially benefiting social media platforms, digital storage services, and image processing pipelines.
Future work could further improve the AR-CNN by integrating larger filter sizes and experimenting with additional architectural variations. Additionally, exploring its impact on other compression formats beyond JPEG might provide broader applicability and affirm its robustness in different contexts.
The paper exemplifies how deep learning can be tailored to address specific low-level vision problems effectively, pushing the boundaries of what these models can achieve in the field of image processing and restoration.