Enhanced Deep Residual Networks for Single Image Super-Resolution
The paper entitled "Enhanced Deep Residual Networks for Single Image Super-Resolution" by Lim et al. addresses the task of recovering high-resolution images from single low-resolution inputs using deep convolutional neural networks (DCNNs). The work makes substantial advancements in the architecture and training methods of residual networks tailored specifically for super-resolution tasks.
Overview
The authors introduce an Enhanced Deep Super-Resolution Network (EDSR) and a Multi-Scale Deep Super-Resolution system (MDSR) that outperform existing methods in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). The improvements come from refining the conventional residual network: removing unnecessary modules and expanding the model size while stabilizing training.
Key Contributions
- Enhanced Deep Super-Resolution Network (EDSR):
- The authors streamline the SRResNet architecture by removing batch normalization layers and applying residual scaling, resulting in superior performance.
- Because batch normalization normalizes features, it removes range flexibility from the network; EDSR therefore uses residual blocks without batch normalization and also drops the ReLU activations outside the residual blocks to boost efficiency.
- The model exhibits significant memory savings and computational efficiency improvements, with the EDSR achieving state-of-the-art results on benchmark datasets, including the DIV2K dataset.
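The EDSR building block described above can be sketched in a few lines. This is a minimal single-channel illustration, not the paper's implementation: real EDSR uses multi-channel learned convolutions, but the structure (conv → ReLU → conv, no batch normalization, scaled residual added to an identity skip) is the same.

```python
import numpy as np

def conv3x3(x, w):
    """Minimal 'same'-padded 3x3 convolution on a single-channel map.
    Illustrative stand-in for the multi-channel convolutions in EDSR."""
    h, wd = x.shape
    padded = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(wd):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * w)
    return out

def edsr_residual_block(x, w1, w2, scale=0.1):
    """EDSR-style residual block: conv -> ReLU -> conv, with no batch
    normalization. The residual branch is multiplied by a small constant
    (0.1 in the paper) before the identity skip connection."""
    res = conv3x3(x, w1)
    res = np.maximum(res, 0.0)   # ReLU inside the block only
    res = conv3x3(res, w2)
    return x + scale * res       # scaled residual + identity skip
```

Note that, unlike the original ResNet block, there is no activation after the addition, matching the paper's simplified block design.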
- Multi-Scale Deep Super-Resolution (MDSR):
- The EDSR is extended to the MDSR model to handle various super-resolution scales through a single unified framework.
- MDSR introduces scale-specific pre-processing and upsampling modules while sharing most parameters across different scales, leveraging inter-scale relationships to reduce model size.
- The multi-scale model demonstrates competitive performance with significantly fewer parameters than a set of scale-specific models.
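The parameter-sharing argument behind MDSR can be made concrete with a toy parameter count. The sketch below is hypothetical (the key names, block counts, and kernel sizes are illustrative, not the paper's exact configuration); it only shows why one shared body with scale-specific heads and tails is much smaller than three independent scale-specific models.

```python
import numpy as np

def build_mdsr(scales=(2, 3, 4), body_blocks=80, head_blocks=2):
    """Toy MDSR-style parameter layout: one shared body plus a small
    scale-specific pre-processing head and upsampling tail per scale."""
    params = {"body": [np.zeros((3, 3)) for _ in range(body_blocks)]}
    for s in scales:
        params[f"head_x{s}"] = [np.zeros((5, 5)) for _ in range(head_blocks)]
        params[f"tail_x{s}"] = [np.zeros((3, 3))]
    return params

def count(params, keys):
    """Total number of scalar parameters under the given keys."""
    return sum(w.size for k in keys for w in params[k])

p = build_mdsr()
shared = count(p, ["body"])
specific = count(p, [k for k in p if k != "body"])
# Most parameters live in the shared body, so one multi-scale model is far
# smaller than training a separate full model per scale.
```

Because the body dominates the parameter count, the multi-scale total (`shared + specific`) stays well below three times the cost of a single-scale model, which is the size advantage the bullet above refers to.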
Performance and Implications
Quantitative results show the effectiveness of the proposed architectures. On the DIV2K validation set, the self-ensemble variant EDSR+ achieved a PSNR/SSIM of up to 35.12 dB / 0.9699 for 2x upscaling, a substantial improvement over benchmark methods such as SRResNet, with the margin widening at higher scaling factors.
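For reference, the PSNR figures quoted above follow the standard definition, which is straightforward to compute. A minimal sketch:

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB:
    PSNR = 10 * log10(peak^2 / MSE), where MSE is the mean squared
    error between the reference and the reconstruction."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, a uniform per-pixel error of 1 on 8-bit images gives an MSE of 1 and hence a PSNR of about 48.13 dB; each extra dB corresponds to a sizable reduction in reconstruction error, which is why sub-dB gains are meaningful in super-resolution benchmarks.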
Various experimental setups reveal important aspects:
- Removal of Batch Norm Layers: Improves performance because batch normalization constrains the range of features; removing it also cuts memory usage and computation, which can be reinvested in a larger model.
- Residual Scaling: Multiplying each residual branch by a small constant (0.1 in the paper) stabilizes training when the number of filters is large, yielding improved convergence.
- Pre-training Strategy: Initializing models for higher upscaling factors from a trained lower-scale (2x) model speeds convergence and achieves higher final performance.
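The pre-training strategy above amounts to weight transfer between models. The sketch below is a hedged illustration using a flat parameter dictionary; the key names (`body_*`, `tail_x*`) are hypothetical, but the idea matches the paper: reuse the trained 2x weights everywhere except the scale-dependent upsampling part, which is re-initialized.

```python
import numpy as np

def init_from_pretrained(pretrained_x2, scale):
    """Initialize a model for a higher scale from a trained 2x model.
    Scale-independent weights are copied; the scale-dependent upsampling
    tail is replaced with a fresh (here: zero) initialization."""
    params = {}
    for name, w in pretrained_x2.items():
        if name.startswith("tail"):
            params[f"tail_x{scale}"] = np.zeros_like(w)  # fresh upsampler
        else:
            params[name] = w.copy()                      # reuse 2x weights
    return params
```

This gives the higher-scale model a well-conditioned starting point for its shared layers, which is why the authors observe faster convergence than training from scratch.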
Theoretical and Practical Implications
From a theoretical viewpoint, the paper underscores the importance of refined network design and tailored training strategies for specific tasks like super-resolution. The insights into model simplification without compromising capacity and performance are particularly valuable for extending similar approaches to other low-level vision problems.
Practically, the efficiency improvements in EDSR and MDSR make these models highly suitable for real-world applications where computational resources and performance trade-offs are critical. The models' winning performance in the NTIRE 2017 Super-Resolution Challenge attests to their practical viability.
Future Directions
Future research could explore further architectural enhancements and training strategies. Given the promising results, integrating other forms of prior knowledge or constraints, exploiting domain-specific information, or developing more scalable multi-task frameworks could push the boundaries of super-resolution techniques further.
Overall, the work by Lim et al. presents a robust framework for single image super-resolution, offering substantial improvements over existing methods and paving the way for future advancements in the field. The proposed models' blend of efficiency, performance, and versatility positions them well within the broader scope of image restoration and enhancement applications in modern computer vision.