Toward Real-World Single Image Super-Resolution: A New Benchmark and A New Model (1904.00523v1)

Published 1 Apr 2019 in cs.CV

Abstract: Most of the existing learning-based single image super-resolution (SISR) methods are trained and evaluated on simulated datasets, where the low-resolution (LR) images are generated by applying a simple and uniform degradation (i.e., bicubic downsampling) to their high-resolution (HR) counterparts. However, the degradations in real-world LR images are far more complicated. As a consequence, the SISR models trained on simulated data become less effective when applied to practical scenarios. In this paper, we build a real-world super-resolution (RealSR) dataset where paired LR-HR images of the same scene are captured by adjusting the focal length of a digital camera. An image registration algorithm is developed to progressively align the image pairs at different resolutions. Considering that the degradation kernels are naturally non-uniform in our dataset, we present a Laplacian pyramid based kernel prediction network (LP-KPN), which efficiently learns per-pixel kernels to recover the HR image. Our extensive experiments demonstrate that SISR models trained on our RealSR dataset deliver better visual quality with sharper edges and finer textures on real-world scenes than those trained on simulated datasets. Though our RealSR dataset is built by using only two cameras (Canon 5D3 and Nikon D810), the trained model generalizes well to other camera devices such as Sony a7II and mobile phones.

Authors

Jianrui Cai
Hui Zeng
Hongwei Yong
Zisheng Cao
Lei Zhang

Summary

This paper addresses a fundamental shortcoming of current Single Image Super-Resolution (SISR) research: the lack of datasets and models that reflect real-world conditions. Traditional SISR models are typically trained on simulated datasets whose low-resolution images are produced by simple downsampling such as bicubic interpolation, which does not capture the complex degradation processes of real-world imaging. To close this gap, the authors introduce the RealSR dataset, which contains paired high-resolution and low-resolution images of the same scenes, captured with DSLR cameras at different focal lengths. They also propose a Laplacian Pyramid based Kernel Prediction Network (LP-KPN) to improve SISR performance on real-world images.
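To make the "simulated" setting concrete, the following is a minimal sketch of how an LR image can be synthesized from an HR image by uniform bicubic downsampling. It is an illustration only, not the pipeline used by any particular benchmark: the function names (`cubic_weight`, `resize_axis0`, `bicubic_downsample`) are hypothetical, the Keys kernel parameter `a = -0.5` is an assumed common choice, and no anti-aliasing prefilter is applied, so the output will differ from library resizers that use one.

```python
import numpy as np

def cubic_weight(t, a=-0.5):
    """Keys cubic convolution kernel; a = -0.5 is an assumed, common setting."""
    t = np.abs(t)
    inner = (a + 2) * t**3 - (a + 3) * t**2 + 1          # |t| <= 1
    outer = a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a  # 1 < |t| < 2
    return np.where(t <= 1, inner, np.where(t < 2, outer, 0.0))

def resize_axis0(img, scale):
    """Bicubic resampling along axis 0 by an integer downscale factor."""
    n_out = img.shape[0] // scale
    # Centres of the output pixels expressed in input pixel coordinates.
    centers = (np.arange(n_out) + 0.5) * scale - 0.5
    base = np.floor(centers).astype(int)
    out = np.zeros((n_out,) + img.shape[1:], dtype=np.float64)
    for k in range(-1, 3):  # four taps around each sample position
        idx = np.clip(base + k, 0, img.shape[0] - 1)  # clamp at the borders
        w = cubic_weight(centers - (base + k))
        out += w.reshape(-1, *([1] * (img.ndim - 1))) * img[idx]
    return out

def bicubic_downsample(hr, scale):
    """Apply the same (uniform) degradation to every pixel of the HR image."""
    lr = resize_axis0(hr, scale)
    return resize_axis0(lr.swapaxes(0, 1), scale).swapaxes(0, 1)

# Usage: a random 64x64 "HR" image becomes a 32x32 "LR" image.
hr = np.random.default_rng(0).random((64, 64))
lr = bicubic_downsample(hr, 2)
print(lr.shape)
```

The key point is that the same fixed kernel is applied everywhere in the image, which is exactly the uniformity that real camera degradations do not share.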

Contributions

RealSR Dataset: The RealSR dataset provides a benchmark of real-world LR-HR image pairs. Captured with two DSLR cameras (Canon 5D3 and Nikon D810) by adjusting the focal length, it covers a variety of indoor and outdoor scenes. The authors develop an image registration algorithm that progressively aligns the image pairs at different resolutions, addressing issues such as lens distortion and exposure differences. The dataset challenges existing architectures to generalize to real-world degradations.
Laplacian Pyramid based Kernel Prediction Network (LP-KPN): LP-KPN predicts per-pixel restoration kernels within a Laplacian pyramid framework. Because the kernels are applied at each pyramid level, small kernels can cover a large effective area of the image, so the model captures spatially varying degradations without the computational cost of large kernels. LP-KPN delivers strong performance with lower memory and computational demands than larger models such as RCAN, a contemporary state-of-the-art SISR model.
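The core idea of applying per-pixel kernels on a Laplacian pyramid can be sketched as follows. This is a simplified illustration under several assumptions, not the authors' implementation: the kernels here are hand-made identity kernels rather than the output of a trained network, the pyramid uses 2x2 average pooling and nearest-neighbour upsampling as stand-ins for proper pyramid filters, and the level count (3), kernel size (5) and all function names are illustrative choices.

```python
import numpy as np

def downsample2(x):
    """2x2 average pooling (stand-in for a Gaussian pyramid step)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour 2x upsampling."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels=3):
    """Return [band_0, ..., band_{L-2}, low_res_residual], finest first."""
    pyr, cur = [], img
    for _ in range(levels - 1):
        down = downsample2(cur)
        pyr.append(cur - upsample2(down))  # detail lost by downsampling
        cur = down
    pyr.append(cur)
    return pyr

def reconstruct(pyr):
    """Invert laplacian_pyramid by upsampling and adding the bands back."""
    cur = pyr[-1]
    for band in reversed(pyr[:-1]):
        cur = upsample2(cur) + band
    return cur

def apply_per_pixel_kernels(img, kernels):
    """kernels has shape (H, W, k, k): one k x k filter per output pixel."""
    k = kernels.shape[-1]
    padded = np.pad(img, k // 2, mode="edge")
    patches = np.lib.stride_tricks.sliding_window_view(padded, (k, k))
    return np.einsum("ijkl,ijkl->ij", patches, kernels)

def identity_kernels(shape, k):
    """Placeholder for network-predicted kernels: each one copies its pixel."""
    kernels = np.zeros(shape + (k, k))
    kernels[..., k // 2, k // 2] = 1.0
    return kernels

# Usage: filter each pyramid level with its own per-pixel kernels, then merge.
img = np.random.default_rng(0).random((16, 16))
pyr = laplacian_pyramid(img, levels=3)
kernels = [identity_kernels(band.shape, 5) for band in pyr]
filtered = [apply_per_pixel_kernels(b, kn) for b, kn in zip(pyr, kernels)]
restored = reconstruct(filtered)
```

In this sketch a 5x5 kernel at the coarsest of three levels touches a region of roughly 20x20 pixels in the full-resolution image, which illustrates why small kernels on a pyramid can stand in for a much larger kernel on the original image. In a learned model the identity kernels would be replaced by kernels predicted from the input image.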

Results

Empirical evaluations show that, for standard architectures such as VDSR, SRResNet, and RCAN, models trained on the RealSR dataset significantly outperform the same models trained on simulated datasets in real-world super-resolution tasks. Notably, LP-KPN achieves superior image quality while requiring fewer computational resources than larger models such as RCAN. The cross-camera evaluation further highlights the trained model's robustness and generalization capability: it performs well even on images captured by devices not included in the dataset, such as smartphone cameras.

Implications

The implications of this research span both theoretical and practical domains:

From a theoretical perspective, the introduction of a real-world benchmark challenges the research community to devise more adaptive and generalized learning architectures. Moreover, the proposed LP-KPN model sets a new direction in employing efficient pyramid-based architectures for pixel-level adaptations.
On the practical side, this research paves the way for more effective deployment of super-resolution techniques in everyday applications, including mobile imaging and video enhancement. The demonstrated generalization to different camera devices suggests immediate applicability across a wide range of consumer electronics, advancing the quality of digital media consumed and produced daily.

Future Directions

Moving forward, expanding the RealSR dataset to include a wider array of devices and environmental conditions would make it a more comprehensive training and evaluation resource. The efficiency and efficacy of kernel prediction models in other applications also remain a fertile area for future research. Models like LP-KPN adapt their filtering to local spatial content, an approach that could extend beyond super-resolution to other image and video processing tasks.

Overall, this paper contributes substantial resources and methods for real-world image super-resolution, offering both immediate benefits and avenues for further work on the challenges inherent in this task.
