- The paper introduces MWCNN, a novel architecture that combines the discrete wavelet transform with CNNs to improve image restoration.
- It efficiently expands the receptive field while reducing computational load, outperforming state-of-the-art methods in PSNR and SSIM.
- Experiments on benchmark datasets show PSNR gains of up to 1.2 dB, highlighting the practical benefit of integrating classical signal processing with deep learning.
Multi-level Wavelet-CNN for Image Restoration
The paper presents the multi-level wavelet convolutional neural network (MWCNN), an architecture for image restoration tasks such as image denoising, single image super-resolution (SISR), and JPEG artifact removal. The authors aim to resolve the trade-off between receptive field size and computational efficiency, a common challenge in designing convolutional neural networks (CNNs) for low-level vision tasks.
Core Contributions
The paper's central contribution is the integration of wavelet transforms into a U-Net-style architecture. In the contracting subnetwork, the discrete wavelet transform (DWT) is used in place of pooling to downsample feature maps, and the expanding subnetwork applies the inverse wavelet transform (IWT) to reconstruct high-resolution feature maps. Because the DWT is invertible, downsampling discards no information, and the network maintains a large receptive field, which is crucial for restoration quality, without the computational cost of deeper networks or dilated filters.
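To make the downsampling and upsampling mechanism concrete, the sketch below implements a Haar-based DWT and its inverse on NCHW feature maps in PyTorch. It is an illustrative reimplementation under our own assumptions (the function names `dwt_haar`/`iwt_haar`, the Haar filters, and the subband ordering are ours), not the authors' released code.

```python
import torch

def dwt_haar(x: torch.Tensor) -> torch.Tensor:
    """Split an (N, C, H, W) tensor into four half-resolution subbands,
    concatenated along the channel axis as (N, 4C, H/2, W/2)."""
    a = x[:, :, 0::2, 0::2]  # top-left pixel of each 2x2 block
    b = x[:, :, 0::2, 1::2]  # top-right
    c = x[:, :, 1::2, 0::2]  # bottom-left
    d = x[:, :, 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2  # low-frequency approximation
    # three detail subbands (orientation naming conventions vary)
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return torch.cat([ll, lh, hl, hh], dim=1)

def iwt_haar(y: torch.Tensor) -> torch.Tensor:
    """Exactly invert dwt_haar: (N, 4C, H/2, W/2) -> (N, C, H, W)."""
    c4 = y.shape[1] // 4
    ll, lh, hl, hh = y[:, :c4], y[:, c4:2 * c4], y[:, 2 * c4:3 * c4], y[:, 3 * c4:]
    n, c, h, w = ll.shape
    x = y.new_zeros((n, c, 2 * h, 2 * w))
    x[:, :, 0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[:, :, 0::2, 1::2] = (ll - lh + hl - hh) / 2
    x[:, :, 1::2, 0::2] = (ll + lh - hl - hh) / 2
    x[:, :, 1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

# Perfect reconstruction: IWT(DWT(x)) == x up to floating-point error.
x = torch.randn(1, 3, 8, 8)
assert torch.allclose(iwt_haar(dwt_haar(x)), x, atol=1e-6)
```

In the network, the 4C subband channels produced by `dwt_haar` would be fed to the next convolutional block of the contracting path, and the expanding path would produce 4C channels for `iwt_haar` to merge back to full resolution.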
Key Numerical Results
The experimental results show that MWCNN surpasses previous approaches in peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) across multiple benchmarks: Set12, BSD68, and Urban100 for image denoising; Set5, Set14, BSD100, and Urban100 for SISR; and Classic5 and LIVE1 for JPEG artifact removal. Notably, MWCNN improves PSNR by roughly 0.5 to 1.2 dB over state-of-the-art models such as DnCNN and MemNet, with the largest margins at higher noise levels and larger scaling factors, while preserving detail and keeping the computational burden low.
Theoretical and Practical Implications
By incorporating wavelet transforms, the model captures the larger spatial context needed for accurate restoration while remaining computationally efficient. This reflects a broader trend in machine learning: integrating classical signal processing techniques with deep learning models to improve performance on nuanced tasks. Because the wavelet transform preserves both frequency and location information, the approach helps mitigate the loss of fine textures commonly seen in image restoration; the comparison below illustrates the difference from standard pooling.
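As a small illustration of this point (our own comparison, not an experiment from the paper), the snippet below contrasts 2x2 average pooling, which discards high-frequency detail, with the Haar DWT, which keeps it as extra channels. It reuses the hypothetical `dwt_haar`/`iwt_haar` helpers from the earlier sketch.

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8)

pooled = F.avg_pool2d(x, kernel_size=2)            # (1, 1, 4, 4): detail is discarded
upsampled = F.interpolate(pooled, scale_factor=2)  # nearest-neighbour upsampling
print(torch.allclose(upsampled, x))                # False in general: x is not recoverable

subbands = dwt_haar(x)                             # (1, 4, 4, 4): detail kept as extra channels
print(torch.allclose(iwt_haar(subbands), x, atol=1e-6))  # True: exact inverse

# The DWT's low-frequency band equals average pooling up to a factor of 2,
# so the DWT retains everything pooling retains, plus the detail subbands.
print(torch.allclose(subbands[:, :1] / 2, pooled, atol=1e-6))  # True
```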
Speculation on Future Developments
Given MWCNN's success across image restoration tasks, its architecture may inspire further exploration in higher-level vision tasks, such as image classification or segmentation, where retaining detailed spatial information is beneficial. A promising direction for future work is extending the wavelet-integrated approach to more complex restoration scenarios, such as image deblurring or blind deconvolution, where the limited receptive field of standard convolutional architectures remains a bottleneck.
Overall, the paper makes a compelling case for using wavelet transforms within CNN architectures and sets a precedent for future research on improving the capability and efficiency of neural networks in image processing.