High-Quality Self-Supervised Deep Image Denoising (1901.10277v3)

Published 29 Jan 2019 in cs.LG, cs.CV, cs.NE, and stat.ML

Abstract: We describe a novel method for training high-quality image denoising models based on unorganized collections of corrupted images. The training does not need access to clean reference images, or explicit pairs of corrupted images, and can thus be applied in situations where such data is unacceptably expensive or impossible to acquire. We build on a recent technique that removes the need for reference data by employing networks with a "blind spot" in the receptive field, and significantly improve two key aspects: image quality and training efficiency. Our result quality is on par with state-of-the-art neural network denoisers in the case of i.i.d. additive Gaussian noise, and not far behind with Poisson and impulse noise. We also successfully handle cases where parameters of the noise model are variable and/or unknown in both training and evaluation data.

Authors (4)

Samuli Laine
Tero Karras
Jaakko Lehtinen
Timo Aila

Summary

The paper presents a self-supervised denoising method that eliminates the need for clean reference images and achieves PSNR performance nearly equivalent to supervised approaches.
It employs a novel blind-spot network architecture with multiple branches, each with a receptive field restricted to a half-plane, and combines the branches with 1x1 convolutions, which boosts training efficiency.
The approach leverages Bayesian inference to model noise types without pre-specified parameters, demonstrating robust performance across Gaussian, Poisson, and impulse noise.

High-Quality Self-Supervised Deep Image Denoising

The paper presents a novel self-supervised method for training deep image denoising models without clean reference images, which makes it usable where such data is unacceptably expensive or impossible to acquire. Unlike Noise2Noise (N2N), it does not need explicit pairs of corrupted images either: training uses individual corrupted images. The work builds on a recent blind-spot technique for reference-free training and improves two aspects of it, image quality and training efficiency.

The authors introduce convolutional blind-spot network architectures, designed so that the network's receptive field at each output pixel excludes the corresponding input pixel. This property is what makes self-supervised training possible: the network cannot simply copy the noisy value it is asked to predict. It distinguishes the method from Krull et al.'s N2V (Noise2Void), which obtained the blind spot through a masking scheme, so that only the masked pixels contributed to the loss, at a cost in both training efficiency and denoising quality.
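The blind-spot property can be checked on a toy example. The sketch below is not the authors' network: it uses a single hand-written convolution per branch in place of a deep network, and a plain average in place of learned 1x1 convolutions. It assumes the half-plane construction the paper describes, realized here by rotating the input, applying a convolution that only looks upward, and shifting the result down by one row. All function names are hypothetical.

```python
def upward_conv(img, kernel):
    """Convolution whose output at (y, x) sees only row y and the rows above it."""
    h, w = len(img), len(img[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy = y - (kh - 1) + i   # rows y-kh+1 .. y
                    xx = x - kw // 2 + j    # columns centered on x
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * img[yy][xx]
            out[y][x] = acc
    return out


def shift_down(img):
    """Shift one row down (zero-filled top row), so row y no longer sees row y."""
    w = len(img[0])
    return [[0.0] * w] + [row[:] for row in img[:-1]]


def rot90(img):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def rot(img, k):
    """Rotate clockwise by k quarter turns (negative k rotates back)."""
    for _ in range(k % 4):
        img = rot90(img)
    return img


def blind_spot_response(img, kernel):
    """Four half-plane branches, combined per pixel (stand-in for 1x1 convs)."""
    h, w = len(img), len(img[0])
    branches = []
    for k in range(4):
        feat = shift_down(upward_conv(rot(img, k), kernel))
        branches.append(rot(feat, -k))  # undo the rotation
    return [[sum(b[y][x] for b in branches) / 4.0 for x in range(w)]
            for y in range(h)]


image = [[float(5 * r + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
before = blind_spot_response(image, kernel)
image[2][2] += 10.0                      # perturb the center pixel
after = blind_spot_response(image, kernel)
print(before[2][2] == after[2][2])       # True: output ignores its own input
print(before[3][2] == after[3][2])       # False: a neighbor's output changes
```

Because every output pixel is blind to its own input, every pixel can contribute to the loss in a single pass, which is the efficiency argument made against masking.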

Technical Contributions

Blind-Spot Network Architecture: The network is built from multiple branches, each with its receptive field restricted to a half-plane that excludes the center pixel. The branch outputs are combined using 1×1 convolutions, so that all output pixels contribute to the loss function, akin to conventional training. This is what makes training efficient.
Self-Supervised Bayesian Denoising: The method applies Bayesian inference at each pixel. The blind-spot network predicts a distribution for the clean value from the surrounding context, and an explicit model of the corruption process (Gaussian, Poisson, or impulse noise, modeled as i.i.d.) supplies the likelihood of the observed noisy value. The denoised output is the posterior mean, which combines both sources of information and brings quality close to that of supervised approaches.
Robustness to Noise Model and Parameters: The approach does not require the parameters of the noise model to be known in advance; they can be estimated during training. The paper reports results nearly on par with state-of-the-art denoisers for Gaussian noise, and not far behind for Poisson and impulse noise.
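For the Gaussian case, the posterior-mean step has a simple closed form. The sketch below is a scalar (single-channel) illustration under stated assumptions, not the paper's implementation: the blind-spot network is assumed to output a Gaussian prior with mean `mu` and variance `prior_var` for the clean value, and the noise is additive Gaussian with variance `noise_var`. The function names are hypothetical; the formulas are the standard result for combining two Gaussians.

```python
import math


def posterior_mean(y, mu, prior_var, noise_var):
    """E[x | y] for prior x ~ N(mu, prior_var) and observation y = x + n,
    n ~ N(0, noise_var). A variance-weighted blend of context and observation."""
    return (noise_var * mu + prior_var * y) / (prior_var + noise_var)


def negative_log_likelihood(y, mu, prior_var, noise_var):
    """Training loss for one pixel: the noisy value y is distributed as
    N(mu, prior_var + noise_var), so no clean target is needed."""
    total_var = prior_var + noise_var
    return 0.5 * ((y - mu) ** 2 / total_var + math.log(2.0 * math.pi * total_var))


# Context predicts 0.4, the noisy pixel reads 0.8, both equally uncertain:
print(posterior_mean(0.8, 0.4, prior_var=0.01, noise_var=0.01))
# Confident context (tiny prior variance) pulls the estimate toward mu:
print(posterior_mean(0.8, 0.4, prior_var=1e-6, noise_var=0.01))
```

When the context is uninformative (large `prior_var`), the estimate falls back to the noisy observation; when the context is confident, the observation is mostly ignored. Treating `noise_var` as a learned quantity rather than a constant is one way to picture the unknown-parameter setting.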

Empirical Results

The experimental evaluation confirms the efficacy of the proposed method across Gaussian, Poisson, and impulse noise models. For Gaussian noise with σ=25, the proposed method achieves an average PSNR of 31.57 dB, within 0.03 dB of the 31.60 dB obtained by the supervised baseline trained with clean targets (N2C). The experiments also show that the approach works with variable and unknown noise parameters, and that it outperforms non-learned baselines like CBM3D.
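To put the reported figures in perspective, PSNR is a logarithmic function of mean squared error. The helper below is a minimal sketch using the standard definition; the `peak` value of 1.0 and the flat-list image representation are assumptions for illustration, not details taken from the paper's evaluation code.

```python
import math


def psnr(clean, denoised, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if not clean or len(clean) != len(denoised):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(clean, denoised)) / len(clean)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio(db_gap):
    """Factor by which MSE grows when PSNR drops by db_gap decibels."""
    return 10.0 ** (db_gap / 10.0)


print(psnr([0.0, 0.0, 0.0, 0.0], [0.1, 0.1, 0.1, 0.1]))
print(mse_ratio(31.60 - 31.57))
```

By this definition, the 0.03 dB gap between the two reported averages corresponds to a mean squared error that is roughly 0.7% higher.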

Implications and Future Directions

The design and results highlight the potential of self-supervised learning in image restoration tasks where clean data is impractical to obtain. The method opens the door to more complex real-world corruption models without explicit ground truth data. Future work could extend it to the more sophisticated noise types found in practical applications, such as medical or astronomical imaging, with the aim of adaptive and robust restoration pipelines that apply directly to real-life datasets. Because a model can be trained on the corrupted data of the target domain itself, it can capture both the data distribution and the noise characteristics specific to that application.

The paper is a significant step forward for self-supervised noise reduction, and is particularly relevant to uncurated collections of corrupted images for which neither clean references nor corrupted pairs exist.
