
Noise2Self: Blind Denoising by Self-Supervision (1901.11365v2)

Published 30 Jan 2019 in cs.CV, cs.LG, and stat.ML

Abstract: We propose a general framework for denoising high-dimensional measurements which requires no prior on the signal, no estimate of the noise, and no clean training data. The only assumption is that the noise exhibits statistical independence across different dimensions of the measurement, while the true signal exhibits some correlation. For a broad class of functions ("$\mathcal{J}$-invariant"), it is then possible to estimate the performance of a denoiser from noisy data alone. This allows us to calibrate $\mathcal{J}$-invariant versions of any parameterised denoising algorithm, from the single hyperparameter of a median filter to the millions of weights of a deep neural network. We demonstrate this on natural image and microscopy data, where we exploit noise independence between pixels, and on single-cell gene expression data, where we exploit independence between detections of individual molecules. This framework generalizes recent work on training neural nets from noisy images and on cross-validation for matrix factorization.

Citations (589)

Summary

  • The paper introduces a novel self-supervised denoising approach that leverages J-invariant functions to predict target dimensions from correlated signals.
  • It broadens applicability by training solely on noisy data, achieving competitive performance across imaging and genomics applications.
  • The framework’s reliance on statistical independence over explicit noise models offers practical advantages in scenarios lacking clean or multiple measurements.

Noise2Self: Blind Denoising by Self-Supervision

The article introduces a novel framework for denoising high-dimensional data without relying on clean data or an explicit noise model. The authors present a self-supervised approach named Noise2Self, which leverages statistical independence assumptions in noise to achieve effective denoising. This approach is particularly innovative as it formulates a denoising task purely on noisy data by exploiting the inherent structure and correlations in the true signal.

Key Contributions

The primary contribution is the introduction of $\mathcal{J}$-invariant functions, which are pivotal in the Noise2Self framework. Given a partition $\mathcal{J}$ of the measurement dimensions, a function is $\mathcal{J}$-invariant if its prediction for each subset $J \in \mathcal{J}$ does not depend on the values of the dimensions in $J$. This property enables self-supervised learning: the denoiser must use correlated dimensions to predict each target dimension, so the independent noise averages out.
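
To make the definition concrete, consider the paper's running example of a "donut" median filter, which predicts each pixel from its neighbors but never from its own value. A minimal NumPy/SciPy sketch (illustrative, not the authors' code):

```python
import numpy as np
from scipy.ndimage import median_filter

def donut_median(noisy, radius=1):
    """J-invariant median filter: each pixel is estimated from the
    median of its neighbors, never from its own noisy value, so the
    filter is J-invariant w.r.t. the partition into single pixels."""
    size = 2 * radius + 1
    footprint = np.ones((size, size), dtype=bool)
    footprint[radius, radius] = False  # exclude the center pixel
    return median_filter(noisy, footprint=footprint)
```

Because the filter never reads the pixel it predicts, the self-supervised loss $\|f(x) - x\|^2$ can be evaluated, and the radius tuned, directly on noisy data.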

  1. Framework Generalization: This self-supervised framework is a significant generalization of previous works like Noise2Noise, expanding applicability to cases where multiple measurements of the same target are not available. This makes it highly versatile for fields like microscopy and genomic data analysis.
  2. Diverse Application Domains: Noise2Self demonstrates effectiveness on various data types, including natural images, microscopy images, and single-cell RNA sequencing data. It adapts to different noise structures by appropriately defining the partition of dimensions.
  3. Training Without Clean Data: The approach obviates the need for clean training data, a substantial advantage in practical applications where obtaining such data is often challenging.

Methodological Insights

The authors propose minimizing a self-supervised loss over $\mathcal{J}$-invariant functions. Notably, they prove that under appropriate statistical independence assumptions, minimizing this loss effectively identifies an optimal denoiser:

  • The self-supervised loss decomposes into the usual supervised loss plus the variance of the noise (see the identity after this list), so minimizing it guides the model toward useful denoising strategies without explicit clean targets.
  • Distinct from many classical denoising techniques requiring detailed noise models, this approach relies on the general statistical property of independence, which is more broadly applicable.
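
Concretely, writing $x$ for the noisy measurement and $y$ for the clean signal, the paper shows that for any $\mathcal{J}$-invariant $f$, with noise that is independent across the partition and unbiased given the signal,

$$\mathbb{E}\,\|f(x) - x\|^2 \;=\; \mathbb{E}\,\|f(x) - y\|^2 \;+\; \mathbb{E}\,\|x - y\|^2.$$

The last term is the noise variance and does not depend on $f$, so the minimizer of the observable self-supervised loss (left-hand side) is also the minimizer of the unobservable supervised loss (first term on the right).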

Results and Implications

Experimental results show that Noise2Self-trained models achieve competitive performance compared to traditional supervised and unsupervised denoising methods. For example, on datasets like ImageNet and CellNet, Noise2Self maintains performance close to methods trained with clean targets. The authors extend the methodology to deep architectures like UNet and DnCNN, integrating the $\mathcal{J}$-invariant framework so the networks learn to denoise without supervised signals; a sketch of the masked-training scheme follows below.
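
For neural networks, the idea is to make an arbitrary architecture effectively $\mathcal{J}$-invariant via masking: pixels in one cell of a grid partition are hidden from the input, and the loss is computed only on those pixels. A minimal PyTorch sketch of one training step (the function name, grid scheme, and zero-replacement are illustrative assumptions, not the authors' exact code):

```python
import torch

def noise2self_step(model, noisy, optimizer, grid=4, phase=0):
    """One self-supervised training step on a batch of noisy images
    of shape (B, C, H, W). Pixels in one phase of a grid partition
    are hidden from the network, which must predict them from their
    neighbors; the loss is evaluated only at those pixels."""
    mask = torch.zeros_like(noisy, dtype=torch.bool)
    mask[..., phase // grid::grid, phase % grid::grid] = True
    # Replace masked pixels with a simple stand-in (zeros here);
    # the paper also considers random or interpolated replacements.
    masked_input = noisy.masked_fill(mask, 0.0)
    pred = model(masked_input)
    loss = ((pred - noisy) ** 2)[mask].mean()  # loss on masked pixels only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Cycling `phase` over the grid cells across iterations ensures every pixel is eventually used as a prediction target.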

Practical and Theoretical Implications

  • Practical Utility: The framework is highly practical for real-world applications in imaging and genomics, where noise is typically substantial and crafting a detailed noise model is often infeasible.
  • Theoretical Impact: The methodology challenges traditional paradigms of reliance on clean datasets for supervised learning, paving the way for more robust and flexible self-supervised frameworks in machine learning.

Future Directions

The paper opens several pathways for future research, especially in optimizing the choice of the partition $\mathcal{J}$ for domain-specific noise structures, potentially improving performance further. Other domains, such as sensor networks and astronomical observations, could also benefit from this framework.

In summary, Noise2Self represents a meaningful advance in the field of noise reduction in high-dimensional data, providing a robust, adaptable solution that bypasses some of the limitations associated with traditional denoising frameworks.
