- The paper introduces CORES², which leverages a confidence-regularized loss to filter noisy labels without requiring noise rate estimation.
- It employs a dynamic sample sieve mechanism that distinguishes clean examples from corrupted ones using adaptive thresholds.
- Experimental results on CIFAR-10, CIFAR-100, and Clothing1M show that the approach significantly improves accuracy under both synthetic and real-world label noise.
Learning with Instance-Dependent Label Noise: A Sample Sieve Approach
The paper "Learning with Instance-Dependent Label Noise: A Sample Sieve Approach" addresses the persistent challenge of label noise in deep neural network (DNN) training, specifically when such noise is instance-dependent. Instance-dependent label noise, where the probability of mislabeling depends on a sample's own features, is particularly difficult because it exacerbates the tendency of DNNs to memorize incorrect labels. Traditional approaches mainly target feature-independent (class-conditional) noise and often require intricate estimates of noise rates. This paper introduces a more adaptable method that circumvents noise-rate estimation while maintaining theoretical rigor.
Synopsis of CORES²
The authors propose CORES² (COnfidence REgularized Sample Sieve), a methodology that progressively sieves out noisy samples to make DNN training robust under instance-dependent label noise. The method adds a confidence regularization term, denoted ℓCR, to the training loss. The regularizer rewards confident predictions by penalizing prediction distributions that hedge toward the noisy label prior, reducing the model's propensity to absorb erroneous label signals.
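As a rough illustration (not the authors' implementation), a confidence-regularized loss of the form ℓ(f(x), ỹ) − β·E over the label prior of ℓ(f(x), Y) can be sketched in plain Python. Cross-entropy is the base loss; the value of β and the uniform prior are illustrative choices, not values from the paper:

```python
import numpy as np

def cross_entropy(probs, label):
    """Cross-entropy of one predicted distribution against a hard label."""
    return -np.log(probs[label])

def confidence_regularized_loss(probs, label, prior, beta=2.0):
    """CE on the (possibly noisy) label minus beta times the expected CE
    over the label prior D_Y; the expectation term plays the role of the
    confidence regularizer l_CR, rewarding low-entropy predictions."""
    base = cross_entropy(probs, label)
    reg = sum(p * cross_entropy(probs, y) for y, p in enumerate(prior))
    return base - beta * reg

# A confident, correct prediction incurs a lower regularized loss
# than a maximally uncertain one.
prior = np.full(3, 1 / 3)
confident = confidence_regularized_loss(np.array([0.9, 0.05, 0.05]), 0, prior)
uncertain = confidence_regularized_loss(np.array([1/3, 1/3, 1/3]), 0, prior)
print(confident < uncertain)  # True
```

This captures the intended effect: hedged, near-uniform predictions are penalized relative to confident ones, so the model is discouraged from fitting the smoothed posterior induced by label noise.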
Theoretical and Methodological Contributions
- Confidence Regularization: The incorporation of ℓCR provides robust theoretical underpinnings: minimizing the confidence-regularized cross-entropy loss on noisy data is shown to align closely with minimizing the loss over the clean distribution. The regularizer draws inspiration from peer loss functions and pushes the model toward confident predictions, counteracting the smoothing effect that label noise has on the learned posterior.
- Dynamic Sample Sieve: CORES² deploys a dynamic sample sieve that filters out corrupted instances based on a threshold, αn, computed from the current state of the model. This separates clean from corrupted examples without requiring prior knowledge or estimation of the noise rate, a significant advantage over existing sample selection methods that rely heavily on noise-rate assumptions.
- Experimental Validation: The paper demonstrates the efficacy of CORES² on CIFAR-10 and CIFAR-100 with synthetic noise, as well as on the real-world Clothing1M dataset, showing consistent accuracy gains across noise settings. In particular, the method maintains high accuracy under instance-dependent noise scenarios that challenge conventional methods.
- Robustness Analysis: Through rigorous derivation and experimentation, the authors establish conditions under which the sample sieve is guaranteed to identify clean samples correctly, assuming bounds on the instance-dependent noise transition matrix (e.g., the requirement that Tii(X) − Tij(X) > 0, i.e., each instance is more likely to keep its true label than to flip to any particular wrong one).
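The sieve step above can be sketched with one simplifying assumption: here the per-sample threshold αn is taken to be the mean cross-entropy over all classes, whereas the paper derives αn from the regularized loss during training, so this is illustrative only:

```python
import numpy as np

def sample_sieve(prob_matrix, noisy_labels):
    """Mark sample n as 'clean' when its loss on the observed label falls
    below a per-sample threshold alpha_n.  alpha_n is simplified here to
    that sample's mean cross-entropy over all classes."""
    n = len(noisy_labels)
    losses = -np.log(prob_matrix[np.arange(n), noisy_labels])
    alpha = (-np.log(prob_matrix)).mean(axis=1)
    return losses < alpha  # boolean mask of samples sieved as clean

# Sample 0's label agrees with a confident prediction (kept);
# sample 1's label contradicts the prediction (sieved out).
probs = np.array([[0.9, 0.05, 0.05],
                  [0.05, 0.9, 0.05]])
labels = np.array([0, 0])
print(sample_sieve(probs, labels))  # [ True False]
```

Because the threshold is recomputed from the model's current predictions each epoch, the sieve adapts as training progresses and needs no externally supplied noise rate.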
Implications and Future Directions
This research offers substantial contributions to the field of machine learning, specifically in refining techniques to handle training data contaminated by instance-dependent noise. CORES² serves as a flexible framework that can incorporate additional robustness strategies, such as semi-supervised learning techniques, further extending its utility in noisy real-world settings.
In future developments, it might be advantageous to explore adaptive noise modeling approaches that dynamically tailor regularization terms within DNN architectures to further mitigate the impact of noisy labels. This could involve enhanced techniques for real-time estimation of noise characteristics or integrating more sophisticated semi-supervised learning mechanisms.
Conclusion
CORES² signifies a pivotal step in learning from noisily labeled data, particularly in complex, real-world scenarios where label noise is intricately tied to sample-specific characteristics. By effectively separating clean from corrupted examples through a progressive, confidence-regularized sieve process, this technique advances the robustness of DNN models against detrimental noise, promising heightened accuracy and generalization in various computational tasks.