Combating Label Noise in Deep Learning Using Abstention (1905.10964v2)

Published 27 May 2019 in stat.ML and cs.LG

Abstract: We introduce a novel method to combat label noise when training deep neural networks for classification. We propose a loss function that permits abstention during training thereby allowing the DNN to abstain on confusing samples while continuing to learn and improve classification performance on the non-abstained samples. We show how such a deep abstaining classifier (DAC) can be used for robust learning in the presence of different types of label noise. In the case of structured or systematic label noise -- where noisy training labels or confusing examples are correlated with underlying features of the data -- training with abstention enables representation learning for features that are associated with unreliable labels. In the case of unstructured (arbitrary) label noise, abstention during training enables the DAC to be used as an effective data cleaner by identifying samples that are likely to have label noise. We provide analytical results on the loss function behavior that enable dynamic adaption of abstention rates based on learning progress during training. We demonstrate the utility of the deep abstaining classifier for various image classification tasks under different types of label noise; in the case of arbitrary label noise, we show significant improvements over previously published results on multiple image benchmarks. Source code is available at https://github.com/thulas/dac-label-noise

Citations (171)

Summary

  • The paper introduces a novel abstention loss function that lets classifiers withhold judgment on ambiguous samples to mitigate label noise.
  • It employs an extra abstain label during training and inference, enabling the model to isolate and handle systematic noise.
  • Empirical results on CIFAR-10, CIFAR-100, and Fashion-MNIST show the Deep Abstaining Classifier significantly outperforms standard models in noisy settings.

Combating Label Noise in Deep Learning Using Abstention

The paper "Combating Label Noise in Deep Learning Using Abstention" presents a novel mechanism to address the systemic problem of label noise in deep neural networks (DNNs) through a strategy termed abstention. At its core, the paper introduces an abstention-based approach that allows a classifier to abstain from making predictions on samples where label noise may lead to uncertainty, thereby enhancing the classifier's performance on cleaner samples.

The central innovation of this work is an abstention loss function that lets the network withhold classification when it encounters ambiguous or noisy data. Abstention operates during both training and inference, in contrast to traditional methods that apply abstention only as a post-processing step. The mechanism is particularly potent under structured or systematic noise, since it enables the DNN to recognize and learn the features linked to unreliable training labels. This is achieved through an explicit "abstain" label added to the usual set of output classes, so that the network learns to defer judgment rather than risk a misclassification driven by inaccurate labels.
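Concretely, the loss combines a cross-entropy term over the k real classes, renormalized to exclude the abstention mass, with a penalty for abstaining weighted by a coefficient alpha. A minimal PyTorch sketch of such a loss follows; the function name, variable names, and the numerical-stability clamp are our own, so treat this as an illustration rather than the reference implementation in the linked repository:

```python
import torch
import torch.nn.functional as F

def dac_loss(logits, targets, alpha):
    """Abstention loss sketch: (1 - p_a) * CE over the k real classes
    (renormalized to exclude abstention) + alpha * log(1 / (1 - p_a)),
    where p_a is the softmax mass on the extra abstention class, taken
    here to be the last logit."""
    log_probs = F.log_softmax(logits, dim=1)               # (batch, k + 1)
    p_abstain = log_probs[:, -1].exp().clamp(max=1 - 1e-7)
    log_not_abstain = torch.log1p(-p_abstain)              # log(1 - p_a)
    # Renormalized log-probability of the true class among real classes
    log_p_true = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    ce_term = -(log_p_true - log_not_abstain)
    # Abstention penalty; a larger alpha discourages abstaining
    abstain_term = -alpha * log_not_abstain
    return ((1 - p_abstain) * ce_term + abstain_term).mean()
```

The paper's analytical results concern how alpha can be adapted dynamically as learning progresses (permissive early on, stricter later); the fixed alpha above is a simplification.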

The paper provides thorough theoretical analysis and empirical evidence supporting the efficacy of the proposed method across types of label noise, structured and unstructured alike. In structured noise cases, where errors correlate with inherent data features, the Deep Abstaining Classifier (DAC) isolates these patterns through its loss dynamics, learning representations for the features associated with unreliable labels rather than memorizing the noisy labels themselves. For arbitrary label noise, the DAC serves as a robust data cleaner, flagging samples with probable label corruption, thereby improving downstream models trained on the cleaned dataset.
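The data-cleaning use amounts to training a DAC on the noisy set and discarding whatever it abstains on. A hypothetical helper along these lines (the released code at https://github.com/thulas/dac-label-noise may organize this differently):

```python
import torch

@torch.no_grad()
def flag_noisy_samples(model, loader, abstain_idx):
    """Return indices of samples a trained DAC abstains on, i.e. the
    likely mislabeled ones. Assumes `loader` iterates in a fixed order
    (shuffle=False) so indices line up with the underlying dataset."""
    model.eval()
    flagged, offset = [], 0
    for x, _ in loader:
        preds = model(x).argmax(dim=1).tolist()
        flagged += [offset + i for i, p in enumerate(preds) if p == abstain_idx]
        offset += x.size(0)
    return flagged  # drop these before retraining a standard classifier
```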

Noteworthy experimental results are reported across multiple benchmarks, including CIFAR-10, CIFAR-100, and Fashion-MNIST. The DAC consistently outperforms existing methods, particularly under high label noise, improving accuracy by significant margins and in some cases approaching the "oracle" performance attainable with perfect a priori knowledge of which labels are corrupted.

The implications of this research extend both to theoretical advancements in loss function design for classification tasks and practical applications in data-intensive fields. It positions the DAC as an attractive option in large-scale machine learning deployments where the quality and reliability of labels cannot always be guaranteed. While the paper does not address adversarial settings, the theoretical underpinnings suggest that abstention could potentially strengthen defenses against adversarial perturbations — an avenue ripe for future exploration.

Moving forward, this abstention mechanism could be integrated with other model architectures and domains, suggesting applicability well beyond image classification. The simplicity of the DAC's design, which requires only a modified loss function and one extra output unit, makes it straightforward to incorporate into existing DNN frameworks to bolster robustness against label noise. As the AI community continues to grapple with messy real-world data, strategies like abstention may become pivotal in reinforcing the robustness and reliability of deep learning models.
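To make the "only the loss changes" point concrete, here is a minimal drop-in sketch reusing the hypothetical dac_loss above and assuming an existing loader of (image, label) batches:

```python
import torch
import torch.nn as nn

num_classes = 10
# Widen the output layer by one unit for the abstain class.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes + 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

for x, y in loader:                          # otherwise a standard loop
    optimizer.zero_grad()
    loss = dac_loss(model(x), y, alpha=1.0)  # in place of cross-entropy
    loss.backward()
    optimizer.step()
```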
