Unsupervised Label Noise Modeling and Loss Correction (1904.11238v2)

Published 25 Apr 2019 in cs.CV

Abstract: Despite being robust to small amounts of label noise, convolutional neural networks trained with stochastic gradient methods have been shown to easily fit random labels. When there are a mixture of correct and mislabelled targets, networks tend to fit the former before the latter. This suggests using a suitable two-component mixture model as an unsupervised generative model of sample loss values during training to allow online estimation of the probability that a sample is mislabelled. Specifically, we propose a beta mixture to estimate this probability and correct the loss by relying on the network prediction (the so-called bootstrapping loss). We further adapt mixup augmentation to drive our approach a step further. Experiments on CIFAR-10/100 and TinyImageNet demonstrate a robustness to label noise that substantially outperforms recent state-of-the-art. Source code is available at https://git.io/fjsvE

Citations (575)

Summary

  • The paper introduces a beta mixture model that discriminates between clean and noisy labels without needing a pre-cleaned dataset.
  • It proposes a dynamic bootstrapping loss that adjusts, per sample, the weighting between the given label and the model's prediction according to the estimated noise probability.
  • Integration with mixup augmentation significantly boosts model accuracy and robustness, especially under high noise conditions.

Unsupervised Label Noise Modeling and Loss Correction

The paper "Unsupervised Label Noise Modeling and Loss Correction" presents a novel approach to address label noise in datasets, a prevalent issue in training convolutional neural networks (CNNs). The authors propose an unsupervised methodology leveraging a two-component beta mixture model (BMM) to discriminate between clean and noisy samples based on individual loss values. This model allows for online estimation of the probability that a given sample is mislabeled, enabling corrective interventions during the learning process.

Key Contributions

  1. Unsupervised Label Noise Modeling: The paper introduces a BMM to model the distribution of losses for clean and noisy samples, departing from Gaussian mixtures, which fail to capture the skewness of the loss distribution. This unsupervised framework operates without requiring access to a clean subset of data, enhancing its applicability (a fitting sketch follows this list).
  2. Dynamic Bootstrapping Loss: Using the probabilities estimated by the BMM, the paper outlines a dynamic bootstrapping loss that adjusts, for each sample, the reliance on the given label versus the model's prediction according to the estimated noise probability. Unlike static bootstrapping with fixed weights, this method adapts during training (see the loss sketch after this list).
  3. Integration with Mixup Augmentation: Coupled with mixup data augmentation, the proposed method enhances robustness to label noise by linearly interpolating between pairs of data and their labels. This integration advances the state-of-the-art in handling severe label noise scenarios, achieving superior classification accuracy on benchmarks like CIFAR-10/100 and TinyImageNet.
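
As described in the first contribution, the noise posteriors come from fitting a two-component beta mixture to per-sample training losses (rescaled into (0, 1)) with EM, using a weighted method-of-moments M-step. Below is a minimal NumPy/SciPy sketch of such a fit; the function name fit_bmm, the median-split initialization, and the default iteration count are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from scipy.stats import beta as beta_dist

def fit_bmm(losses, n_iters=10, eps=1e-4):
    """Fit a two-component beta mixture to per-sample losses with EM.

    losses: 1-D array of cross-entropy losses rescaled into (0, 1).
    Returns mixture weights, (alpha, beta) per component, and the posterior
    probability that each sample belongs to the high-loss (likely noisy) component.
    """
    x = np.clip(np.asarray(losses, dtype=float), eps, 1 - eps)
    # Initialize responsibilities with a median split: low-loss vs. high-loss.
    resp = np.stack([x < np.median(x), x >= np.median(x)], axis=1).astype(float)
    params = np.ones((2, 2))            # rows: components, cols: (alpha, beta)
    pi = resp.mean(axis=0)

    for _ in range(n_iters):
        # M-step: weighted method-of-moments estimates of alpha and beta.
        for k in range(2):
            w = resp[:, k] / (resp[:, k].sum() + eps)
            mean = np.sum(w * x)
            var = np.sum(w * (x - mean) ** 2) + eps
            common = max(mean * (1 - mean) / var - 1, eps)
            params[k] = (mean * common, (1 - mean) * common)
        pi = resp.mean(axis=0)

        # E-step: posterior responsibility of each component for each sample.
        dens = np.stack(
            [pi[k] * beta_dist.pdf(x, params[k, 0], params[k, 1]) for k in range(2)],
            axis=1)
        resp = dens / (dens.sum(axis=1, keepdims=True) + eps)

    noisy = int(np.argmax(params[:, 0] / params.sum(axis=1)))  # higher-mean component
    return pi, params, resp[:, noisy]    # last entry: P(noisy | loss) per sample
```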
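
The dynamic bootstrapping loss then mixes, per sample, the cross-entropy against the given label with the cross-entropy against the network's own prediction, weighted by the BMM posterior. A hedged PyTorch sketch of the hard-bootstrapping variant follows; dynamic_bootstrap_loss and its argument names are assumed for illustration, and w is the per-sample noise posterior produced by a fit such as the one above.

```python
import torch
import torch.nn.functional as F

def dynamic_bootstrap_loss(logits, targets, w):
    """Dynamic hard-bootstrapping cross-entropy.

    logits:  (N, C) raw network outputs.
    targets: (N,) integer labels from the (possibly noisy) dataset.
    w:       (N,) P(sample is noisy) from the beta mixture; w -> 0 recovers
             plain cross-entropy, w -> 1 trusts the network's prediction.
    """
    log_probs = F.log_softmax(logits, dim=1)
    pred = logits.argmax(dim=1)                            # hard pseudo-label
    ce_label = F.nll_loss(log_probs, targets, reduction="none")
    ce_pred = F.nll_loss(log_probs, pred, reduction="none")
    return ((1.0 - w) * ce_label + w * ce_pred).mean()
```

In practice the posteriors w would be refreshed once per epoch by re-fitting the mixture to the tracked per-sample losses.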

Analysis and Results

The empirical analysis demonstrates that the dynamic bootstrapping approach significantly reduces the tendency of CNNs to overfit noisy labels. The method yields large margins over the baseline cross-entropy loss and static bootstrapping, particularly in high-noise settings such as 80% label noise on CIFAR-10, where it reports substantial gains in both best and final-epoch accuracy.

Harnessing mixup alongside the BMM-driven bootstrapping yields even more pronounced improvements under challenging conditions. For instance, dynamic bootstrapping with mixup remains robust at extreme noise levels, maintaining convergence and reliable accuracy across the full range of noise rates evaluated.
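
As a rough illustration of how the loss correction can be combined with mixup, the sketch below interpolates input pairs with a Beta-distributed coefficient and applies the bootstrapped loss to both sets of labels. It reuses the dynamic_bootstrap_loss sketch from above; mixup_bootstrap_step and the default alpha are assumptions rather than the paper's exact training recipe.

```python
import numpy as np
import torch

def mixup_bootstrap_step(model, x, y, w, alpha=1.0):
    """One forward pass combining mixup with the dynamic bootstrapping loss.

    x: (N, ...) input batch; y: (N,) integer labels; w: (N,) BMM noise
    posteriors; alpha: mixup Beta(alpha, alpha) hyperparameter (tunable).
    """
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0), device=x.device)
    x_mix = lam * x + (1.0 - lam) * x[perm]        # interpolate inputs only
    logits = model(x_mix)
    # Weight the bootstrapped loss of both label sets by the mixing coefficient,
    # reusing dynamic_bootstrap_loss from the sketch above.
    return lam * dynamic_bootstrap_loss(logits, y, w) + \
        (1.0 - lam) * dynamic_bootstrap_loss(logits, y[perm], w[perm])
```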

Theoretical and Practical Implications

Theoretically, the paper provides insights into the potential of unsupervised mixture models in understanding and leveraging noise characteristics for robust learning. Practically, the BMM framework offers a flexible solution that can adapt to various datasets and architectures without necessitating parameter tuning or access to a noise-free subset.

Future Directions

Future exploration could involve extending the technique to forms of noise beyond the closed-set scenario, such as the open-set or structured noise typical of larger, real-world datasets. The paper also opens avenues for improving the efficiency of the EM estimation in dynamic settings and for extending the approach to non-visual domains where label noise is prevalent.

Conclusion

Overall, the unsupervised label noise approach and corrective methodologies presented in this paper represent a significant advancement in robust machine learning under label corruption. By achieving high accuracy and stability over various configurations and datasets, this research establishes a new benchmark for addressing noisy labels in deep learning models.