
Learning to Learn from Noisy Labeled Data (1812.05214v2)

Published 13 Dec 2018 in cs.LG, cs.CV, and stat.ML

Abstract: Despite the success of deep neural networks (DNNs) in image classification tasks, the human-level performance relies on massive training data with high-quality manual annotations, which are expensive and time-consuming to collect. There exist many inexpensive data sources on the web, but they tend to contain inaccurate labels. Training on noisy labeled datasets causes performance degradation because DNNs can easily overfit to the label noise. To overcome this problem, we propose a noise-tolerant training algorithm, where a meta-learning update is performed prior to conventional gradient update. The proposed meta-learning method simulates actual training by generating synthetic noisy labels, and train the model such that after one gradient update using each set of synthetic noisy labels, the model does not overfit to the specific noise. We conduct extensive experiments on the noisy CIFAR-10 dataset and the Clothing1M dataset. The results demonstrate the advantageous performance of the proposed method compared to several state-of-the-art baselines.

Authors (4)
  1. Junnan Li (56 papers)
  2. Yongkang Wong (38 papers)
  3. Qi Zhao (181 papers)
  4. Mohan Kankanhalli (117 papers)
Citations (320)

Summary

Learning to Learn from Noisy Labeled Data

The paper "Learning to Learn from Noisy Labeled Data" addresses a pertinent challenge in machine learning: training deep neural networks (DNNs) with datasets that have noisy labels. This issue is particularly significant given the ubiquity of readily available, yet noisily labeled data from sources like social media and other web-based repositories. The crux of the problem lies in the tendency of DNNs to overfit on such noise, leading to degraded performance in classification tasks.

The authors propose a novel approach that leverages meta-learning to enhance the noise tolerance of models during training. The primary innovation is a meta-learning based noise-tolerant training (MLNT) algorithm, which performs a meta-learning update before the conventional gradient update. This meta-learning strategy yields model parameters that are less prone to overfitting label noise. The method hinges on simulating the actual training process with synthetic noisy labels and optimizing the model so that it remains robust to this noise.
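
A minimal PyTorch sketch of this meta-update, assuming a toy classifier, a single set of synthetic noisy labels per batch, and illustrative learning rates; the helper names (`make_synthetic_noisy_labels`, `meta_update`) and hyper-parameters are ours, not the authors' code:

```python
# Sketch of one MLNT-style meta-update (illustrative, not the authors' implementation).
# Assumes PyTorch >= 2.0 for torch.func.functional_call.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

num_classes, inner_lr, meta_lr = 10, 0.2, 0.02
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))
meta_optimizer = torch.optim.SGD(model.parameters(), lr=meta_lr)

def make_synthetic_noisy_labels(labels, noise_ratio=0.3):
    """Randomly flip a fraction of the labels to simulate label noise."""
    noisy = labels.clone()
    flip = torch.rand_like(labels, dtype=torch.float) < noise_ratio
    noisy[flip] = torch.randint(0, num_classes, (int(flip.sum()),))
    return noisy

def meta_update(images, labels, teacher_probs):
    """One meta-learning step: adapt on synthetic noisy labels, then require the
    adapted model to stay consistent with the teacher's (noise-free) predictions."""
    params = {n: p for n, p in model.named_parameters()}
    synthetic_labels = make_synthetic_noisy_labels(labels)

    # Inner step: one gradient update on the synthetic noisy labels.
    logits = functional_call(model, params, (images,))
    inner_loss = F.cross_entropy(logits, synthetic_labels)
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    fast_params = {n: p - inner_lr * g for (n, p), g in zip(params.items(), grads)}

    # Meta-objective: the adapted model should not have drifted toward the noise,
    # measured as KL divergence from the teacher's predictions.
    adapted_logits = functional_call(model, fast_params, (images,))
    meta_loss = F.kl_div(F.log_softmax(adapted_logits, dim=1), teacher_probs,
                         reduction="batchmean")

    meta_optimizer.zero_grad()
    meta_loss.backward()   # gradient flows through the inner update back to the model
    meta_optimizer.step()
```

In the full algorithm, several sets of synthetic noisy labels are sampled per mini-batch, the resulting meta-losses are combined, and the meta-update is followed by the conventional gradient step on the original (noisy) labels.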

The implementation of the proposed method is detailed meticulously, with a meta-objective that enforces consistency between the predictions of the model after one update on synthetic noisy labels and the predictions of a teacher model that is unaffected by that noise. The teacher employs a self-ensembling strategy, using an exponential moving average of the student model's parameters to generate reliable predictions.
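
A hedged sketch of such a self-ensembling teacher, assuming a toy student network and an illustrative EMA decay value:

```python
# Sketch of a self-ensembling teacher: an exponential moving average (EMA) of
# the student's parameters (network and decay value are illustrative).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)   # the teacher is never trained by backpropagation

@torch.no_grad()
def update_teacher(student, teacher, decay=0.999):
    """After each student update, move the teacher's weights toward the student's."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

def consistency_targets(images):
    """The teacher's softened predictions, used as the consistency target
    in the meta-objective."""
    with torch.no_grad():
        return F.softmax(teacher(images), dim=1)
```

Because the teacher is an average of past student states, its predictions are more stable than the student's and are not perturbed by the synthetic noise injected during the meta-update.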

Through extensive experimentation on the CIFAR-10 dataset (with both symmetric and asymmetric label corruption) and the Clothing1M dataset, which features real-world noisy labels, the proposed MLNT method demonstrates its robustness and effectiveness. The results indicate a substantial improvement in classification accuracy over established baselines. For instance, the paper reports consistent gains in test accuracy on CIFAR-10 across a range of noise conditions, substantiating the efficacy of the MLNT approach.

The authors also conduct a comprehensive ablation study to assess the model's sensitivity to various hyper-parameters, providing insight into the adaptability and stability of their approach. Notably, even without any prior knowledge of the noise characteristics, the model achieves strong performance, outperforming methods that leverage such information.

Furthermore, the paper presents an iterative training scheme that progressively refines the data and the model's predictions. In each iteration, a mentor model is used to clean the dataset, further improving the efficacy of the proposed method.
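
A plausible sketch of one such refinement step, assuming the mentor relabels samples on which it is confident; the confidence threshold, relabeling rule, and helper names are illustrative assumptions, not the paper's exact procedure:

```python
# Illustrative sketch of an iterative data-refinement step driven by a mentor
# model; threshold and relabeling rule are assumptions, not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

def refine_labels(mentor, images, labels, confidence=0.8):
    """Replace a sample's label with the mentor's prediction when the mentor is
    confident and disagrees with the given (possibly noisy) label."""
    with torch.no_grad():
        probs = F.softmax(mentor(images), dim=1)
        conf, preds = probs.max(dim=1)
    refined = labels.clone()
    replace = (conf >= confidence) & (preds != labels)
    refined[replace] = preds[replace]
    return refined

# Example usage with a toy mentor on CIFAR-10-shaped inputs.
mentor = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
cleaned_labels = refine_labels(mentor, images, labels)
```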

In terms of implications, this work contributes significantly to the domain by offering a model-agnostic approach that is applicable beyond the specifics of image classification to potentially other domains facing noisy data challenges, such as natural language processing or time series analysis.

Future research might extend this methodology towards other model architectures and investigate novel combinations of meta-learning tasks to further enhance noise resilience. As the evolution of large-scale datasets continues, approaches such as these that maintain performance amidst label imperfections will become increasingly vital to the development of robust AI systems.