Meta Label Correction for Noisy Label Learning
This paper presents the Meta Label Correction (MLC) approach, a novel framework designed to improve the learning process from datasets with noisy labels. Noisy labels, a prevalent issue in machine learning, can stem from various sources such as non-expert annotators or heuristic-based automatic labeling systems. The problem is exacerbated by the growing demand for large-scale datasets necessary to train deep learning models effectively. While existing methods primarily focus on re-weighting instances to mitigate the effects of noise, MLC offers a more refined strategy by aiming to correct noisy labels within a meta-learning framework.
Meta Label Correction Framework
The MLC framework frames label correction as a meta-process within a bi-level optimization problem. The core idea is to deploy a label correction network (LCN) as a meta-model that produces corrected labels from noisy ones; these corrected labels are then used to train the main model. Both models are trained jointly via bi-level optimization, with the LCN's updates guided by a small set of clean (meta) data, so that label correction improves concurrently with main-model training.
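As a minimal sketch of what such a label correction network might look like, the PyTorch module below maps an instance embedding plus its noisy label to a corrected soft label. The class name, dimensions, and the tanh MLP are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelCorrectionNet(nn.Module):
    """Maps (instance embedding, noisy label) to a corrected soft label."""
    def __init__(self, embed_dim: int, num_classes: int, hidden: int = 128):
        super().__init__()
        self.num_classes = num_classes
        self.net = nn.Sequential(
            nn.Linear(embed_dim + num_classes, hidden),
            nn.Tanh(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, h: torch.Tensor, y_noisy: torch.Tensor) -> torch.Tensor:
        # Condition the correction on both the instance embedding and
        # its given (possibly wrong) label.
        y_onehot = F.one_hot(y_noisy, self.num_classes).float()
        logits = self.net(torch.cat([h, y_onehot], dim=-1))
        return F.softmax(logits, dim=-1)  # soft corrected label distribution

lcn = LabelCorrectionNet(embed_dim=64, num_classes=10)
h = torch.randn(32, 64)                # batch of instance embeddings
y_noisy = torch.randint(0, 10, (32,))  # batch of noisy labels
y_corrected = lcn(h, y_noisy)          # shape (32, 10); each row sums to 1
```

Producing a soft distribution rather than a hard label lets the main model be trained with a soft cross-entropy loss, and keeps the whole correction step differentiable for the meta-update.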
- Meta-learning Approach: The use of meta-learning distinguishes MLC from previous methods. Unlike re-weighting techniques, which merely adjust the importance of data instances, MLC actively alters the labels, conditioning each correction on the instance itself rather than assuming a fixed noise-generation process.
- Bi-level Optimization: The process is structured as a bi-level optimization problem: the upper level learns the meta parameters of the label correction network, while the lower level learns the main-model parameters on the corrected labels. This simultaneous learning strategy creates feedback between label correction and model training, making learning more robust to noisy data.
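The two levels described above can be sketched with a one-step "lookahead" update, a common approximation for differentiating through the inner optimization in meta-learning. The linear main model and per-class correction matrix below are deliberate simplifications of the paper's networks; all variable names and hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
num_classes, dim = 3, 5

# Main model: a linear classifier. Meta model (alpha): a per-class
# label-correction matrix, a simplification of a full correction network.
w = torch.zeros(dim, num_classes, requires_grad=True)
alpha = torch.eye(num_classes, requires_grad=True)

x_train = torch.randn(16, dim); y_noisy = torch.randint(0, num_classes, (16,))
x_meta = torch.randn(8, dim);   y_clean = torch.randint(0, num_classes, (8,))
eta, mu = 0.1, 0.1  # inner and outer learning rates (illustrative)

# --- lower level: train the main model on corrected (soft) labels ---
soft_labels = F.softmax(alpha, dim=-1)[y_noisy]        # corrected labels
inner_loss = -(soft_labels * F.log_softmax(x_train @ w, -1)).sum(-1).mean()
(g_w,) = torch.autograd.grad(inner_loss, w, create_graph=True)
w_step = w - eta * g_w                                 # one-step lookahead

# --- upper level: evaluate the lookahead model on clean meta data ---
meta_loss = F.cross_entropy(x_meta @ w_step, y_clean)
(g_alpha,) = torch.autograd.grad(meta_loss, alpha)     # grad flows through w_step
with torch.no_grad():
    alpha -= mu * g_alpha                              # meta update
    w -= eta * g_w                                     # main-model update
```

The key detail is `create_graph=True`: it keeps the inner gradient differentiable, so the meta-loss on clean data can propagate back through the lookahead step into the correction parameters.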
Experimental Evaluation
The authors conducted extensive experiments on CIFAR-10, CIFAR-100, Clothing1M, and several large-scale text datasets, testing different noise levels and types, including real-world noisy labels. Across all experimental setups, MLC consistently outperformed established state-of-the-art methods such as GLC (Gold Loss Correction) and Meta-Weight-Net, highlighting the effectiveness of correcting labels rather than merely re-weighting instances.
- Robustness to Noise: In scenarios with varying noise levels, MLC demonstrated superior accuracy and robustness, particularly in severe noise conditions where traditional re-weighting methods falter.
- Practical Performance: On real-world datasets like Clothing1M, whose labels are automatically derived from the surrounding text of web-crawled images and are therefore noisy, MLC showed significant improvements in classification accuracy, demonstrating its practical applicability.
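For the controlled CIFAR experiments, varying noise levels are typically simulated by corrupting clean labels synthetically. The helper below sketches one common scheme, uniform (symmetric) noise; the function name, rate, and flipping rule are examples of the general recipe, not the paper's exact protocol.

```python
import numpy as np

def inject_uniform_noise(labels, num_classes, noise_rate, seed=0):
    """Flip each label to a different, uniformly chosen class w.p. noise_rate."""
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    flip = rng.random(labels.shape) < noise_rate
    # Adding an offset in 1..num_classes-1 (mod num_classes) guarantees
    # a flipped label never equals the original one.
    offsets = rng.integers(1, num_classes, size=labels.shape)
    labels[flip] = (labels[flip] + offsets[flip]) % num_classes
    return labels

clean = np.arange(10)  # toy CIFAR-10-like labels 0..9
noisy = inject_uniform_noise(clean, num_classes=10, noise_rate=0.4)
```

Asymmetric (class-conditional) noise, where labels flip only between confusable classes, is built analogously with a hand-designed transition matrix instead of uniform offsets.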
Implications and Future Directions
The implications of the MLC approach are significant, providing a pathway for more accurate model training in noisy data environments. By integrating label correction into the meta-learning framework, MLC not only improves the performance of machine learning models but also offers a scalable solution to handle large, noisy datasets effectively.
Future developments in this domain could explore deeper integration of meta-learning techniques with varied machine learning architectures and further optimizations in computational efficiency. Additionally, expanding this framework to accommodate different types of noise generation processes, including adversarial label perturbations, could further enhance its resilience and applicability across diverse fields.
In summary, Meta Label Correction marks a significant advancement in the strategy for learning with noisy labels, paving the way for improved model reliability and accuracy in real-world applications. Its novel utilization of meta-learning for label correction offers a refined approach that addresses the limitations of prior methods, setting a robust foundation for future research and development in noisy label learning.