
Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning (2001.06001v2)

Published 16 Jan 2020 in cs.LG, cs.CV, and stat.ML

Abstract: In this paper we revisit the idea of pseudo-labeling in the context of semi-supervised learning where a learning algorithm has access to a small set of labeled samples and a large set of unlabeled samples. Pseudo-labeling works by applying pseudo-labels to samples in the unlabeled set by using a model trained on the combination of the labeled samples and any previously pseudo-labeled samples, and iteratively repeating this process in a self-training cycle. Current methods seem to have abandoned this approach in favor of consistency regularization methods that train models under a combination of different styles of self-supervised losses on the unlabeled samples and standard supervised losses on the labeled samples. We empirically demonstrate that pseudo-labeling can in fact be competitive with the state-of-the-art, while being more resilient to out-of-distribution samples in the unlabeled set. We identify two key factors that allow pseudo-labeling to achieve such remarkable results (1) applying curriculum learning principles and (2) avoiding concept drift by restarting model parameters before each self-training cycle. We obtain 94.91% accuracy on CIFAR-10 using only 4,000 labeled samples, and 68.87% top-1 accuracy on Imagenet-ILSVRC using only 10% of the labeled samples. The code is available at https://github.com/uvavision/Curriculum-Labeling

Authors (4)
  1. Paola Cascante-Bonilla (17 papers)
  2. Fuwen Tan (10 papers)
  3. Yanjun Qi (68 papers)
  4. Vicente Ordonez (52 papers)
Citations (22)

Summary

Insightful Overview of "Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning"

The paper "Curriculum Labeling: Revisiting Pseudo-Labeling for Semi-Supervised Learning" re-evaluates the concept of pseudo-labeling in the domain of semi-supervised learning (SSL). While pseudo-labeling has received less focus due to the emergence of consistency regularization approaches, the authors provide empirical evidence to demonstrate that pseudo-labeling, when enhanced with curriculum labeling, can offer competitive performance in SSL settings. The paper quantifies these assertions with compelling numerical results, such as achieving 94.91%94.91\% accuracy on CIFAR-10 utilizing merely $4,000$ labeled samples, and 68.87%68.87\% top-1 accuracy on ImageNet ILSVRC with only 10% of labeled data.

Key Methodological Innovations

  1. Curriculum Labeling: The authors present an innovative pseudo-labeling framework termed curriculum labeling, which iteratively and selectively integrates unlabeled data into the training process. During each iteration, the model is trained on labeled data, predicts labels for the unlabeled data, and adds high-confidence predictions to the training dataset in a curriculum-based manner. Their approach adopts principles from curriculum learning, progressively moving from 'easy' to 'hard' samples. This pacing is determined by analyzing the distribution of prediction scores and leveraging Extreme Value Theory (EVT) to set thresholds; a sketch of the resulting training loop appears after this list.
  2. Mitigating Concept Drift: A major concern in pseudo-labeling is concept drift, where the model’s predictions become biased due to the iterative use of the same model for labeling. To tackle this, the authors propose resetting model parameters at the onset of each self-training cycle, which reduces the impact of past errors propagating through iterations.
  3. Data Augmentation: Although pseudo-labeling is central to the method, the paper also examines the benefits of data augmentation techniques such as Mixup and random augmentation, which provide additional variability and strengthen the model's learning capacity.
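
To make the first two components concrete, the following is a minimal NumPy-based sketch of the self-training loop. The hooks `build_model`, `train_fn`, and `predict_fn` and the linear percentile schedule are illustrative assumptions, not the authors' exact implementation; the official code is available at the repository linked in the abstract.

```python
import numpy as np


def curriculum_labeling(x_labeled, y_labeled, x_unlabeled,
                        build_model, train_fn, predict_fn, num_cycles=5):
    """Curriculum-paced self-training with parameter restarts (schematic).

    `build_model` returns a freshly initialized model, `train_fn(model, x, y)`
    trains it in place, and `predict_fn(model, x)` returns an
    (n_samples, n_classes) array of softmax probabilities. These hooks and the
    linear percentile schedule are assumptions made for illustration.
    """
    pseudo_x = np.empty((0,) + x_unlabeled.shape[1:], dtype=x_unlabeled.dtype)
    pseudo_y = np.empty((0,), dtype=y_labeled.dtype)

    model = None
    for cycle in range(num_cycles + 1):
        # (2) Restart: re-initialize parameters before every cycle so that
        # labeling mistakes from earlier cycles do not keep propagating.
        model = build_model()
        train_fn(model,
                 np.concatenate([x_labeled, pseudo_x]),
                 np.concatenate([y_labeled, pseudo_y]))
        if cycle == num_cycles:
            break  # last cycle: all unlabeled data has been incorporated

        # (1) Curriculum pacing: keep only the unlabeled samples whose max
        # softmax score lies in the top fraction for this cycle, moving from
        # "easy" (high-confidence) to "hard" samples as cycles advance.
        probs = predict_fn(model, x_unlabeled)            # (n, n_classes)
        confidence = probs.max(axis=1)
        keep_fraction = (cycle + 1) / num_cycles          # 20%, 40%, ..., 100%
        threshold = np.percentile(confidence, 100 * (1 - keep_fraction))
        selected = confidence >= threshold

        pseudo_x = x_unlabeled[selected]
        pseudo_y = probs[selected].argmax(axis=1).astype(y_labeled.dtype)

    return model
```

The two design choices the paper emphasizes are both visible here: selection is paced by confidence percentiles over the unlabeled set rather than a fixed probability cutoff, and every cycle starts from freshly initialized parameters so that earlier pseudo-labeling errors do not compound.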

Experimentation and Results

The experiments in the paper employ widely used datasets such as CIFAR-10, SVHN, and ImageNet. The authors report robust evaluations, showing that curriculum labeling achieves results comparable to, and on certain benchmarks superior to, state-of-the-art methods such as UDA and FixMatch. Particularly notable is the method's resilience to out-of-distribution samples in the unlabeled set, an advantageous property that follows from its theoretical underpinnings and practical design.

Theoretical and Practical Implications

Theoretically, the analysis in the paper formalizes the optimization objective of pseudo-labeling through a regularized ERM framework, integrating a Bayesian prior for selecting unlabeled samples. This sheds light on how confidence-based selection of unlabeled samples contributes to minimizing the overall loss, grounding the method in a principled optimization perspective.
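
A schematic version of such an objective, written in generic notation rather than the paper's exact formulation, combines the supervised loss on the labeled set $D_L$ with a gated pseudo-label loss on the unlabeled set $D_U$:

$$
\min_{\theta}\; \frac{1}{|D_L|}\sum_{(x_i, y_i)\in D_L} \ell\big(f_\theta(x_i), y_i\big) \;+\; \lambda\,\frac{1}{|D_U|}\sum_{x_j\in D_U} g_j\,\ell\big(f_\theta(x_j), \hat{y}_j\big), \qquad g_j = \mathbb{1}\big[\max_c\, p_\theta(c \mid x_j) \ge \tau_t\big],
$$

where $\hat{y}_j = \arg\max_c p_\theta(c \mid x_j)$ is the pseudo-label and $\tau_t$ is the confidence threshold set by the curriculum at cycle $t$; the gate $g_j$ plays the role of the selection prior described above.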

Practically, the methodology suggests that with proper selection criteria, pseudo-labeling can remain a viable and efficient SSL strategy. The results underline that model initialization strategies hold significant leverage in managing confirmation bias, offering a salient example for other pseudo-labeling endeavors.

Speculation on Future Developments

The insights gained from this paper suggest several avenues for advancing the state of SSL. Future work may explore adaptive curriculum labeling strategies that further refine sample selection criteria based on richer feature embeddings or context-dependent thresholds. Moreover, integrating comprehensive data augmentation frameworks and exploring their interplay with pseudo-labeling may uncover even greater potential for this combination in diverse and challenging domains.

In summary, the paper revisits and revitalizes pseudo-labeling with novel insights and methodological upgrades that not only enhance performance on SSL tasks but also broaden understanding of effective semi-supervised learning techniques. It underscores the importance of adaptive, dynamically structured learning processes as keystone components in machine learning's pursuit of leveraging unlabeled data.
