A Formal Overview of "Learning to Self-Train for Semi-Supervised Few-Shot Classification"
The paper "Learning to Self-Train for Semi-Supervised Few-Shot Classification" introduces an advanced semi-supervised meta-learning framework designed to enhance the efficacy of few-shot classification tasks using scarce labeled data. This research is situated in the challenging domain of Few-Shot Classification (FSC), which is further constrained by the limited availability of labeled instances for model training. The proposed method, termed Learning to Self-Train (LST), effectively integrates unlabeled data into the few-shot learning paradigm, thereby optimizing the task adaptation process. This is achieved by leveraging a meta-learning strategy that self-trains models to judiciously select and pseudo-label unsupervised data.
Methodological Contributions
- Self-Training via Meta-Learning: The LST framework employs a meta-learning approach that trains a model over numerous semi-supervised few-shot tasks. These tasks teach the system to predict pseudo labels for unlabeled data and to iteratively self-train on the resulting pseudo-labeled sets. A key innovation is the incorporation of a meta-learned self-training procedure into the gradient descent loop, which helps counter the label noise typical of self-training (a minimal sketch of the pseudo-labeling step appears after this list).
- Soft Weighting Network (SWN): A novel Soft Weighting Network (SWN) assigns weights to pseudo-labeled examples. It refines the self-training process by letting high-quality pseudo labels exert more influence during gradient descent optimization, thereby mitigating the negative impact of noisy labels (see the SWN sketch after this list).
- Iterative Fine-tuning: Models are fine-tuned using both labeled and pseudo-labeled data, followed by a refinement phase that uses only labeled data. Re-anchoring the model to ground-truth labels in this way prevents model drift, since errors in pseudo labels cannot accumulate unchecked across iterations (see the fine-tuning sketch after this list).
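To make the pipeline concrete, the three sketches below illustrate the inner loop in PyTorch-style Python. They are minimal illustrations under assumed interfaces (a classifier `model` returning logits, a feature extractor `model.features`), not the authors' implementation. First, the pseudo-labeling step: the current model predicts labels for the unlabeled pool and keeps only the most confident examples per predicted class, which is the "judicious selection" described above.

```python
import torch
import torch.nn.functional as F

def pseudo_label(model, unlabeled_x, num_per_class, num_classes):
    """Predict pseudo labels, then keep only the most confident
    examples per predicted class (hard selection)."""
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_x), dim=1)   # [U, C] class probabilities
    conf, labels = probs.max(dim=1)                    # per-example confidence and label
    keep = []
    for c in range(num_classes):
        idx = (labels == c).nonzero(as_tuple=True)[0]  # candidates predicted as class c
        order = conf[idx].argsort(descending=True)
        keep.append(idx[order[:num_per_class]])        # top-confidence subset
    keep = torch.cat(keep)
    return unlabeled_x[keep], labels[keep]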
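Next, a toy stand-in for the Soft Weighting Network. The real SWN scores each pseudo-labeled example using class-level information learned during meta-training; the small MLP over the example's embedding used here is an assumption made for brevity, as is the weighted cross-entropy that applies the scores.

```python
import torch.nn as nn
import torch.nn.functional as F

class SoftWeightingNet(nn.Module):
    """Toy stand-in for the SWN: maps an example's feature embedding to a
    soft weight in (0, 1). This simplified architecture is an assumption,
    not the paper's exact design."""
    def __init__(self, feat_dim):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, feats):
        return self.score(feats).squeeze(-1)  # one weight per example

def weighted_pseudo_loss(logits, pseudo_labels, weights):
    """Cross-entropy in which high-weight (high-quality) pseudo labels
    contribute more to the gradient."""
    per_example = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (weights * per_example).mean()
```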
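Finally, the iterative fine-tuning loop, reusing the two helpers above. Meta-level updates to the SWN and to the model initialization (the outer loop) are omitted, and the schedule and hyperparameters are placeholders rather than the paper's settings.

```python
import torch.nn.functional as F

def lst_inner_loop(model, swn, labeled, unlabeled_x, optimizer,
                   num_steps=10, refine_every=2):
    """Alternate fine-tuning on labeled plus weighted pseudo-labeled data
    with refinement steps that use labeled data only."""
    x_l, y_l = labeled
    for step in range(num_steps):
        # Re-label the unlabeled pool with the current model (recursive self-training).
        x_p, y_p = pseudo_label(model, unlabeled_x, num_per_class=5, num_classes=5)
        w = swn(model.features(x_p))  # assumes `model.features` exposes embeddings
        loss = (F.cross_entropy(model(x_l), y_l)
                + weighted_pseudo_loss(model(x_p), y_p, w))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        if (step + 1) % refine_every == 0:  # refine on ground-truth labels only
            loss = F.cross_entropy(model(x_l), y_l)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```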
Empirical Evaluation
The efficacy of the LST approach is demonstrated through extensive experiments on the miniImageNet and tieredImageNet benchmarks, both standard in the few-shot learning literature.
- Performance Improvement:
The LST framework yields substantial gains in classification accuracy over state-of-the-art FSC and semi-supervised few-shot classification (SSFSC) methods, reaching 70.1% accuracy for 1-shot and 78.7% for 5-shot tasks on miniImageNet.
- Robustness Against Distractors:
The approach exhibits a degree of robustness against distractors, i.e., unlabeled samples drawn from classes outside the episode's label set, maintaining competitive performance even when such distracting classes are mixed into the unlabeled pool. The recursive self-training mechanism is highlighted as particularly effective at exploiting unlabeled data while curbing noise propagation.
Implications and Future Directions
The proposed LST framework's implications span both theoretical advances in meta-learning methodology and practical gains for machine learning systems operating in semi-supervised scenarios. The intersection of self-training dynamics with meta-learning presents fertile ground for further exploration, particularly around dynamically adjusting to label uncertainty.
Future research could refine the balance between labeled and pseudo-labeled data, and explore more sophisticated networks that integrate domain adaptation capabilities, further enhancing adaptability and performance across varied semi-supervised environments. There is also significant potential in extending these methodologies to other challenging settings, including imbalanced or highly diverse data distributions.
In sum, the research presents a significant step forward in the direction of integrating unlabeled data into few-shot learning frameworks through the lens of meta-learning, thus enhancing the capacity and flexibility of learning systems in data-limited settings.