- The paper introduces a transductive approach that combines label propagation with deep network training to generate pseudo-labels for unlabeled data.
- It constructs a nearest neighbor graph from network embeddings to diffuse label information, balancing labeled and pseudo-labeled data.
- Experiments on CIFAR-10, CIFAR-100, and Mini-ImageNet show improved performance in low-label regimes compared to previous semi-supervised methods.
Label Propagation for Deep Semi-supervised Learning
The paper presents a method for deep semi-supervised learning (SSL) that leverages label propagation under the manifold assumption. The method exploits both labeled and unlabeled data when training deep neural networks, addressing the high cost of manual label annotation. The approach is proposed as complementary to existing state-of-the-art semi-supervised methods.
Method Overview
The paper introduces a transductive learning approach using label propagation to infer pseudo-labels for unlabeled data. A nearest neighbor graph constructed from the network's embeddings forms the basis of the method. The process iterates between constructing this graph and updating the network with pseudo-labeled data, balancing labeled and pseudo-labeled examples through a weighted loss function.
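To make the alternation concrete, here is a minimal runnable sketch using scikit-learn stand-ins: `LabelSpreading` plays the role of the graph-based diffusion step and `LogisticRegression` the role of the deep network. The toy data, hyperparameters, and loop structure are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Toy data: three Gaussian blobs in 2D, with only 10 labeled examples.
X = np.vstack([rng.normal(loc, 0.6, size=(100, 2)) for loc in (0.0, 2.5, 5.0)])
y_true = np.repeat([0, 1, 2], 100)
y = np.full(len(X), -1)                          # -1 marks "unlabeled"
labeled = rng.choice(len(X), size=10, replace=False)
y[labeled] = y_true[labeled]

for epoch in range(3):
    # 1) Diffuse labels over a kNN graph of the feature space. (In the paper
    #    the graph is rebuilt each epoch from the network's evolving
    #    embeddings; these toy features are fixed.)
    lp = LabelSpreading(kernel="knn", n_neighbors=10).fit(X, y)
    pseudo = lp.transduction_                    # a pseudo-label per example

    # 2) Retrain the classifier on labeled + pseudo-labeled data.
    clf = LogisticRegression().fit(X, pseudo)

print("accuracy vs. ground truth:", clf.score(X, y_true))
```

In the actual method, step 2 is an epoch of network training under the weighted loss described below.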
Technical Details
- Nearest Neighbor Graph: The foundation of the method; network embeddings connect each example to its most similar neighbors, yielding a sparse affinity matrix that respects the manifold structure of the data (see the first sketch after this list).
- Label Propagation: Pseudo-labels are obtained through a diffusion process: a linear system involving the normalized affinity matrix is solved efficiently with conjugate gradient. This is shown to outperform pseudo-labeling from the network's own predictions.
- Weighted Loss Function: Training minimizes a weighted combination of losses on labeled and pseudo-labeled data. Per-example weights reflect pseudo-label certainty (via the entropy of the diffusion scores) and class balancing, improving robustness to noise in the pseudo-labels (see the second sketch below).
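The graph construction and diffusion steps above can be sketched directly with scipy and scikit-learn. This is a simplified illustration, assuming random stand-in embeddings, binary kNN connectivity in place of weighted similarities, and illustrative values for `k` and `alpha`:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n, d, num_classes, k, alpha = 500, 32, 3, 10, 0.99

X = rng.normal(size=(n, d))                       # stand-in embeddings
y = rng.integers(num_classes, size=n)             # ground truth (used on 30 points)
labeled = rng.choice(n, size=30, replace=False)

# Sparse kNN affinity: on L2-normalized embeddings, Euclidean kNN matches
# cosine-similarity kNN.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
A = kneighbors_graph(Xn, k, mode="connectivity", include_self=False)
W = A.maximum(A.T)                                # symmetrize the graph
deg = np.asarray(W.sum(axis=1)).ravel()
D_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
S = D_inv_sqrt @ W @ D_inv_sqrt                   # normalized affinity matrix

# Diffusion: solve (I - alpha * S) Z = Y per class with conjugate gradient.
Y = np.zeros((n, num_classes))
Y[labeled, y[labeled]] = 1.0                      # one-hot rows for labeled points
system = sp.eye(n) - alpha * S
Z = np.column_stack([cg(system, Y[:, c], atol=1e-6)[0]
                     for c in range(num_classes)])

pseudo = Z.argmax(axis=1)                         # hard pseudo-labels
pseudo[labeled] = y[labeled]                      # labeled points keep true labels
```

Since the eigenvalues of the normalized affinity lie in [-1, 1], the system matrix is positive definite for `alpha < 1`, so conjugate gradient converges quickly on this sparse system.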
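The weighting scheme can be sketched as follows, assuming entropy-based certainty and inverse-frequency class balancing as described above; `Z` and `pseudo` stand in for the diffusion outputs of the previous sketch, and the paper's exact normalization may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_classes = 500, 3
Z = rng.random((n, num_classes))                  # stand-in diffusion scores
pseudo = Z.argmax(axis=1)                         # stand-in pseudo-labels

# Certainty weight: 1 minus the normalized entropy of each score distribution.
P = np.clip(Z, 1e-12, None)
P = P / P.sum(axis=1, keepdims=True)
entropy = -(P * np.log(P)).sum(axis=1)
certainty = 1.0 - entropy / np.log(num_classes)   # in [0, 1]

# Class-balance weight: down-weight over-represented pseudo-classes.
counts = np.bincount(pseudo, minlength=num_classes).astype(float)
weights = certainty / np.maximum(counts[pseudo], 1.0)

def weighted_ce(logits, targets, weights):
    """Per-example weighted cross-entropy with a stable log-softmax."""
    m = logits.max(axis=1, keepdims=True)
    logp = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    return -(weights * logp[np.arange(len(targets)), targets]).mean()

logits = rng.normal(size=(n, num_classes))        # stand-in network outputs
print("weighted loss:", weighted_ce(logits, pseudo, weights))
```

Confident (low-entropy) pseudo-labels contribute more to the loss, while over-represented pseudo-classes are down-weighted, countering both label noise and class imbalance introduced by propagation.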
Performance and Evaluation
The method is evaluated on standard benchmarks: CIFAR-10, CIFAR-100, and a newly proposed semi-supervised setup for Mini-ImageNet. Results show improvements, especially when few labeled examples are available, over methods based solely on unsupervised consistency losses, such as Mean Teacher. Combining the proposed method with such methods yields further gains, underscoring their complementarity.
Implications
The research suggests a promising direction for SSL: efficiently exploiting unlabeled data by combining graph-based transductive learning with the inductive training of deep networks, which reduces reliance on large amounts of labeled data. Particularly notable is the embedding of a transductive step inside an inductive training loop, offering new insight into how deep models can be trained.
Future Directions
Potential areas for exploration include further optimizing the graph construction process and extending the method to domains beyond image classification. Investigating the scalability of the approach to larger datasets and its adaptability to different model architectures would also deepen understanding of SSL in practice.
Overall, this work contributes a nuanced perspective to SSL, offering practical and theoretical advancements that could influence future research trajectories in automated learning systems.