- The paper introduces a transductive approach that combines label propagation with deep network training to generate pseudo-labels for unlabeled data.
- It constructs a nearest neighbor graph from network embeddings to diffuse label information, balancing labeled and pseudo-labeled data.
- Experiments on CIFAR-10, CIFAR-100, and Mini-ImageNet show improved performance in low-label regimes compared to previous semi-supervised methods.
Label Propagation for Deep Semi-supervised Learning
The paper presents a method for deep semi-supervised learning (SSL) that leverages label propagation under the manifold assumption. The method exploits both labeled and unlabeled data when training deep neural networks, addressing the high cost of manual label annotation. The approach is proposed as complementary to existing state-of-the-art semi-supervised methods.
Method Overview
The paper introduces a transductive learning approach using label propagation to infer pseudo-labels for unlabeled data. A nearest neighbor graph constructed from the network's embeddings forms the basis of the method. The process iterates between constructing this graph and updating the network with pseudo-labeled data, balancing labeled and pseudo-labeled examples through a weighted loss function.
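To make the alternation concrete, here is a minimal runnable sketch using scikit-learn stand-ins: `LabelSpreading` plays the role of the graph-based diffusion step and `LogisticRegression` the role of the deep network. The toy data, hyperparameters, and loop structure are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Toy data: three Gaussian blobs in 2D, with only 10 labeled examples.
X = np.vstack([rng.normal(loc, 0.6, size=(100, 2)) for loc in (0.0, 2.5, 5.0)])
y_true = np.repeat([0, 1, 2], 100)
y = np.full(len(X), -1)                          # -1 marks "unlabeled"
labeled = rng.choice(len(X), size=10, replace=False)
y[labeled] = y_true[labeled]

for epoch in range(3):
    # 1) Diffuse labels over a kNN graph of the feature space. (In the paper
    #    the graph is rebuilt each epoch from the network's evolving
    #    embeddings; these toy features are fixed.)
    lp = LabelSpreading(kernel="knn", n_neighbors=10).fit(X, y)
    pseudo = lp.transduction_                    # a pseudo-label per example

    # 2) Retrain the classifier on labeled + pseudo-labeled data.
    clf = LogisticRegression().fit(X, pseudo)

print("accuracy vs. ground truth:", clf.score(X, y_true))
```

In the actual method, step 2 is an epoch of network training under the weighted loss described below.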
Technical Details
- Nearest Neighbor Graph: The foundation of the method; network embeddings connect each example to its most similar neighbors, yielding a sparse affinity matrix that respects the manifold structure of the data (see the first sketch after this list).
- Label Propagation: Pseudo-labels are obtained through a diffusion process: a linear system involving the normalized affinity matrix is solved efficiently with conjugate gradient. This is shown to outperform pseudo-labeling from the network's own predictions.
- Weighted Loss Function: Training minimizes a weighted combination of losses on labeled and pseudo-labeled data. Per-example weights reflect pseudo-label certainty (via the entropy of the diffusion scores) and class balancing, improving robustness to noise in the pseudo-labels (see the second sketch below).
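The graph construction and diffusion steps above can be sketched directly with scipy and scikit-learn. This is a simplified illustration, assuming random stand-in embeddings, binary kNN connectivity in place of weighted similarities, and illustrative values for `k` and `alpha`:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
n, d, num_classes, k, alpha = 500, 32, 3, 10, 0.99

X = rng.normal(size=(n, d))                       # stand-in embeddings
y = rng.integers(num_classes, size=n)             # ground truth (used on 30 points)
labeled = rng.choice(n, size=30, replace=False)

# Sparse kNN affinity: on L2-normalized embeddings, Euclidean kNN matches
# cosine-similarity kNN.
Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
A = kneighbors_graph(Xn, k, mode="connectivity", include_self=False)
W = A.maximum(A.T)                                # symmetrize the graph
deg = np.asarray(W.sum(axis=1)).ravel()
D_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(deg, 1e-12)))
S = D_inv_sqrt @ W @ D_inv_sqrt                   # normalized affinity matrix

# Diffusion: solve (I - alpha * S) Z = Y per class with conjugate gradient.
Y = np.zeros((n, num_classes))
Y[labeled, y[labeled]] = 1.0                      # one-hot rows for labeled points
system = sp.eye(n) - alpha * S
Z = np.column_stack([cg(system, Y[:, c], atol=1e-6)[0]
                     for c in range(num_classes)])

pseudo = Z.argmax(axis=1)                         # hard pseudo-labels
pseudo[labeled] = y[labeled]                      # labeled points keep true labels
```

Since the eigenvalues of the normalized affinity lie in [-1, 1], the system matrix is positive definite for `alpha < 1`, so conjugate gradient converges quickly on this sparse system.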
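The weighting scheme can be sketched as follows, assuming entropy-based certainty and inverse-frequency class balancing as described above; `Z` and `pseudo` stand in for the diffusion outputs of the previous sketch, and the paper's exact normalization may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
n, num_classes = 500, 3
Z = rng.random((n, num_classes))                  # stand-in diffusion scores
pseudo = Z.argmax(axis=1)                         # stand-in pseudo-labels

# Certainty weight: 1 minus the normalized entropy of each score distribution.
P = np.clip(Z, 1e-12, None)
P = P / P.sum(axis=1, keepdims=True)
entropy = -(P * np.log(P)).sum(axis=1)
certainty = 1.0 - entropy / np.log(num_classes)   # in [0, 1]

# Class-balance weight: down-weight over-represented pseudo-classes.
counts = np.bincount(pseudo, minlength=num_classes).astype(float)
weights = certainty / np.maximum(counts[pseudo], 1.0)

def weighted_ce(logits, targets, weights):
    """Per-example weighted cross-entropy with a stable log-softmax."""
    m = logits.max(axis=1, keepdims=True)
    logp = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    return -(weights * logp[np.arange(len(targets)), targets]).mean()

logits = rng.normal(size=(n, num_classes))        # stand-in network outputs
print("weighted loss:", weighted_ce(logits, pseudo, weights))
```

Confident (low-entropy) pseudo-labels contribute more to the loss, while over-represented pseudo-classes are down-weighted, countering both label noise and class imbalance introduced by propagation.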
Performance and Evaluation
The method is evaluated on standard benchmarks: CIFAR-10, CIFAR-100, and a newly proposed semi-supervised setup for Mini-ImageNet. Results show improvements, especially when few labeled examples are available, over methods based solely on unsupervised consistency losses, such as Mean Teacher. Combining the proposed method with such methods yields further gains, underscoring their complementarity.
Implications
The research suggests a promising direction for SSL: efficiently exploiting unlabeled data by combining graph-based transductive learning with the inductive training of deep networks, which reduces reliance on large amounts of labeled data. Particularly notable is the embedding of a transductive step inside an inductive training loop, offering new insight into how deep models can be trained.
Future Directions
Potential areas for exploration include further optimizing the graph construction process and extending the method to domains beyond image classification. Investigating the scalability of the approach to larger datasets and its adaptability to different model architectures would also deepen understanding of SSL in practice.
Overall, this work contributes a nuanced perspective to SSL, offering practical and theoretical advancements that could influence future research trajectories in automated learning systems.