
VCC-INFUSE: Towards Accurate and Efficient Selection of Unlabeled Examples in Semi-supervised Learning

Published 18 Apr 2024 in cs.LG and cs.CV | (2404.11947v2)

Abstract: Despite the progress of Semi-supervised Learning (SSL), existing methods fail to utilize unlabeled data effectively and efficiently. Many pseudo-label-based methods select unlabeled examples based on inaccurate confidence scores from the classifier. Most prior work also uses all available unlabeled data without pruning, which makes it difficult to handle large amounts of unlabeled data. To address these issues, we propose two methods: Variational Confidence Calibration (VCC) and Influence-Function-based Unlabeled Sample Elimination (INFUSE). VCC is a universal plugin for SSL confidence calibration that uses a variational autoencoder to select more accurate pseudo labels based on three types of consistency scores. INFUSE is a data pruning method that constructs a core dataset of unlabeled examples under SSL. Our methods are effective across multiple datasets and settings, reducing classification error rates and saving training time. Together, VCC-INFUSE reduces the error rate of FlexMatch on the CIFAR-100 dataset by 1.08% while saving nearly half of the training time.
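
For context, the baseline selection step that the abstract criticizes (keeping an unlabeled example only when the classifier's top softmax probability clears a threshold) can be sketched as follows. This is a minimal illustration of generic confidence thresholding, not the paper's VCC or INFUSE; the function names, the toy logits, and the 0.95 default threshold are assumptions made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_pseudo_labels(batch_logits, threshold=0.95):
    """Return (index, pseudo_label, confidence) for confident examples.

    Hypothetical helper: the threshold value is an assumption for this
    sketch, and the raw softmax confidence used here is exactly the
    uncalibrated score the abstract describes as inaccurate.
    """
    selected = []
    for index, logits in enumerate(batch_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            selected.append((index, probs.index(confidence), confidence))
    return selected


# Toy logits for three unlabeled examples over three classes (made up).
unlabeled_logits = [
    [8.0, 0.5, 0.1],  # sharply peaked on class 0
    [1.0, 1.2, 0.9],  # nearly uniform, so it is rejected
    [0.2, 0.1, 6.0],  # sharply peaked on class 2
]
picked = select_pseudo_labels(unlabeled_logits)
```

Because the rule trusts the raw softmax maximum, an overconfident classifier passes wrong pseudo labels through it, which is the gap a calibration step such as VCC is meant to address.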

Authors (3)
