Contrastive Credibility Propagation for Reliable Semi-Supervised Learning (2211.09929v4)
Abstract: Producing labels for unlabeled data is error-prone, making semi-supervised learning (SSL) troublesome. Often, little is known about when and why an algorithm fails to outperform a supervised baseline. Using benchmark datasets, we craft five common real-world SSL data scenarios: few-label, open-set, noisy-label, and class distribution imbalance/misalignment in the labeled and unlabeled sets. We propose a novel algorithm called Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement. CCP unifies semi-supervised learning and noisy label learning for the goal of reliably outperforming a supervised baseline in any data scenario. Compared to prior methods which focus on a subset of scenarios, CCP uniquely outperforms the supervised baseline in all scenarios, supporting practitioners when the qualities of labeled or unlabeled data are unknown.
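The abstract describes iterative transductive pseudo-label refinement, in which pseudo-labels on unlabeled data are repeatedly re-estimated and down-weighted by a per-sample credibility score. The sketch below illustrates that general idea only, not the CCP algorithm itself: it uses a toy nearest-centroid classifier, takes the margin between the top two class probabilities as a stand-in credibility score, and re-fits the model on labels plus credibility-weighted pseudo-labels. All names and the margin-based score are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-D data: two Gaussian classes, two labeled points, forty unlabeled.
X_lab = np.array([[0.0, 0.0], [4.0, 4.0]])
y_lab = np.array([0, 1])
X_unl = np.vstack([rng.normal(0.0, 0.5, (20, 2)),
                   rng.normal(4.0, 0.5, (20, 2))])

def soft_assign(X, centroids, temp=1.0):
    """Softmax over negative squared distances to class centroids."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    z = -d / temp
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

centroids = X_lab.copy()
cred = np.zeros(len(X_unl))            # per-sample credibility weights
pseudo = np.zeros(len(X_unl), dtype=int)

for _ in range(5):                     # iterative refinement loop
    p = soft_assign(X_unl, centroids)
    top2 = np.sort(p, axis=1)[:, -2:]
    cred = top2[:, 1] - top2[:, 0]     # margin between best and runner-up class
    pseudo = p.argmax(axis=1)
    # Re-estimate each centroid from true labels plus
    # credibility-weighted pseudo-labels.
    for k in range(2):
        w_lab = (y_lab == k).astype(float)
        w_unl = cred * (pseudo == k)
        num = (w_lab[:, None] * X_lab).sum(0) + (w_unl[:, None] * X_unl).sum(0)
        centroids[k] = num / (w_lab.sum() + w_unl.sum())

print(pseudo)
```

Low-credibility samples contribute little to the re-fit, which is one simple way an iterative scheme can resist propagating noisy pseudo-labels; CCP's actual refinement and credibility computation differ and are defined in the paper.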
- Brody Kutt
- Pralay Ramteke
- Xavier Mignot
- Pamela Toman
- Nandini Ramanan
- Sujit Rokka Chhetri
- Shan Huang
- Min Du
- William Hewlett