Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning (2309.11930v2)
Abstract: In open-world semi-supervised learning, a machine learning model is tasked with uncovering novel categories from unlabeled data while maintaining performance on seen categories from labeled data. The central challenge is the substantial learning gap between seen and novel categories: the model learns the former faster because it receives accurate supervisory information. Moreover, capturing the semantics of unlabeled novel-category samples is difficult due to the missing label information. To address these issues, we introduce 1) an adaptive synchronizing marginal loss, which imposes class-specific negative margins to alleviate the model's bias towards seen classes, and 2) pseudo-label contrastive clustering, which exploits pseudo-labels predicted by the model to group unlabeled data from the same category together in the output space. Extensive experiments on benchmark datasets demonstrate that previous approaches may significantly hinder novel-class learning, whereas our method effectively balances the learning pace between seen and novel classes, achieving a 3% average accuracy increase on the ImageNet dataset. Importantly, we find that fine-tuning the self-supervised pre-trained model significantly boosts performance, a factor overlooked in prior literature. Our code is available at https://github.com/yebo0216best/LPS-main.
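The two components named in the abstract can be illustrated in code. The sketch below is an assumption-laden illustration, not the paper's actual implementation: the per-class margin values, the LDAM-style way the margin is applied to the target logit, and the SupCon-style formulation of the pseudo-label clustering loss are all inferred from the abstract's one-sentence descriptions, and the function names are hypothetical.

```python
import torch
import torch.nn.functional as F

def margin_cross_entropy(logits, targets, class_margins):
    """Cross-entropy with a per-class margin subtracted from the
    target-class logit (LDAM-style). How the paper chooses the
    class-specific (negative) margins is an assumption here."""
    adjusted = logits.clone()
    idx = torch.arange(logits.size(0))
    adjusted[idx, targets] = adjusted[idx, targets] - class_margins[targets]
    return F.cross_entropy(adjusted, targets)

def pseudo_label_contrastive(probs, pseudo_labels, temperature=0.5):
    """Pull output-space predictions of samples sharing a pseudo-label
    together, supervised-contrastive style. `probs` are per-sample
    class-probability vectors; anchors without positives are skipped."""
    z = F.normalize(probs, dim=1)
    sim = z @ z.t() / temperature
    pos_mask = pseudo_labels.unsqueeze(0).eq(pseudo_labels.unsqueeze(1)).float()
    pos_mask.fill_diagonal_(0)                       # exclude self-pairs
    valid_mask = torch.ones_like(pos_mask).fill_diagonal_(0)
    # log-softmax over all non-self pairs for each anchor row
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(valid_mask == 0, float('-inf')), dim=1, keepdim=True)
    pos_count = pos_mask.sum(1)
    has_pos = pos_count > 0
    loss = -(pos_mask * log_prob).sum(1)[has_pos] / pos_count[has_pos]
    return loss.mean()
```

In training, the margin loss would be applied to labeled (seen-class) samples and the contrastive term to unlabeled samples carrying model-generated pseudo-labels; the exact weighting between the two terms is not specified in the abstract.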