Twice Class Bias Correction for Imbalanced Semi-Supervised Learning (2312.16604v1)

Published 27 Dec 2023 in cs.LG

Abstract: Unlike traditional semi-supervised learning, class-imbalanced semi-supervised learning presents two distinct challenges: (1) the imbalanced distribution of training samples biases the model towards certain classes, and (2) the distribution of unlabeled samples is unknown and potentially different from that of labeled samples, which further contributes to class bias in the pseudo-labels during training. To address these dual challenges, we introduce a novel approach called \textbf{T}wice \textbf{C}lass \textbf{B}ias \textbf{C}orrection (\textbf{TCBC}). We first use an estimate of the class distribution of the participating training samples to correct the model, enabling it to learn posterior probabilities under a class-balanced prior. This correction alleviates the model's inherent class bias. Building on this foundation, we further estimate the class bias of the current model parameters during training and apply a secondary correction to the model's pseudo-labels for unlabeled samples, aiming to make the assignment of pseudo-labels across different classes as equitable as possible. Through extensive experiments on CIFAR10/100-LT, STL10-LT, and the large long-tailed dataset SUN397, we show that the proposed TCBC method reliably improves the performance of class-imbalanced semi-supervised learning.
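The two corrections described in the abstract can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's actual implementation): the first correction follows a logit-adjustment-style shift by the log of the estimated training class distribution, and the second rescales the resulting probabilities by an estimate of the model's current class bias before assigning pseudo-labels. The function names and the `1e-12` smoothing constant are assumptions for this sketch.

```python
import numpy as np

def balanced_posterior(logits, class_prior):
    """First correction: subtract the log of the estimated class prior
    so the softmax approximates the posterior under a class-balanced
    prior (logit-adjustment style)."""
    adjusted = logits - np.log(class_prior + 1e-12)
    exp = np.exp(adjusted - adjusted.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

def corrected_pseudo_labels(logits, class_prior, model_bias):
    """Second correction: additionally discount the model's current
    class bias (e.g. its average predicted class distribution) so that
    pseudo-labels are spread more evenly across classes."""
    probs = balanced_posterior(logits, class_prior)
    debiased = probs / (model_bias + 1e-12)
    debiased /= debiased.sum(axis=1, keepdims=True)
    return debiased.argmax(axis=1)
```

In practice the class prior would be re-estimated from the labeled and pseudo-labeled samples that actually participate in each training step, and the model-bias estimate would be updated as the parameters change.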

Authors (5)
  1. Lan Li (26 papers)
  2. Bowen Tao (3 papers)
  3. Lu Han (38 papers)
  4. De-Chuan Zhan (90 papers)
  5. Han-Jia Ye (74 papers)
Citations (1)