Incremental Self-training for Semi-supervised Learning (2404.12398v1)

Published 14 Apr 2024 in cs.LG

Abstract: Semi-supervised learning reduces the dependency of machine learning on labeled data. As one of the most efficient semi-supervised techniques, self-training (ST) has received increasing attention, and several advancements have emerged to address the challenges associated with noisy pseudo-labels. Previous works on self-training acknowledge the importance of unlabeled data but have not explored their efficient utilization, nor have they addressed the high time cost of iterative learning. This paper proposes Incremental Self-training (IST) for semi-supervised learning to fill these gaps. Unlike ST, which processes all data indiscriminately, IST processes data in batches and preferentially assigns pseudo-labels to unlabeled samples with high certainty. It then processes the data around the decision boundary after the model has stabilized, enhancing classifier performance. IST is simple yet effective and fits existing self-training-based semi-supervised learning methods. We verify the proposed IST on five datasets and two types of backbones, improving both recognition accuracy and learning speed. Notably, it outperforms state-of-the-art competitors on three challenging image classification tasks.

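The abstract describes a confidence-ordered, batched pseudo-labeling loop: high-certainty unlabeled samples are labeled first, and samples near the decision boundary are deferred until the model stabilizes. The sketch below illustrates that general idea, not the authors' exact IST algorithm; the function name, the scikit-learn backbone, and all hyperparameters (`batch_frac`, `conf_threshold`, `max_rounds`) are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def incremental_self_training(X_lab, y_lab, X_unlab,
                              batch_frac=0.2, conf_threshold=0.9, max_rounds=10):
    """Minimal confidence-ordered self-training sketch (not the paper's exact IST).

    Unlabeled samples are pseudo-labeled in batches, most confident first;
    low-confidence samples near the decision boundary are only absorbed in
    later rounds, once earlier rounds no longer meet the confidence threshold.
    """
    model = LogisticRegression(max_iter=1000)
    X_train, y_train = X_lab.copy(), y_lab.copy()
    remaining = X_unlab.copy()

    for _ in range(max_rounds):
        if len(remaining) == 0:
            break
        model.fit(X_train, y_train)

        probs = model.predict_proba(remaining)
        conf = probs.max(axis=1)
        order = np.argsort(-conf)                      # most certain first
        batch_size = max(1, int(batch_frac * len(remaining)))
        picked = order[:batch_size]
        confident = picked[conf[picked] >= conf_threshold]
        if len(confident) > 0:
            picked = confident                          # early rounds: high-certainty only
        # otherwise only boundary samples remain; keep the batch as selected

        pseudo = probs[picked].argmax(axis=1)
        X_train = np.vstack([X_train, remaining[picked]])
        y_train = np.concatenate([y_train, pseudo])
        remaining = np.delete(remaining, picked, axis=0)

    model.fit(X_train, y_train)
    return model
```

Processing the pool in confidence-ordered batches, rather than pseudo-labeling everything at once, is what the abstract credits for both the accuracy gain (boundary samples are labeled by a more stable model) and the speed-up (each round refits on a modest increment instead of the full unlabeled set).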
Authors (4)
  1. Jifeng Guo (5 papers)
  2. Zhulin Liu (4 papers)
  3. Tong Zhang (569 papers)
  4. C. L. Philip Chen (49 papers)
Citations (1)