Dynamic Sub-graph Distillation for Robust Semi-supervised Continual Learning (2312.16409v1)

Published 27 Dec 2023 in cs.LG and cs.CV

Abstract: Continual learning (CL) has shown promising results, with performance comparable to learning all tasks at once in a fully supervised manner. However, CL strategies typically require a large number of labeled samples, making their real-life deployment challenging. In this work, we focus on semi-supervised continual learning (SSCL), where the model progressively learns from partially labeled data with unknown categories. We provide a comprehensive analysis of SSCL and demonstrate that unreliable distributions of unlabeled data lead to unstable training and refinement at later stages, a problem that severely impacts SSCL performance. To address these limitations, we propose Dynamic Sub-Graph Distillation (DSGD), a novel approach for semi-supervised continual learning that leverages both semantic and structural information to achieve more stable knowledge distillation on unlabeled data and exhibits robustness against distribution bias. First, we formalize a general model of structural distillation and design a dynamic graph construction procedure for the continual learning process. Next, we define a structure distillation vector and design a dynamic sub-graph distillation algorithm, which enables end-to-end training and scales as tasks are added. The proposed method is adaptable to various CL methods and supervision settings. Finally, experiments on three datasets, CIFAR-10, CIFAR-100, and ImageNet-100, with varying supervision ratios demonstrate the effectiveness of our approach in mitigating catastrophic forgetting in semi-supervised continual learning scenarios.
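
The core idea, preserving how unlabeled samples relate to their neighbors rather than matching possibly biased pseudo-labels or logits directly, can be sketched as follows. This is an illustrative reconstruction only, assuming PyTorch and a simple k-nearest-neighbor sub-graph over a mini-batch; the function and parameter names (subgraph_distillation_loss, k, tau) are hypothetical and not taken from the paper's implementation.

```python
# Hypothetical sketch of sub-graph-based structural distillation on unlabeled
# data (not the authors' released code). Assumes PyTorch and mini-batch
# features of shape (B, D) from the frozen old model and the current model.
import torch
import torch.nn.functional as F

def subgraph_distillation_loss(old_feats, new_feats, k=5, tau=0.1):
    """Distill the k-nearest-neighbor structure of the frozen old model's
    feature graph into the new model's feature space."""
    old_feats = F.normalize(old_feats.detach(), dim=1)   # frozen previous-stage model
    new_feats = F.normalize(new_feats, dim=1)            # current model (trainable)

    # Pairwise cosine-similarity graphs in both feature spaces.
    sim_old = old_feats @ old_feats.t()                  # (B, B)
    sim_new = new_feats @ new_feats.t()                  # (B, B)

    # For each sample, keep its k strongest neighbors in the old graph:
    # these edges define the sub-graph whose structure should be preserved.
    nbr_idx = sim_old.topk(k + 1, dim=1).indices         # +1 because self is included
    edges_old = sim_old.gather(1, nbr_idx)               # (B, k+1) old edge weights
    edges_new = sim_new.gather(1, nbr_idx)               # (B, k+1) same edges, new model

    # Match the per-sample neighborhood distributions (one "structure
    # distillation vector" per sample) with a temperature-scaled KL divergence.
    p_old = F.softmax(edges_old / tau, dim=1)
    log_p_new = F.log_softmax(edges_new / tau, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean")
```

In use, such a loss would be evaluated on the mixed labeled/unlabeled batch and added to the usual continual-learning objective; because it constrains only relative neighborhood structure rather than absolute predictions, it remains meaningful even when pseudo-label distributions on unlabeled data are biased.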

Authors (4)
  1. Yan Fan (12 papers)
  2. Yu Wang (939 papers)
  3. Pengfei Zhu (76 papers)
  4. Qinghua Hu (83 papers)
Citations (1)