Improving Plasticity in Online Continual Learning via Collaborative Learning (2312.00600v2)
Abstract: Online Continual Learning (CL) addresses the problem of learning ever-emerging new classification tasks from a continuous data stream. Unlike its offline counterpart, in online CL the training data can only be seen once. Most existing online CL research treats catastrophic forgetting (i.e., model stability) as almost the only challenge. In this paper, we argue that the model's capability to acquire new knowledge (i.e., model plasticity) is another challenge in online CL. While replay-based strategies have been shown to be effective in alleviating catastrophic forgetting, there is a notable gap in research attention toward improving model plasticity. To this end, we propose Collaborative Continual Learning (CCL), a collaborative-learning-based strategy that improves the model's capability to acquire new concepts. Additionally, we introduce Distillation Chain (DC), a collaborative learning scheme that boosts the training of the models. We adapt CCL-DC to existing representative online CL works. Extensive experiments demonstrate that even when the learners are well trained with state-of-the-art online CL methods, our strategy still improves model plasticity dramatically, and thereby improves the overall performance by a large margin. The source code of our work is available at https://github.com/maorong-wang/CCL-DC.
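The abstract only describes CCL-DC at a high level and does not give the exact formulation of the Distillation Chain. As a rough, non-authoritative illustration of the general idea of collaborative learning between peer learners, the PyTorch sketch below performs a mutual-distillation update: two peers fit the same stream/replay batch with cross-entropy and additionally match each other's softened predictions via a KL term. All names (`PeerLearner`, `collaborative_step`, the temperature `T`, and the weight `lam`) are hypothetical and are not taken from the paper or its repository.

```python
# Hypothetical sketch of a collaborative (mutual-distillation) update for online CL.
# This is NOT the paper's CCL-DC implementation; it only illustrates two peer learners
# regularizing each other with softened predictions on the same stream/replay batch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PeerLearner(nn.Module):
    """Tiny stand-in classifier; real online CL methods use standard backbones (e.g., ResNet)."""
    def __init__(self, in_dim=32, num_classes=10):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, num_classes))

    def forward(self, x):
        return self.net(x)

def collaborative_step(model_a, model_b, opt_a, opt_b, x, y, T=2.0, lam=1.0):
    """One collaborative update: each peer fits the labels and distills from the other."""
    logits_a, logits_b = model_a(x), model_b(x)

    # Cross-entropy on the current (stream + replay) batch for each peer.
    ce_a = F.cross_entropy(logits_a, y)
    ce_b = F.cross_entropy(logits_b, y)

    # Mutual distillation: each peer matches the other's temperature-softened distribution.
    kd_a = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                    F.softmax(logits_b.detach() / T, dim=1),
                    reduction="batchmean") * (T * T)
    kd_b = F.kl_div(F.log_softmax(logits_b / T, dim=1),
                    F.softmax(logits_a.detach() / T, dim=1),
                    reduction="batchmean") * (T * T)

    loss_a, loss_b = ce_a + lam * kd_a, ce_b + lam * kd_b

    opt_a.zero_grad(); loss_a.backward(); opt_a.step()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()
    return loss_a.item(), loss_b.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    a, b = PeerLearner(), PeerLearner()
    opt_a = torch.optim.SGD(a.parameters(), lr=0.1)
    opt_b = torch.optim.SGD(b.parameters(), lr=0.1)
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))  # dummy stream batch
    print(collaborative_step(a, b, opt_a, opt_b, x, y))
```

In this sketch, each peer only receives gradients from its own loss (the partner's logits are detached), a common design choice in mutual-distillation schemes; the paper's actual Distillation Chain training may differ.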
Authors: Maorong Wang, Nicolas Michel, Ling Xiao, Toshihiko Yamasaki