Contrastive Continual Learning with Importance Sampling and Prototype-Instance Relation Distillation (2403.04599v1)
Abstract: Motivated by the high-quality representations produced by contrastive learning, rehearsal-based contrastive continual learning has recently been proposed to continually learn transferable representation embeddings and thereby avoid the catastrophic forgetting that plagues traditional continual learning settings. Building on this framework, we propose Contrastive Continual Learning via Importance Sampling (CCLIS), which preserves knowledge by recovering previous data distributions with a new Replay Buffer Selection (RBS) strategy that minimizes the estimated variance, retaining hard negative samples for high-quality representation learning. Furthermore, we present the Prototype-instance Relation Distillation (PRD) loss, a technique designed to maintain the relationship between prototypes and sample representations through a self-distillation process. Experiments on standard continual learning benchmarks show that our method notably outperforms existing baselines in terms of knowledge preservation and thereby effectively counteracts catastrophic forgetting in online settings. The code is available at https://github.com/lijy373/CCLIS.
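The PRD loss described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each instance's relation to the class prototypes is a temperature-scaled softmax over cosine similarities, and that the frozen (past-task) model's relation acts as the distillation target via cross-entropy. All function and variable names here are hypothetical.

```python
import numpy as np

def _normalize(v):
    """L2-normalize rows so dot products become cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def _softmax(x, tau):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    x = x / tau
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def prd_loss(z_new, z_old, protos_new, protos_old, tau=0.1):
    """Prototype-instance relation distillation (sketch).

    z_new, z_old: (n, d) instance embeddings from the current and the
        frozen previous model; protos_*: (k, d) prototype embeddings.
    For each instance, build a distribution over prototypes and
    penalize the cross-entropy of the current relation under the
    frozen (teacher) relation.
    """
    p_old = _softmax(_normalize(z_old) @ _normalize(protos_old).T, tau)  # teacher
    p_new = _softmax(_normalize(z_new) @ _normalize(protos_new).T, tau)  # student
    return -(p_old * np.log(p_new + 1e-12)).sum(axis=-1).mean()
```

By Gibbs' inequality, the loss is minimized exactly when the current prototype-instance relation matches the frozen one, which is the self-distillation behavior the abstract describes.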
Authors: Jiyong Li, Dilshod Azizov, Yang Li, Shangsong Liang