An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental Learning (2308.11677v2)
Abstract: Class-Incremental Learning (CIL) aims to build classification models from data streams. At each step of the CIL process, new classes must be integrated into the model. Due to catastrophic forgetting, CIL is particularly challenging when examples from past classes cannot be stored, the exemplar-free setting on which we focus here. To date, most approaches rely exclusively on the target dataset of the CIL process. However, the use of models pre-trained in a self-supervised way on large amounts of data has recently gained momentum. The initial model of the CIL process may be trained using only the first batch of the target dataset, or it may additionally use pre-trained weights obtained on an auxiliary dataset. The choice between these two initial learning strategies can significantly influence the performance of the incremental learning model, but it has not yet been studied in depth. Performance is also influenced by the choice of the CIL algorithm, the neural architecture, the nature of the target task, the distribution of classes in the stream, and the number of examples available for learning. We conduct a comprehensive experimental study to assess the roles of these factors. We present a statistical analysis framework that quantifies the relative contribution of each factor to incremental performance. Our main finding is that the initial training strategy is the dominant factor influencing average incremental accuracy, but that the choice of CIL algorithm matters more for preventing forgetting. Based on this analysis, we propose practical recommendations for choosing the right initial training strategy for a given incremental learning use case. These recommendations are intended to facilitate the practical deployment of incremental learning.
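For concreteness, below is a minimal sketch of the two metrics the abstract contrasts, average incremental accuracy and forgetting, assuming their usual definitions in the CIL literature; the accuracy matrix, function names, and toy numbers are illustrative and not taken from the paper.

```python
# A minimal sketch (assuming the usual CIL definitions, not code from the
# paper) of average incremental accuracy and average forgetting.
# Input: acc[i][j] = accuracy on the classes introduced at step j,
# measured after training on step i (a lower-triangular matrix).

from typing import List


def average_incremental_accuracy(acc: List[List[float]]) -> float:
    """Mean, over CIL steps, of the accuracy on all classes seen so far."""
    per_step = [sum(acc[i][: i + 1]) / (i + 1) for i in range(len(acc))]
    return sum(per_step) / len(per_step)


def average_forgetting(acc: List[List[float]]) -> float:
    """Mean drop from each past step's best accuracy to its final accuracy."""
    last = len(acc) - 1
    if last == 0:
        return 0.0  # a single step cannot exhibit forgetting
    drops = [
        max(acc[i][j] for i in range(j, last)) - acc[last][j]
        for j in range(last)  # the final step itself is excluded
    ]
    return sum(drops) / len(drops)


# Toy run with three incremental steps.
acc = [
    [0.90],              # after step 0
    [0.75, 0.85],        # after step 1: step-0 classes dropped to 0.75
    [0.60, 0.70, 0.80],  # after step 2
]
print(average_incremental_accuracy(acc))  # (0.90 + 0.80 + 0.70) / 3 = 0.80
print(average_forgetting(acc))            # ((0.90-0.60) + (0.85-0.70)) / 2 = 0.225
```

The toy numbers show why the paper's finding is not a contradiction: a run can score well on average incremental accuracy while still forgetting substantially, so the factor that dominates one metric need not dominate the other.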
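The abstract does not specify the paper's statistical analysis framework. As one hedged illustration of how a factor's relative contribution could be quantified, the sketch below fits a linear model over a grid of CIL runs and decomposes the explained variance with a type-II ANOVA; the CSV file and column names (strategy, algorithm, arch, dataset, avg_inc_acc) are assumptions, not the paper's actual setup.

```python
# A hedged sketch of factor attribution: fit a linear model over the
# experimental grid, then decompose explained variance per factor.
# The input file and its column names are hypothetical.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical results table: one row per CIL run.
runs = pd.read_csv("cil_runs.csv")

# Treat each experimental factor as a categorical predictor of
# average incremental accuracy.
model = ols(
    "avg_inc_acc ~ C(strategy) + C(algorithm) + C(arch) + C(dataset)",
    data=runs,
).fit()

# Type-II ANOVA: sum of squares attributable to each factor,
# normalized to a share of the total variance.
anova = sm.stats.anova_lm(model, typ=2)
anova["share"] = anova["sum_sq"] / anova["sum_sq"].sum()
print(anova[["sum_sq", "share"]])
```

Rerunning the same decomposition with a forgetting measure as the response variable would let the factor ranking differ across metrics, which is the shape of the paper's main finding.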