Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners (2404.02117v1)
Abstract: Few-Shot Class Incremental Learning (FSCIL) is a task that requires a model to learn new classes incrementally, without forgetting previously learned ones, when only a few samples are given for each new class. FSCIL encounters two significant challenges: catastrophic forgetting and overfitting, and these challenges have driven prior studies to rely primarily on shallow models, such as ResNet-18. Although their limited capacity can mitigate both forgetting and overfitting, it leads to inadequate knowledge transfer during few-shot incremental sessions. In this paper, we argue that large models such as vision and language transformers pre-trained on large datasets can be excellent few-shot incremental learners. To this end, we propose a novel FSCIL framework called PriViLege, Pre-trained Vision and Language transformers with prompting functions and knowledge distillation. Our framework effectively addresses the challenges of catastrophic forgetting and overfitting in large models through new pre-trained knowledge tuning (PKT) and two losses: an entropy-based divergence loss and a semantic knowledge distillation loss. Experimental results show that the proposed PriViLege significantly outperforms existing state-of-the-art methods by a large margin, e.g., +9.38% on CUB200, +20.58% on CIFAR-100, and +13.36% on miniImageNet. Our implementation code is available at https://github.com/KHU-AGI/PriViLege.
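The abstract names two auxiliary objectives, an entropy-based divergence loss and a semantic knowledge distillation loss, without giving their formulas. Below is a minimal PyTorch sketch of one plausible reading of each: the divergence term pushes the prompted model's predictions away from the frozen backbone's so the prompts capture new knowledge, and the semantic term aligns ViT features with language-model embeddings of class names. The function names, the projection head `proj`, and the exact loss forms are illustrative assumptions, not the authors' implementation; the definitive versions are in the paper and the linked repository.

```python
import torch
import torch.nn.functional as F

def entropy_divergence_loss(prompt_logits: torch.Tensor,
                            frozen_logits: torch.Tensor) -> torch.Tensor:
    """Hypothetical entropy-based divergence (assumption, not the paper's
    exact form): maximize the KL divergence (relative entropy) between the
    prompted model's prediction and the frozen backbone's, so the learnable
    prompts are pushed to encode knowledge the frozen ViT lacks."""
    p_frozen = F.softmax(frozen_logits.detach(), dim=-1)
    log_p_prompt = F.log_softmax(prompt_logits, dim=-1)
    kl = F.kl_div(log_p_prompt, p_frozen, reduction="batchmean")
    return -kl  # negated so minimizing the loss maximizes divergence

def semantic_kd_loss(visual_feat: torch.Tensor,
                     text_embed: torch.Tensor,
                     proj: torch.nn.Module) -> torch.Tensor:
    """Hypothetical semantic knowledge distillation (assumption): align the
    ViT [CLS] feature with a frozen language embedding of the class name
    through a learnable projection, using cosine distance."""
    v = F.normalize(proj(visual_feat), dim=-1)
    t = F.normalize(text_embed.detach(), dim=-1)
    return (1.0 - (v * t).sum(dim=-1)).mean()

# Toy usage: batch of 4, 10 classes, 768-d ViT features, 512-d text embeddings.
if __name__ == "__main__":
    proj = torch.nn.Linear(768, 512)
    l_div = entropy_divergence_loss(torch.randn(4, 10), torch.randn(4, 10))
    l_skd = semantic_kd_loss(torch.randn(4, 768), torch.randn(4, 512), proj)
    print(l_div.item(), l_skd.item())
```

In practice such terms would be weighted and summed with the classification objective; per the abstract, the paper combines its two losses with the pre-trained knowledge tuning (PKT) stage.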
- Keon-Hee Park
- Kyungwoo Song
- Gyeong-Moon Park