
Pre-trained Vision and Language Transformers Are Few-Shot Incremental Learners (2404.02117v1)

Published 2 Apr 2024 in cs.CV

Abstract: Few-Shot Class Incremental Learning (FSCIL) is a task that requires a model to learn new classes incrementally, without forgetting, when only a few samples for each class are given. FSCIL encounters two significant challenges, catastrophic forgetting and overfitting, and these challenges have driven prior studies to rely primarily on shallow models such as ResNet-18. Even though their limited capacity can mitigate both forgetting and overfitting, it leads to inadequate knowledge transfer during few-shot incremental sessions. In this paper, we argue that large models such as vision and language transformers pre-trained on large datasets can be excellent few-shot incremental learners. To this end, we propose a novel FSCIL framework called PriViLege: Pre-trained Vision and Language transformers with prompting functions and knowledge distillation. Our framework effectively addresses the challenges of catastrophic forgetting and overfitting in large models through a new pre-trained knowledge tuning (PKT) scheme and two losses: an entropy-based divergence loss and a semantic knowledge distillation loss. Experimental results show that the proposed PriViLege significantly outperforms existing state-of-the-art methods by a large margin, e.g., +9.38% on CUB200, +20.58% on CIFAR-100, and +13.36% on miniImageNet. Our implementation code is available at https://github.com/KHU-AGI/PriViLege.
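The abstract does not give the exact formulation of the paper's semantic knowledge distillation loss; as a generic illustration of the underlying idea, a temperature-scaled knowledge distillation loss in the style of Hinton et al. (2015) can be sketched as follows. The function names, the temperature value, and the use of NumPy are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Generic distillation loss: KL(teacher || student) between
    softened class distributions, scaled by T^2 as in Hinton et al.
    This is an illustrative sketch, not PriViLege's exact loss."""
    p = softmax(teacher_logits, temperature)  # soft teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    eps = 1e-12  # avoid log(0)
    return float(temperature ** 2 * np.sum(p * (np.log(p + eps) - np.log(q + eps))))
```

In a distillation setup like this, the student is trained on a weighted sum of the usual cross-entropy on hard labels and `kd_loss` against a frozen teacher; the temperature softens both distributions so the student also learns the teacher's relative class similarities.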

Authors (3)
  1. Keon-Hee Park (5 papers)
  2. Kyungwoo Song (38 papers)
  3. Gyeong-Moon Park (20 papers)
Citations (8)
