Convolutional Prompting meets Language Models for Continual Learning (2403.20317v1)

Published 29 Mar 2024 in cs.CV

Abstract: Continual Learning (CL) enables machine learning models to learn from continuously shifting new training data in the absence of data from old tasks. Recently, pretrained vision transformers combined with prompt tuning have shown promise for overcoming catastrophic forgetting in CL. These approaches rely on a pool of learnable prompts, which can be inefficient at sharing knowledge across tasks, leading to inferior performance. Moreover, the lack of fine-grained, layer-specific prompts prevents these methods from fully exploiting the strength of prompting for CL. We address these limitations by proposing ConvPrompt, a novel convolutional prompt-creation mechanism that maintains layer-wise shared embeddings, enabling both layer-specific learning and better concept transfer across tasks. The use of convolution lets us keep the parameter overhead low without compromising performance. We further leverage LLMs to generate fine-grained text descriptions of each category, which are used to estimate task similarity and dynamically decide the number of prompts to be learned. Extensive experiments demonstrate the superiority of ConvPrompt, which improves the state of the art by ~3% with significantly lower parameter overhead. We also perform extensive ablations over the various modules to disentangle the importance of the different components.
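To make the mechanism concrete, below is a minimal PyTorch sketch of the convolutional prompt-creation idea the abstract describes: a per-layer shared embedding plus a growing bank of per-task 1D convolution kernels, where the kernel count for a new task would be chosen from LLM-derived task similarity. All names, shapes, and the mean-pooling step are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class ConvPromptLayer(nn.Module):
    """Illustrative sketch of convolutional prompt creation for one
    transformer layer; not the paper's code.

    A layer-specific shared embedding is kept, and each new task adds a
    small bank of 1D convolution kernels that slide over this embedding
    to produce that task's prompts.
    """

    def __init__(self, prompt_len: int = 8, embed_dim: int = 768, kernel_size: int = 3):
        super().__init__()
        # Shared embedding for this layer, reused by every task's kernels.
        self.shared_embed = nn.Parameter(torch.randn(prompt_len, embed_dim) * 0.02)
        self.kernel_size = kernel_size
        self.task_convs = nn.ModuleList()  # one conv bank per task, grown over time

    def add_task(self, num_kernels: int) -> None:
        # num_kernels would be set from LLM-based task similarity:
        # tasks similar to earlier ones need fewer new kernels.
        conv = nn.Conv1d(
            in_channels=1,
            out_channels=num_kernels,
            kernel_size=self.kernel_size,
            padding=self.kernel_size // 2,  # preserve the embedding length
        )
        self.task_convs.append(conv)

    def forward(self, task_id: int) -> torch.Tensor:
        # Treat each prompt token as a 1-channel signal over the embedding dim.
        x = self.shared_embed.unsqueeze(1)   # (prompt_len, 1, embed_dim)
        out = self.task_convs[task_id](x)    # (prompt_len, num_kernels, embed_dim)
        # Pool the kernel outputs into one prompt per token (illustrative choice).
        return out.mean(dim=1)               # (prompt_len, embed_dim)


# Usage sketch: grow the bank for a new task, then generate its prompts.
layer = ConvPromptLayer()
layer.add_task(num_kernels=4)   # count assumed here; would come from task similarity
prompts = layer(task_id=0)      # (8, 768), prepended to this layer's token sequence
```

Because every task's kernels convolve the same shared embedding, knowledge is transferred across tasks through that embedding while the per-task parameter cost stays at a few small kernels, which is the low-overhead property the abstract claims.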

Authors (5)
  1. Anurag Roy
  2. Riddhiman Moulick
  3. Saptarshi Ghosh
  4. Abir Das
  5. Vinay K. Verma
Citations (7)