Class Incremental Learning with Pre-trained Vision-Language Models (2310.20348v1)

Published 31 Oct 2023 in cs.CV and cs.LG

Abstract: With the advent of large-scale pre-trained models, interest in adapting and exploiting them for continual learning scenarios has grown. In this paper, we propose an approach to exploiting pre-trained vision-language models (e.g. CLIP) that enables further adaptation instead of only using zero-shot learning of new tasks. We augment a pre-trained CLIP model with additional layers after the Image Encoder or before the Text Encoder. We investigate three different strategies: a Linear Adapter and a Self-attention Adapter, each operating on the image embedding, and Prompt Tuning, which instead modifies the prompts input to the CLIP text encoder. We also propose a method for parameter retention in the adapter layers that uses a measure of parameter importance to better maintain stability and plasticity during incremental learning. Our experiments demonstrate that the simplest solution -- a single Linear Adapter layer with parameter retention -- produces the best results. Experiments on several conventional benchmarks consistently show a significant margin of improvement over the current state-of-the-art.
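The sketch below illustrates, in hedged form, the core idea the abstract describes: a single linear adapter applied to frozen CLIP image embeddings, trained with an importance-weighted retention penalty on the adapter parameters. This is not the authors' code; the class names, the `retention_penalty` helper, the retention weight `lam`, and the temperature value are illustrative assumptions, and the importance estimate is assumed to be precomputed (e.g. in an EWC-like fashion).

```python
# Minimal sketch (assumed, not the paper's implementation) of a Linear Adapter
# on top of a frozen CLIP-style image encoder, with a parameter-retention
# penalty weighted by a precomputed importance measure.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAdapter(nn.Module):
    """Single linear layer applied to frozen image embeddings."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, image_emb: torch.Tensor) -> torch.Tensor:
        return self.proj(image_emb)

def retention_penalty(adapter, old_params, importance, lam=1.0):
    """Quadratic penalty discouraging changes to parameters deemed important
    for previously learned tasks (an EWC-style assumption)."""
    loss = 0.0
    for name, p in adapter.named_parameters():
        loss = loss + (importance[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss

def incremental_step(adapter, image_emb, text_emb, labels,
                     old_params, importance, lam=1.0, temperature=0.01):
    """One training step: CLIP-style classification by cosine similarity
    between adapted image embeddings and per-class text embeddings,
    plus the parameter-retention penalty."""
    img = F.normalize(adapter(image_emb), dim=-1)
    txt = F.normalize(text_emb, dim=-1)   # one row per class prompt
    logits = img @ txt.t() / temperature
    ce = F.cross_entropy(logits, labels)
    return ce + retention_penalty(adapter, old_params, importance, lam)
```

In this reading, the CLIP encoders stay frozen across tasks; only the adapter is updated, and the penalty anchors important adapter weights to their values after the previous task, trading plasticity for stability.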

Authors (6)
  1. Xialei Liu (35 papers)
  2. Xusheng Cao (5 papers)
  3. Haori Lu (6 papers)
  4. Andrew D. Bagdanov (46 papers)
  5. Ming-Ming Cheng (185 papers)
  6. Jia-Wen Xiao (4 papers)
Citations (9)