
CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large Language Model (2403.08350v2)

Published 13 Mar 2024 in cs.CV

Abstract: Instruction tuning represents a prevalent strategy employed by Multimodal LLMs (MLLMs) to align with human instructions and adapt to new tasks. Nevertheless, MLLMs face the challenge of adapting to users' evolving knowledge and demands. Therefore, how to retain existing skills while acquiring new knowledge needs to be investigated. In this paper, we present a comprehensive benchmark, namely Continual Instruction tuNing (CoIN), to assess existing MLLMs in the sequential instruction tuning paradigm. CoIN comprises 10 commonly used datasets spanning 8 task categories, ensuring a diverse range of instructions and tasks. The trained model is evaluated from two aspects: Instruction Following and General Knowledge, which assess alignment with human intention and the knowledge preserved for reasoning, respectively. Experiments on CoIN demonstrate that current powerful MLLMs still suffer from catastrophic forgetting, and that the failure lies mainly in intention alignment rather than in knowledge forgetting. To this end, we introduce MoELoRA to MLLMs, which is effective at retaining previous instruction alignment. Experimental results on CoIN consistently show that this method reduces forgetting.
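The MoELoRA approach mentioned in the abstract combines a mixture-of-experts router with low-rank adapters so that different instruction distributions can be absorbed by different experts while the base weights stay frozen. The following is a minimal illustrative sketch of that general idea in PyTorch, not the paper's implementation: the expert count, rank, top-k routing, and placement of the adapted layer are all assumptions made here for clarity.

```python
import torch
import torch.nn as nn

class MoELoRALinear(nn.Module):
    """A frozen linear layer augmented with a mixture of LoRA experts.

    Illustrative sketch of the MoE-over-LoRA idea only; routing scheme,
    expert count, and rank are assumed, not taken from the paper.
    """

    def __init__(self, base: nn.Linear, num_experts: int = 4,
                 rank: int = 8, top_k: int = 2):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # base weights stay frozen during tuning
        d_in, d_out = base.in_features, base.out_features
        # Per-expert low-rank factors: delta_W_e = A_e @ B_e
        self.A = nn.Parameter(torch.randn(num_experts, d_in, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, d_out))  # zero-init
        self.router = nn.Linear(d_in, num_experts)  # token-wise gating
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, d_in); gates: (batch, num_experts)
        gates = torch.softmax(self.router(x), dim=-1)
        topv, topi = gates.topk(self.top_k, dim=-1)      # keep top-k experts
        topv = topv / topv.sum(dim=-1, keepdim=True)     # renormalize weights
        sparse = torch.zeros_like(gates).scatter(1, topi, topv)
        # Weighted sum of expert low-rank updates applied to x
        delta = torch.einsum('be,eir,ero,bi->bo', sparse, self.A, self.B, x)
        return self.base(x) + delta
```

Because the `B` factors are zero-initialized, the module starts out exactly equal to the frozen base layer, so continual tuning only gradually moves the output through the expert adapters.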

Authors (6)
  1. Cheng Chen (262 papers)
  2. Junchen Zhu (9 papers)
  3. Xu Luo (22 papers)
  4. Hengtao Shen (16 papers)
  5. Lianli Gao (99 papers)
  6. Jingkuan Song (115 papers)
Citations (7)