Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling (2310.12100v1)

Published 18 Oct 2023 in cs.CL, cs.AI, cs.CV, cs.LG, and cs.MM

Abstract: LLMs and vision-language models (VLMs) demonstrate excellent performance on a wide range of tasks by scaling up parameter counts from O(10^9) to O(10^{12}) levels and beyond. These large scales make it impossible to adapt and deploy fully specialized models for every task of interest. Parameter-efficient fine-tuning (PEFT) emerges as a promising direction to tackle the adaptation and serving challenges for such large models. We categorize PEFT techniques into two types: intrusive and non-intrusive. Intrusive PEFT techniques directly change a model's internal architecture. Though more flexible, they introduce significant complexities for training and serving. Non-intrusive PEFT techniques leave the internal architecture unchanged and only adapt model-external parameters, such as embeddings for input. In this work, we describe AdaLink as a non-intrusive PEFT technique that achieves competitive performance compared to SoTA intrusive PEFT (LoRA) and full model fine-tuning (FT) on various tasks. We evaluate using both text-only and multimodal tasks, with experiments that account for both parameter-count scaling and training regime (with and without instruction tuning).
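
As a rough illustration of the input-centric idea described in the abstract, the sketch below trains a small low-rank adapter that modifies only the input embeddings of a frozen backbone, leaving the model's internal architecture untouched. The class and names here (`InputAdapter`, `backbone`, `rank`) are illustrative assumptions, not the paper's exact AdaLink implementation.

```python
import torch
import torch.nn as nn

class InputAdapter(nn.Module):
    """Minimal sketch of a non-intrusive, input-centric adapter:
    a low-rank residual update applied to input embeddings only.
    Illustrative only; not the paper's AdaLink code."""
    def __init__(self, d_model: int, rank: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, rank, bias=False)
        self.up = nn.Linear(rank, d_model, bias=False)
        nn.init.zeros_(self.up.weight)  # start as a no-op residual

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # Residual low-rank correction of the input embeddings.
        return embeddings + self.up(self.down(embeddings))

# Usage sketch with a hypothetical frozen `backbone` that accepts
# precomputed embeddings (e.g. via an `inputs_embeds` argument):
#
#   for p in backbone.parameters():
#       p.requires_grad_(False)              # backbone stays frozen
#   adapter = InputAdapter(d_model=backbone_hidden_size)
#   embeds = backbone_embedding_layer(input_ids)
#   outputs = backbone(inputs_embeds=adapter(embeds))  # only adapter trains
```

Because only the adapter's parameters are trained and the backbone is shared across tasks, serving many task-specialized variants reduces to swapping small adapter weights rather than loading separate full models.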

Authors (11)
  1. Yaqing Wang (59 papers)
  2. Jialin Wu (30 papers)
  3. Tanmaya Dabral (1 paper)
  4. Jiageng Zhang (6 papers)
  5. Geoff Brown (6 papers)
  6. Chun-Ta Lu (20 papers)
  7. Frederick Liu (27 papers)
  8. Yi Liang (58 papers)
  9. Bo Pang (77 papers)
  10. Michael Bendersky (63 papers)
  11. Radu Soricut (54 papers)
Citations (11)