UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling (2302.06605v2)

Published 13 Feb 2023 in cs.CV and cs.CL

Abstract: Large-scale vision-language pre-trained models have shown promising transferability to various downstream tasks. As the size of these foundation models and the number of downstream tasks grow, the standard full fine-tuning paradigm becomes unsustainable due to heavy computational and storage costs. This paper proposes UniAdapter, which unifies unimodal and multimodal adapters for parameter-efficient cross-modal adaptation on pre-trained vision-language models. Specifically, adapters are distributed to different modalities and their interactions, with the total number of tunable parameters reduced by partial weight sharing. The unified and knowledge-sharing design enables powerful cross-modal representations that can benefit various downstream tasks, requiring only 1.0%-2.0% tunable parameters of the pre-trained model. Extensive experiments on 6 cross-modal downstream benchmarks (including video-text retrieval, image-text retrieval, VideoQA, and VQA) show that in most cases, UniAdapter not only outperforms the state of the art, but even beats the full fine-tuning strategy. Particularly, on the MSRVTT retrieval task, UniAdapter achieves 49.7% recall@1 with 2.2% model parameters, outperforming the latest competitors by 2.0%. The code and models are available at https://github.com/RERV/UniAdapter.
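As a rough illustration of the adapter design sketched in the abstract, the snippet below shows a minimal PyTorch bottleneck adapter whose down-projection can be shared across modality-specific adapters. The class name, dimensions, and sharing layout are assumptions for illustration only, not the paper's exact architecture; the actual UniAdapter design and weight-sharing scheme are defined in the released code at https://github.com/RERV/UniAdapter.

```python
import torch
import torch.nn as nn
from typing import Optional


class BottleneckAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add.
    Inserted into a frozen vision-language backbone; only these weights are tuned.
    (Illustrative sketch, not the paper's exact implementation.)"""

    def __init__(self, hidden_dim: int, bottleneck_dim: int,
                 shared_down: Optional[nn.Linear] = None):
        super().__init__()
        # Partial weight sharing: reuse one down-projection across the visual,
        # textual, and cross-modal adapters (hypothetical sharing layout).
        self.down = shared_down if shared_down is not None else nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        nn.init.zeros_(self.up.weight)  # zero-init so the adapter starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))


# Hypothetical dimensions; real values depend on the backbone.
hidden_dim, bottleneck_dim = 768, 64
shared_down = nn.Linear(hidden_dim, bottleneck_dim)  # shared across modalities

visual_adapter = BottleneckAdapter(hidden_dim, bottleneck_dim, shared_down)
text_adapter = BottleneckAdapter(hidden_dim, bottleneck_dim, shared_down)
cross_adapter = BottleneckAdapter(hidden_dim, bottleneck_dim, shared_down)

# The backbone stays frozen; only adapter parameters are updated during tuning.
x = torch.randn(2, 16, hidden_dim)   # (batch, tokens, hidden)
print(visual_adapter(x).shape)       # torch.Size([2, 16, 768])
```

Sharing the down-projection is one way the tunable-parameter count can be reduced while keeping per-modality up-projections; the paper's reported 1.0%-2.0% tunable-parameter budget comes from its specific unified design.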

Authors (7)
  1. Haoyu Lu (24 papers)
  2. Yuqi Huo (19 papers)
  3. Guoxing Yang (11 papers)
  4. Zhiwu Lu (51 papers)
  5. Wei Zhan (130 papers)
  6. Masayoshi Tomizuka (261 papers)
  7. Mingyu Ding (82 papers)
Citations (23)
