An Empirical Study of Parameter Efficient Fine-tuning on Vision-Language Pre-train Model (2403.08433v2)
Abstract: Recent studies have applied Parameter-Efficient Fine-Tuning (PEFT) techniques to efficiently narrow the performance gap between pre-training and downstream tasks. Two factors are important for any PEFT: the accessible data size and the fine-tunable parameter size. A natural expectation is that PEFT performance is positively related to both. However, based on an evaluation of five PEFTs on two downstream vision-language (VL) tasks, we find that this intuition holds only when the downstream data and task are not consistent with pre-training. When downstream fine-tuning is consistent with pre-training, data size no longer affects performance, while the influence of fine-tunable parameter size is not monotonic. We believe these observations can guide the choice of training strategy for various PEFTs.
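The notion of "fine-tunable parameter size" is easiest to see in code. Below is a minimal sketch of one representative PEFT, a LoRA-style low-rank adapter, assuming a PyTorch environment; the class name, dimensions, and hyperparameters are illustrative only and are not the paper's implementation or exact experimental setup. The pre-trained weight stays frozen, and the number of trainable parameters is controlled by the rank `r`.

```python
# Illustrative LoRA-style adapter (not the paper's code): the base weight is
# frozen and only the low-rank factors A and B are fine-tuned, so the
# fine-tunable parameter size scales with the rank r.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (alpha/r) * B A x."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weight stays frozen
        self.base.bias.requires_grad_(False)
        # Only these two factors are trainable; their size grows with r.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    layer = LoRALinear(768, 768, r=8)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    print(f"trainable params: {trainable} / {total}")  # only the low-rank factors
```

Varying `r` (or, for other PEFTs, the prompt length or adapter bottleneck width) is what the study treats as changing the fine-tunable parameter size.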
- Yuxin Tian
- Mouxing Yang
- Yunfan Li
- Dayiheng Liu
- Xingzhang Ren
- Xi Peng
- Jiancheng Lv