
An Empirical Study of Parameter Efficient Fine-tuning on Vision-Language Pre-train Model (2403.08433v2)

Published 13 Mar 2024 in cs.CV

Abstract: Recent studies have applied Parameter-Efficient Fine-Tuning techniques (PEFTs) to efficiently narrow the performance gap between pre-training and downstream tasks. Two factors matter for any PEFT: the accessible data size and the fine-tunable parameter size. A natural expectation is that PEFT performance is positively related to both. However, based on an evaluation of five PEFTs on two downstream vision-language (VL) tasks, we find that this intuition holds only when the downstream data and task are not consistent with pre-training. For downstream fine-tuning that is consistent with pre-training, data size no longer affects performance, and the influence of fine-tunable parameter size is not monotonic. We believe this observation can guide the choice of training strategy for various PEFTs.
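The "fine-tunable parameter size" axis discussed in the abstract is easiest to see in a concrete PEFT. Below is a minimal, illustrative sketch (assuming PyTorch) using LoRA-style low-rank adaptation as a representative PEFT; the abstract does not name the five PEFTs it evaluates, so this is an example of the general idea rather than the paper's exact setup. The rank `r` directly controls how many parameters are trainable while the pre-trained weights stay frozen.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (LoRA-style sketch)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pre-trained weights
            p.requires_grad = False
        in_f, out_f = base.in_features, base.out_features
        # Only these two small matrices are fine-tuned; their size scales with r.
        self.lora_a = nn.Parameter(torch.randn(r, in_f) * 0.01)  # down-projection
        self.lora_b = nn.Parameter(torch.zeros(out_f, r))        # up-projection
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Output = frozen pre-trained path + scaled low-rank trainable path.
        return self.base(x) + (x @ self.lora_a.T @ self.lora_b.T) * self.scaling

# Fine-tunable parameter count grows linearly with r: r * (in_features + out_features).
layer = LoRALinear(nn.Linear(768, 768), r=4)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 6144
```

Sweeping `r` (and the amount of downstream data) in a setup like this is one way to probe the two factors the abstract identifies: the trainable parameter budget and the accessible data size.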
