What Makes Good Few-shot Examples for Vision-Language Models? (2405.13532v1)

Published 22 May 2024 in cs.CV

Abstract: Despite the notable advancements achieved by leveraging pre-trained vision-language (VL) models through few-shot tuning for downstream tasks, our detailed empirical study highlights a significant dependence of few-shot learning outcomes on the careful selection of training examples - a facet that has been previously overlooked in research. In this study, we delve into devising more effective strategies for the meticulous selection of few-shot training examples, as opposed to relying on random sampling, to enhance the potential of existing few-shot prompt learning methodologies. To achieve this, we assess the effectiveness of various Active Learning (AL) techniques for instance selection, such as Entropy and Margin of Confidence, within the context of few-shot training. Furthermore, we introduce two innovative selection methods - Representativeness (REPRE) and Gaussian Monte Carlo (Montecarlo) - designed to proactively pinpoint informative examples for labeling in relation to pre-trained VL models. Our findings demonstrate that both REPRE and Montecarlo significantly surpass both random selection and AL-based strategies in few-shot training scenarios. The research also underscores that these instance selection methods are model-agnostic, offering a versatile enhancement to a wide array of few-shot training methodologies.
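
The abstract does not include pseudocode, so the following is a minimal, illustrative sketch rather than the authors' implementation. It assumes precomputed, L2-normalized CLIP image embeddings for the unlabeled pool (`image_feats`) and class-prompt text embeddings (`text_feats`), both hypothetical names. The Entropy and Margin of Confidence scores follow their standard active-learning definitions; the Representativeness scoring shown (similarity to a per-pseudo-class centroid) is one plausible reading of REPRE, and the Gaussian Monte Carlo method is omitted because the abstract gives too little detail to sketch it responsibly.

```python
# Illustrative sketch only (not the authors' released code): scoring an
# unlabeled pool for few-shot example selection with precomputed features.
import numpy as np

def zero_shot_probs(image_feats: np.ndarray, text_feats: np.ndarray,
                    temperature: float = 100.0) -> np.ndarray:
    """Softmax over image-text cosine similarities (CLIP-style zero-shot).

    Assumes both feature matrices are L2-normalized: image_feats is (N, D),
    text_feats is (C, D) with one row per class prompt.
    """
    logits = temperature * image_feats @ text_feats.T   # (N, C)
    logits -= logits.max(axis=1, keepdims=True)         # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def entropy_scores(probs: np.ndarray) -> np.ndarray:
    """Predictive entropy; higher means the VL model is more uncertain."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def margin_scores(probs: np.ndarray) -> np.ndarray:
    """Margin of confidence: top-1 minus top-2 probability; lower = more uncertain."""
    sorted_p = np.sort(probs, axis=1)
    return sorted_p[:, -1] - sorted_p[:, -2]

def representativeness_scores(image_feats: np.ndarray,
                              pseudo_labels: np.ndarray) -> np.ndarray:
    """One plausible REPRE-style score: cosine similarity of each image to its
    pseudo-class centroid, so prototypical examples rank highest."""
    scores = np.empty(len(image_feats))
    for c in np.unique(pseudo_labels):
        idx = np.where(pseudo_labels == c)[0]
        centroid = image_feats[idx].mean(axis=0)
        centroid /= np.linalg.norm(centroid) + 1e-12
        scores[idx] = image_feats[idx] @ centroid
    return scores

def select_few_shot(image_feats: np.ndarray, text_feats: np.ndarray,
                    shots_per_class: int = 4) -> np.ndarray:
    """Pick the most representative pool images per pseudo-class for labeling."""
    probs = zero_shot_probs(image_feats, text_feats)
    pseudo = probs.argmax(axis=1)
    repre = representativeness_scores(image_feats, pseudo)
    chosen = []
    for c in np.unique(pseudo):
        idx = np.where(pseudo == c)[0]
        chosen.extend(idx[np.argsort(-repre[idx])[:shots_per_class]].tolist())
    return np.asarray(chosen)
```

In this reading, entropy and margin favor the examples the zero-shot model is least sure about (classic active learning), while the representativeness score instead favors prototypical examples near each class centroid; the two families can therefore select very different few-shot sets, which is consistent with the abstract's finding that the selection strategy materially affects few-shot tuning outcomes.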

Authors (6)
  1. Zhaojun Guo (2 papers)
  2. Jinghui Lu (28 papers)
  3. Xuejing Liu (14 papers)
  4. Rui Zhao (241 papers)
  5. Fei Tan (25 papers)
  6. Zhenxing Qian (54 papers)
