Manipulating the Label Space for In-Context Classification (2312.00351v2)

Published 1 Dec 2023 in cs.CV

Abstract: After pre-training by generating the next word conditional on previous words, the Language Model (LM) acquires the ability of In-Context Learning (ICL), which lets it learn a new task conditional on the context of the given in-context examples (ICEs). Similarly, visually-conditioned language modeling is used to train Vision-Language Models (VLMs) with ICL ability. However, such VLMs typically exhibit weaker classification abilities than contrastive learning-based models like CLIP, since the language modeling objective does not directly contrast whether an object is paired with a text. A straightforward way to improve in-context classification is to use more ICEs to provide more knowledge. However, this may greatly increase the selection time and, more importantly, including additional in-context images tends to extend the in-context sequence beyond the processing capacity of a VLM. To alleviate these limitations, we propose to manipulate the label space of each ICE to increase its knowledge density, allowing fewer ICEs to convey as much information as a larger set would. Specifically, we propose two strategies, Label Distribution Enhancement and Visual Descriptions Enhancement, to improve in-context classification performance on diverse datasets, including the classic ImageNet and more fine-grained datasets like CUB-200. On ImageNet, our approach increases accuracy from 74.70% in a 4-shot setting to 76.21% with just 2 shots, surpassing CLIP by 0.67%. On CUB-200, our method raises 1-shot accuracy from 48.86% to 69.05%, 12.15% higher than CLIP. The code is available at https://anonymous.4open.science/r/MLS_ICC.

Authors (6)

Haokun Chen (26 papers)
Xu Yang (222 papers)
Yuhang Huang (14 papers)
Zihan Wu (18 papers)
Jing Wang (740 papers)
Xin Geng (90 papers)

Citations (2)
