FILM: How can Few-Shot Image Classification Benefit from Pre-Trained Language Models? (2307.04114v1)

Published 9 Jul 2023 in cs.LG, cs.AI, cs.CL, cs.CV, and cs.MM

Abstract: Few-shot learning aims to train models that generalize to novel classes from only a few samples. Recently, a line of work has been proposed to enhance few-shot learning with readily accessible semantic information from class names. However, these works focus on improving existing modules of the standard few-shot learning framework, such as visual prototypes and feature extractors, which limits how fully the semantic information can be used. In this paper, we propose a novel few-shot learning framework that uses pre-trained language models based on contrastive learning. To address the challenge of aligning visual features with textual embeddings obtained from a text-based pre-trained language model, we carefully design the textual branch of our framework and introduce a metric module that generalizes the cosine similarity. For better transferability, we let the metric module adapt to different few-shot tasks and adopt MAML to train the model via bi-level optimization. Moreover, we conduct extensive experiments on multiple benchmarks to demonstrate the effectiveness of our method.
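
The abstract mentions a metric module that generalizes cosine similarity between visual features and class-name text embeddings, but gives no details of its form. As a rough illustration only, the sketch below assumes one simple possibility: a bilinear form `v^T M t` on L2-normalized vectors, which reduces to plain cosine similarity when `M` is the identity. The function names, the matrix `M`, the temperature value, and the synthetic embeddings are all assumptions made for this example; they are not taken from the paper, and the MAML bi-level training loop is not shown.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def generalized_cosine_logits(visual, text, M=None, temperature=0.1):
    """Similarity logits between image features and class text embeddings.

    Hypothetical form: (v / |v|)^T M (t / |t|) / temperature.
    With M = identity this is ordinary cosine similarity.
    """
    v = l2_normalize(visual)          # (n_images, dim)
    t = l2_normalize(text)            # (n_classes, dim)
    if M is None:
        M = np.eye(v.shape[-1])       # fall back to plain cosine similarity
    return (v @ M @ t.T) / temperature

def contrastive_loss(logits, labels):
    """Cross-entropy over classes: pulls each image toward its class text embedding."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Synthetic 5-way task: stand-ins for text embeddings and image features.
rng = np.random.default_rng(0)
n_way, dim = 5, 16
text_emb = rng.normal(size=(n_way, dim))
labels = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])
visual_feat = text_emb[labels] + 0.1 * rng.normal(size=(len(labels), dim))

logits = generalized_cosine_logits(visual_feat, text_emb)
predictions = logits.argmax(axis=1)   # predicted class = most similar class name
loss = contrastive_loss(logits, labels)
```

In a setup like this, `M` would be the learnable part that could be adapted per task, while classification simply picks the class whose text embedding scores highest against the image feature.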

Authors (5)

Zihao Jiang
Yunkai Dang
Dong Pang
Huishuai Zhang
Weiran Huang
