Enhancing Zero-Shot Facial Expression Recognition by LLM Knowledge Transfer (2405.19100v3)
Abstract: Current facial expression recognition (FER) models are typically trained in a supervised manner and are therefore constrained by the lack of large-scale facial expression images with high-quality annotations. Consequently, these models often generalize poorly, performing badly on unseen images at inference time. Vision-language-based zero-shot models offer a promising way to address this challenge; however, they lack task-specific knowledge and are therefore not optimized for the nuances of recognizing facial expressions. To bridge this gap, this work proposes a novel method, Exp-CLIP, which enhances zero-shot FER by transferring task knowledge from large language models (LLMs). Specifically, on top of the pre-trained vision-language encoders, we incorporate a projection head designed to map the initial joint vision-language space into a space that captures representations of facial actions. To train this projection head for subsequent zero-shot prediction, we propose to align the projected visual representations with task-specific semantic meanings derived from the LLM encoder, where a text-instruction-based strategy is employed to customize the LLM knowledge. With only unlabelled facial data and efficient training of the projection head, Exp-CLIP achieves zero-shot results superior to those of CLIP models and several other large vision-language models (LVLMs) on seven in-the-wild FER datasets. The code and pre-trained models are available at https://github.com/zengqunzhao/Exp-CLIP.
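Below is a minimal PyTorch sketch, not the authors' released implementation, of the training idea described in the abstract: a lightweight projection head sits on top of frozen CLIP visual features and is trained on unlabelled faces to align with instruction-conditioned text embeddings from an LLM encoder. The choice of Flan-T5 as the LLM encoder, the symmetric contrastive loss, and all dimensions and hyperparameters here are assumptions for illustration; consult the linked repository for the exact procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProjectionHead(nn.Module):
    """Maps frozen CLIP features into a space aligned with LLM text semantics."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.proj(x), dim=-1)


def alignment_loss(v: torch.Tensor, t: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric contrastive loss: matched (image, LLM-text) pairs lie on the diagonal."""
    logits = v @ t.t() / temperature                       # (B, B) similarity matrix
    targets = torch.arange(v.size(0), device=v.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


# Placeholder tensors: in practice these would come from the frozen encoders
# (e.g., CLIP ViT image features, and Flan-T5 encoder embeddings of captions
# prefixed with a task instruction such as "focus on the facial expression").
batch, clip_dim, llm_dim = 32, 512, 768
clip_image_feats = torch.randn(batch, clip_dim)            # frozen features, no gradients flow into CLIP
llm_text_feats = F.normalize(torch.randn(batch, llm_dim), dim=-1)

head = ProjectionHead(clip_dim, llm_dim)                    # the only trainable component
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)

for step in range(3):                                       # toy loop; real training iterates over unlabelled faces
    loss = alignment_loss(head(clip_image_feats), llm_text_feats)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss = {loss.item():.4f}")
```

A plausible reading of the abstract is that, at inference, the same trained head is applied to CLIP image features and to CLIP text features of expression prompts (e.g., "a photo of a happy face"), with the predicted class chosen by highest cosine similarity in the projected space.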
Authors: Zengqun Zhao, Yu Cao, Shaogang Gong, Ioannis Patras