
MEmoBERT: Pre-training Model with Prompt-based Learning for Multimodal Emotion Recognition (2111.00865v1)

Published 27 Oct 2021 in cs.CV and eess.IV

Abstract: Research on multimodal emotion recognition is hindered by the lack of labelled corpora in terms of scale and diversity, due to the high annotation cost and label ambiguity. In this paper, we propose a pre-training model MEmoBERT for multimodal emotion recognition, which learns multimodal joint representations through self-supervised learning from large-scale unlabeled video data. Furthermore, unlike the conventional "pre-train, finetune" paradigm, we propose a prompt-based method that reformulates the downstream emotion classification task as a masked text prediction task, bringing the downstream task closer to pre-training. Extensive experiments on two benchmark datasets, IEMOCAP and MSP-IMPROV, show that our proposed MEmoBERT significantly enhances emotion recognition performance.
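
The following is a minimal sketch of the prompt-based idea described in the abstract: emotion classification recast as masked text prediction. It is not the authors' MEmoBERT implementation; it substitutes a text-only BERT masked language model from Hugging Face Transformers, and the prompt template ("I am [MASK].") and label-word mapping are illustrative assumptions.

```python
# Sketch: emotion classification as masked text prediction (prompt-based).
# NOT the authors' MEmoBERT; uses a plain text-only BERT as a stand-in.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical mapping from emotion classes to single-token label words.
label_words = {"happy": "happy", "sad": "sad", "angry": "angry", "neutral": "neutral"}

utterance = "I can't believe we finally won the game!"
# Assumed prompt template: append a cloze phrase whose [MASK] is the emotion word.
prompt = f"{utterance} I am [MASK]."

inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

with torch.no_grad():
    logits = model(**inputs).logits[0, mask_pos, :].squeeze(0)

# Score each emotion by the masked-LM logit of its label word at the [MASK] slot.
scores = {
    emotion: logits[tokenizer.convert_tokens_to_ids(word)].item()
    for emotion, word in label_words.items()
}
print(max(scores, key=scores.get))
```

In the paper's setting the same cloze-style prediction would be conditioned on multimodal (text, speech, visual) inputs learned during pre-training; the text-only version above only illustrates how the classification head is replaced by label-word prediction.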

Authors (5)
  1. Jinming Zhao (26 papers)
  2. Ruichen Li (19 papers)
  3. Qin Jin (94 papers)
  4. Xinchao Wang (203 papers)
  5. Haizhou Li (286 papers)
Citations (22)
