
Multimodal Representations Learning Based on Mutual Information Maximization and Minimization and Identity Embedding for Multimodal Sentiment Analysis (2201.03969v2)

Published 10 Jan 2022 in cs.LG, cs.CL, and cs.CV

Abstract: Multimodal sentiment analysis (MSA) is a fundamentally complex research problem due to the heterogeneity gap between modalities and the ambiguity of human emotional expression. Although there have been many successful attempts to construct multimodal representations for MSA, two challenges remain: 1) a more robust multimodal representation is needed to bridge the heterogeneity gap and cope with complex multimodal interactions, and 2) contextual dynamics must be modeled effectively throughout the information flow. In this work, we propose a multimodal representation model based on Mutual information Maximization and Minimization and Identity Embedding (MMMIE). We combine mutual information maximization between modal pairs with mutual information minimization between input data and the corresponding features to mine modal-invariant and task-related information. Furthermore, Identity Embedding is proposed to prompt the downstream network to perceive contextual information. Experimental results on two public datasets demonstrate the effectiveness of the proposed model.
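The page gives no implementation details, so the following is a rough, hypothetical sketch only: mutual information maximization between paired modality representations is commonly estimated with an InfoNCE-style contrastive lower bound. The function name, temperature parameter, and feature shapes below are our assumptions for illustration, not details from the paper.

```python
# Hypothetical sketch, NOT the authors' code: an InfoNCE-style contrastive
# lower bound, a common estimator for maximizing mutual information between
# paired modality representations (e.g., text and audio of one utterance).
import torch
import torch.nn.functional as F

def infonce_mi_lower_bound(z_a: torch.Tensor,
                           z_b: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """Contrastive bound on I(z_a; z_b) for a batch of paired samples.

    z_a, z_b: (batch, dim) features from two modalities; row i of each
    tensor comes from the same utterance (the positive pair), and all
    other rows in the batch serve as negatives.
    """
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature      # (batch, batch) similarities
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Maximizing the MI bound == minimizing this symmetrized cross-entropy.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Toy usage: 8 utterances with 128-dim text and audio features.
z_text = torch.randn(8, 128, requires_grad=True)
z_audio = torch.randn(8, 128, requires_grad=True)
loss = infonce_mi_lower_bound(z_text, z_audio)
loss.backward()
```

Note that MMMIE also minimizes mutual information between input data and the corresponding features, which requires an upper-bound estimator (a variational bound such as CLUB is one common choice) rather than the lower bound shown here; the abstract does not specify which estimators the paper uses.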

Authors (4)
  1. Jiahao Zheng (9 papers)
  2. Sen Zhang (86 papers)
  3. Xiaoping Wang (56 papers)
  4. Zhigang Zeng (28 papers)
Citations (6)
