
A Two-Stage Multimodal Emotion Recognition Model Based on Graph Contrastive Learning (2401.01495v1)

Published 3 Jan 2024 in cs.CL

Abstract: In human-computer interaction, correctly understanding a user's emotional state in conversation is increasingly important, so the task of multimodal emotion recognition (MER) has received growing attention. However, existing emotion classification methods usually perform classification only once, and sentences are easily misclassified in a single round of classification. Previous work also tends to ignore the similarities and differences between features of different modalities during fusion. To address these issues, we propose a two-stage emotion recognition model based on graph contrastive learning (TS-GCL). First, we encode the original dataset with modality-specific preprocessing. Second, a graph contrastive learning (GCL) strategy is applied to the three structurally distinct modalities to learn similarities and differences within and between modalities. Finally, we apply an MLP twice to obtain the final emotion classification. This staged classification helps the model focus on different levels of emotional information, thereby improving performance. Extensive experiments show that TS-GCL outperforms previous methods on the IEMOCAP and MELD datasets.
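
The pipeline described in the abstract (per-modality encoding, a contrastive objective across modalities, then two MLP classification passes) can be illustrated with a minimal sketch. The PyTorch code below is a hypothetical reconstruction, not the authors' TS-GCL implementation: the encoders, dimensions, fusion scheme, and loss are assumptions, and the paper's graph construction over utterances is omitted in favor of a plain cross-modal InfoNCE loss standing in for the GCL objective.

```python
# Minimal sketch of a two-stage classifier with a cross-modal contrastive loss.
# Hypothetical names and dimensions; not the authors' TS-GCL implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStageClassifier(nn.Module):
    def __init__(self, text_dim=768, audio_dim=128, video_dim=512,
                 hidden_dim=256, num_emotions=6):
        super().__init__()
        # Per-modality encoders project heterogeneous features to a shared space.
        self.text_enc = nn.Linear(text_dim, hidden_dim)
        self.audio_enc = nn.Linear(audio_dim, hidden_dim)
        self.video_enc = nn.Linear(video_dim, hidden_dim)
        # Stage 1: coarse classification from the fused representation.
        self.mlp1 = nn.Sequential(nn.Linear(3 * hidden_dim, hidden_dim),
                                  nn.ReLU(), nn.Linear(hidden_dim, num_emotions))
        # Stage 2: refined classification from the fused representation plus stage-1 logits.
        self.mlp2 = nn.Sequential(nn.Linear(3 * hidden_dim + num_emotions, hidden_dim),
                                  nn.ReLU(), nn.Linear(hidden_dim, num_emotions))

    def forward(self, text, audio, video):
        t = self.text_enc(text)
        a = self.audio_enc(audio)
        v = self.video_enc(video)
        fused = torch.cat([t, a, v], dim=-1)
        logits1 = self.mlp1(fused)                                 # first-pass prediction
        logits2 = self.mlp2(torch.cat([fused, logits1], dim=-1))   # refined prediction
        return (t, a, v), logits1, logits2

def cross_modal_contrastive_loss(anchor, positive, temperature=0.1):
    """InfoNCE-style loss: the same utterance seen through two modalities forms a
    positive pair; other utterances in the batch act as negatives. This is an
    assumed stand-in for the paper's graph contrastive learning objective."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    logits = anchor @ positive.t() / temperature    # (batch, batch) similarity matrix
    targets = torch.arange(anchor.size(0))
    return F.cross_entropy(logits, targets)

# Usage sketch with random features for a batch of 8 utterances.
model = TwoStageClassifier()
text, audio, video = torch.randn(8, 768), torch.randn(8, 128), torch.randn(8, 512)
labels = torch.randint(0, 6, (8,))
(t, a, v), logits1, logits2 = model(text, audio, video)
loss = (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels)
        + cross_modal_contrastive_loss(t, a) + cross_modal_contrastive_loss(t, v))
```

Supervising both classification stages while adding the contrastive term reflects the staged design the abstract describes; how the two stages and the contrastive objective are actually weighted and structured in TS-GCL is not specified here.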

Authors (6)
  1. Wei Ai (48 papers)
  2. FuChen Zhang (5 papers)
  3. Tao Meng (48 papers)
  4. HongEn Shao (4 papers)
  5. Keqin Li (61 papers)
  6. Yuntao Shou (28 papers)
Citations (8)
