DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models for Emotion Recognition in Conversations (2310.11374v4)

Published 17 Oct 2023 in cs.CL

Abstract: LLMs and their variants have shown extraordinary efficacy across numerous downstream NLP tasks, presenting a new vision for the development of NLP. Despite their remarkable performance in natural language generation (NLG), LLMs lack a distinct focus on emotion understanding; as a result, using them for emotion recognition may yield suboptimal and imprecise results. Another limitation of LLMs is that they are typically trained without leveraging multi-modal information. To overcome these limitations, we propose DialogueLLM, a context- and emotion-knowledge-tuned LLM obtained by fine-tuning LLaMA models on 13,638 multi-modal (i.e., text and video) emotional dialogues. The visual information is treated as supplementary knowledge for constructing high-quality instructions. We offer a comprehensive evaluation of the proposed model on three benchmark emotion recognition in conversations (ERC) datasets and compare the results against SOTA baselines and other SOTA LLMs. Additionally, DialogueLLM-7B can be trained with LoRA on a 40GB A100 GPU in 5 hours, facilitating reproducibility for other researchers.
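To make the training setup concrete, here is a minimal sketch of LoRA fine-tuning a LLaMA-family model on instruction-formatted ERC dialogues, in the spirit the abstract describes. This is not the authors' released code: the base checkpoint name, the `erc_instructions.json` file, its `instruction`/`emotion` fields, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical LoRA fine-tuning sketch (PEFT + Transformers); not the paper's official code.
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

# LoRA adapters on the attention projections; rank/alpha/dropout are illustrative, not from the paper.
lora_cfg = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Each record is assumed to hold an instruction built from the dialogue context
# (plus descriptions of the visual stream) and the gold emotion label as the response.
data = load_dataset("json", data_files="erc_instructions.json")["train"]

def tokenize(example):
    text = f"{example['instruction']}\n### Response: {example['emotion']}"
    return tokenizer(text, truncation=True, max_length=1024)

data = data.map(tokenize, remove_columns=data.column_names)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)  # pads and builds labels

args = TrainingArguments(output_dir="dialogue-llm-lora", per_device_train_batch_size=4,
                         num_train_epochs=3, learning_rate=2e-4, fp16=True)
Trainer(model=model, args=args, train_dataset=data, data_collator=collator).train()
```

Because LoRA freezes the base weights and trains only the low-rank adapter matrices, the memory footprint stays small enough that a single 40GB A100, as mentioned in the abstract, is plausible for the 7B model.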

Authors (7)
  1. Yazhou Zhang (24 papers)
  2. Mengyao Wang (13 papers)
  3. Youxi Wu (16 papers)
  4. Prayag Tiwari (41 papers)
  5. Qiuchi Li (25 papers)
  6. Benyou Wang (109 papers)
  7. Jing Qin (145 papers)
Citations (14)