DialogueLLM: Context and Emotion Knowledge-Tuned Large Language Models for Emotion Recognition in Conversations (2310.11374v4)
Abstract: Large language models (LLMs) and their variants have shown extraordinary efficacy across numerous downstream NLP tasks, opening a new direction for the development of NLP. Despite their remarkable performance in natural language generation (NLG), LLMs lack a distinct focus on the emotion understanding domain. As a result, using LLMs for emotion recognition may lead to suboptimal precision. Another limitation of LLMs is that they are typically trained without leveraging multi-modal information. To overcome these limitations, we propose DialogueLLM, a context and emotion knowledge-tuned LLM obtained by fine-tuning LLaMA models with 13,638 multi-modal (i.e., text and video) emotional dialogues. The visual information serves as supplementary knowledge for constructing high-quality instructions. We offer a comprehensive evaluation of our proposed model on three benchmark emotion recognition in conversations (ERC) datasets and compare the results against SOTA baselines and other SOTA LLMs. Additionally, DialogueLLM-7B can be easily trained using LoRA on a 40GB A100 GPU in 5 hours, facilitating reproducibility for other researchers.
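To illustrate why LoRA makes training on a single GPU feasible, the following is a minimal pure-Python sketch of the general LoRA idea: the pretrained weight W stays frozen and only a low-rank update B·A, scaled by alpha/r, is learned. It is not the authors' training code; the rank, the scaling factor `alpha`, the toy matrix sizes, and the helper names `matmul`, `lora_effective_weight`, and `trainable_fraction` are all assumptions chosen for illustration, since the abstract does not state the LoRA configuration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    Shapes: W is (d_out, d_in), A is (r, d_in), B is (d_out, r).
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_fraction(d_out, d_in, r):
    """Fraction of a layer's parameters that are trained when only A and B are updated."""
    return r * (d_in + d_out) / (d_out * d_in)


# Toy sizes (hypothetical); real projection matrices are far larger.
random.seed(0)
d_out, d_in, r = 4, 6, 2
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]  # B starts at zero, so training begins from W itself

w_eff = lora_effective_weight(w, a, b, alpha=16)
```

Because B is initialised to zero, `w_eff` equals `w` before any training step. For a square 4096 × 4096 projection with an assumed rank of 8, `trainable_fraction(4096, 4096, 8)` is about 0.4% of that layer's parameters, which is the kind of reduction that keeps memory use within a single 40GB GPU.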
- Yazhou Zhang
- Mengyao Wang
- Youxi Wu
- Prayag Tiwari
- Qiuchi Li
- Benyou Wang
- Jing Qin