E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models (2401.00475v3)

Published 31 Dec 2023 in cs.SD and eess.AS

Abstract: This study focuses on emotion-sensitive spoken dialogue in human-machine speech interaction. With the advancement of LLMs, dialogue systems can handle multimodal data, including audio. Recent models have enhanced the understanding of complex audio signals through the integration of various audio events. However, they are unable to generate appropriate responses based on emotional speech. To address this, we introduce the Emotional chat Model (E-chat), a novel spoken dialogue system capable of comprehending and responding to emotions conveyed in speech. This model leverages an emotion embedding extracted by a speech encoder, combined with LLMs, enabling it to respond according to different emotional contexts. Additionally, we introduce the E-chat200 dataset, designed explicitly for emotion-sensitive spoken dialogue. Across various evaluation metrics, E-chat consistently outperforms the baseline model, demonstrating its potential in emotional comprehension and human-machine interaction.

Authors (7)
  1. Hongfei Xue (22 papers)
  2. Yuhao Liang (10 papers)
  3. Bingshen Mu (8 papers)
  4. Shiliang Zhang (132 papers)
  5. Mengzhe Chen (6 papers)
  6. Qian Chen (264 papers)
  7. Lei Xie (337 papers)
Citations (8)
