
Integrating Dialog History into End-to-End Spoken Language Understanding Systems (2108.08405v1)

Published 18 Aug 2021 in cs.CL, cs.SD, and eess.AS

Abstract: End-to-end spoken language understanding (SLU) systems that process human-human or human-computer interactions are often context independent and process each turn of a conversation independently. Spoken conversations, on the other hand, are very much context dependent, and dialog history contains useful information that can improve the processing of each conversational turn. In this paper, we investigate the importance of dialog history and how it can be effectively integrated into end-to-end SLU systems. While processing a spoken utterance, our proposed RNN transducer (RNN-T) based SLU model has access to its dialog history in the form of decoded transcripts and SLU labels of previous turns. We encode the dialog history as BERT embeddings and use them as an additional input to the SLU model, along with the speech features for the current utterance. We evaluate our approach on a recently released spoken dialog data set, the HarperValleyBank corpus. We observe significant improvements, 8% for dialog action recognition and 30% for caller intent recognition, in comparison to a competitive context-independent end-to-end baseline system.
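
The idea in the abstract (turning previous turns into a fixed-size vector and feeding it to the model next to the speech features) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not say how the embedding is combined with the acoustic input, so per-frame concatenation is only one plausible choice; `toy_text_embedding` is a hashed bag-of-words placeholder standing in for a BERT embedding; and the function names, label strings, and dimensions are hypothetical.

```python
from typing import Dict, List, Sequence


def format_history(turns: Sequence[Dict], max_turns: int = 3) -> str:
    """Flatten the most recent turns (decoded transcript plus SLU labels) into one string."""
    recent = list(turns[-max_turns:]) if max_turns > 0 else []
    parts = []
    for turn in recent:
        labels = " ".join(turn["labels"])
        parts.append(f"{turn['transcript']} {labels}".strip())
    return " [SEP] ".join(parts)


def toy_text_embedding(text: str, dim: int = 8) -> List[float]:
    """Placeholder for a BERT embedding (NOT BERT): hashed bag of words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        # Deterministic bucket per token; a real system would call a BERT encoder here.
        vec[sum(ord(c) for c in token) % dim] += 1.0
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm > 0 else vec


def append_history(frames: Sequence[Sequence[float]],
                   history_vec: Sequence[float]) -> List[List[float]]:
    """Assumed fusion: append the same history vector to every speech frame."""
    return [list(frame) + list(history_vec) for frame in frames]


# Hypothetical previous turns; transcripts and labels are made up for this example.
history = [
    {"transcript": "hello", "labels": ["greeting"]},
    {"transcript": "i lost my card", "labels": ["data_question"]},
]
history_text = format_history(history)
history_vec = toy_text_embedding(history_text, dim=8)

# Three dummy 4-dimensional acoustic frames for the current utterance.
speech_frames = [[0.1, 0.2, 0.3, 0.4]] * 3
model_input = append_history(speech_frames, history_vec)
```

With this layout each frame grows from 4 to 12 dimensions, and the first turn of a conversation (empty history) receives an all-zero history vector, which keeps the input shape constant across turns.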

Authors (7)
  1. Jatin Ganhotra (16 papers)
  2. Samuel Thomas (42 papers)
  3. Hong-Kwang J. Kuo (11 papers)
  4. Sachindra Joshi (32 papers)
  5. George Saon (39 papers)
  6. Brian Kingsbury (54 papers)
  7. Zoltán Tüske (7 papers)
Citations (10)