
"How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations (2109.13489v1)

Published 28 Sep 2021 in cs.CL

Abstract: Most prior work in dialogue modeling has focused on written conversations, largely because of the data sets available. However, written dialogues are not sufficient to fully capture the nature of spoken conversations or the speech recognition errors that arise in practical spoken dialogue systems. This work presents a new benchmark on spoken task-oriented conversations, which is intended to study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling. We report that existing state-of-the-art models trained on written conversations do not perform well on our spoken data, as expected. Furthermore, we observe improvements in task performance when leveraging n-best speech recognition hypotheses, for example by combining predictions based on individual hypotheses. Our data set enables speech-based benchmarking of task-oriented dialogue systems.
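
As an illustration of what "combining predictions based on individual hypotheses" could look like for dialogue state tracking, here is a minimal sketch. It assumes each n-best ASR hypothesis has already been passed through some tracker that returns a slot-to-value dictionary, and it combines those dictionaries with a (optionally weighted) vote per slot. The function name `combine_nbest_predictions`, the slot names, and the weighting scheme are hypothetical and chosen for illustration; the abstract does not specify the combination method the authors used.

```python
from collections import defaultdict


def combine_nbest_predictions(predictions, weights=None):
    """Combine per-hypothesis slot predictions with a weighted vote.

    predictions: list of dicts (slot -> value), one per ASR hypothesis.
    weights: optional list of non-negative floats, one per hypothesis
             (for example ASR confidences). Defaults to equal weights.
    Returns a dict mapping each slot to its highest-scoring value.
    NOTE: this voting scheme is an assumption, not the paper's method.
    """
    if not predictions:
        return {}
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("weights and predictions must have the same length")

    # scores[slot][value] accumulates the weight of every hypothesis
    # whose prediction assigned `value` to `slot`.
    scores = defaultdict(lambda: defaultdict(float))
    for pred, weight in zip(predictions, weights):
        for slot, value in pred.items():
            scores[slot][value] += weight

    # Pick the best-scoring value for each slot.
    return {slot: max(vals, key=vals.get) for slot, vals in scores.items()}


# Hypothetical tracker outputs for three ASR hypotheses of one user turn.
nbest_predictions = [
    {"hotel-area": "north", "hotel-stars": "4"},
    {"hotel-area": "north"},
    {"hotel-area": "south", "hotel-stars": "4"},
]
combined = combine_nbest_predictions(nbest_predictions)
```

With equal weights, a value wins a slot when more hypotheses agree on it than on any alternative; passing ASR confidences as `weights` lets a single high-confidence hypothesis outweigh several low-confidence ones.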

Authors (7)
  1. Seokhwan Kim (29 papers)
  2. Yang Liu (2253 papers)
  3. Di Jin (104 papers)
  4. Alexandros Papangelis (23 papers)
  5. Karthik Gopalakrishnan (34 papers)
  6. Behnam Hedayatnia (27 papers)
  7. Dilek Hakkani-Tur (94 papers)
Citations (38)