TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations (2112.12441v1)

Published 23 Dec 2021 in cs.CL

Abstract: Task-oriented dialogue systems have been plagued by the difficulty of obtaining large-scale, high-quality annotated conversations. Furthermore, most publicly available datasets include only written conversations, which are insufficient to reflect actual human behavior in practical spoken dialogue systems. In this paper, we propose Task-oriented Dialogue Data Augmentation (TOD-DA), a novel model-agnostic data augmentation paradigm to boost the robustness of task-oriented dialogue modeling on spoken conversations. TOD-DA consists of two modules: 1) Dialogue Enrichment, which expands training data on task-oriented conversations to ease data sparsity, and 2) Spoken Conversation Simulator, which imitates oral-style expressions and speech recognition errors at diverse granularities to bridge the gap between written and spoken conversations. With these designs, our approach ranked first in both tasks of DSTC10 Track 2, a benchmark for task-oriented dialogue modeling on spoken conversations, demonstrating the superiority and effectiveness of our proposed TOD-DA.
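The Spoken Conversation Simulator described above perturbs written utterances to resemble ASR output. As an illustration only (the function, probabilities, and perturbation choices below are assumptions, not the paper's implementation), a minimal sketch of noise injection at two granularities might look like:

```python
import random

def simulate_asr_errors(utterance, word_drop_p=0.1, char_swap_p=0.05, seed=None):
    """Toy sketch: inject word- and character-level perturbations into a
    written utterance to roughly mimic speech recognition errors.
    NOT the paper's Spoken Conversation Simulator; parameters are illustrative."""
    rng = random.Random(seed)
    words = []
    for word in utterance.split():
        # Word-level granularity: occasionally drop a word entirely,
        # mimicking an ASR deletion error.
        if rng.random() < word_drop_p:
            continue
        # Character-level granularity: occasionally swap adjacent characters,
        # mimicking a substitution/garbling error.
        chars = list(word)
        for i in range(len(chars) - 1):
            if rng.random() < char_swap_p:
                chars[i], chars[i + 1] = chars[i + 1], chars[i]
        words.append("".join(chars))
    return " ".join(words)

clean = "i would like to book a table for two tonight"
noisy = simulate_asr_errors(clean, seed=0)
print(noisy)
```

Augmented pairs of (clean, noisy) utterances produced this way could then supplement training data, in the spirit of the model-agnostic paradigm the abstract describes.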

Authors (14)
  1. Xin Tian (39 papers)
  2. Xinxian Huang (2 papers)
  3. Dongfeng He (1 paper)
  4. Yingzhan Lin (6 papers)
  5. Siqi Bao (21 papers)
  6. Huang He (14 papers)
  7. Liankai Huang (3 papers)
  8. Qiang Ju (5 papers)
  9. Xiyuan Zhang (31 papers)
  10. Jian Xie (39 papers)
  11. Shuqi Sun (5 papers)
  12. Fan Wang (312 papers)
  13. Hua Wu (191 papers)
  14. Haifeng Wang (194 papers)
Citations (17)