Is one brick enough to break the wall of spoken dialogue state tracking? (2311.04923v3)

Published 3 Nov 2023 in cs.CL, cs.AI, eess.AS, and eess.SP

Abstract: In Task-Oriented Dialogue (TOD) systems, correctly updating the system's understanding of the user's requests (a.k.a. dialogue state tracking) is key to a smooth interaction. Traditionally, TOD systems perform this update in three steps: transcription of the user's utterance, semantic extraction of the key concepts, and contextualization with the previously identified concepts. Such cascade approaches suffer from error propagation and separately optimized components. End-to-end approaches have proven helpful up to the turn-level semantic extraction step. This paper goes one step further and provides (1) a novel approach for completely neural spoken DST, (2) an in-depth comparison with a state-of-the-art cascade approach, and (3) avenues towards better context propagation. Our study highlights that jointly optimized approaches are also competitive for contextually dependent tasks, such as Dialogue State Tracking (DST), especially in audio-native settings. Context propagation in DST systems could benefit from training procedures that account for the inherent uncertainty of the previous context.
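The three-step cascade the abstract describes (transcription, turn-level semantic extraction, contextualization) can be sketched as a minimal pipeline. This is a hypothetical illustration of the data flow only: all function bodies below are stand-ins (the actual systems use neural ASR and SLU models), and the slot-value format is invented for the example.

```python
def transcribe(audio: str) -> str:
    """Stand-in ASR step: a real cascade would run a speech recognizer here.
    Errors introduced at this stage propagate to every later step."""
    return audio  # pretend the "audio" is already its transcript

def extract_semantics(transcript: str) -> dict:
    """Stand-in turn-level SLU: maps one utterance to slot-value pairs,
    assuming a toy 'slot=value, slot=value' surface form."""
    slots = {}
    for fragment in transcript.split(","):
        if "=" in fragment:
            slot, value = fragment.split("=", 1)
            slots[slot.strip()] = value.strip()
    return slots

def contextualize(previous_state: dict, turn_slots: dict) -> dict:
    """Stand-in DST update: merge the new turn's concepts into the
    accumulated dialogue state (later turns override earlier values)."""
    state = dict(previous_state)
    state.update(turn_slots)
    return state

# A two-turn dialogue: the user first asks for Italian food in the centre,
# then changes their mind about the cuisine.
state: dict = {}
for audio_turn in ["area=centre, food=italian", "food=chinese"]:
    transcript = transcribe(audio_turn)
    turn_slots = extract_semantics(transcript)
    state = contextualize(state, turn_slots)

print(state)  # {'area': 'centre', 'food': 'chinese'}
```

An end-to-end approach, by contrast, would replace the three separate calls inside the loop with a single jointly optimized model mapping audio and prior context directly to the updated state, which is what the paper's "completely neural spoken DST" refers to.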

Authors (3)
  1. Lucas Druart (4 papers)
  2. Valentin Vielzeuf (17 papers)
  3. Yannick Estève (45 papers)
