
OLISIA: a Cascade System for Spoken Dialogue State Tracking (2304.11073v3)

Published 20 Apr 2023 in eess.AS, cs.AI, cs.CL, and cs.SD

Abstract: Though Dialogue State Tracking (DST) is a core component of spoken dialogue systems, recent work on this task mostly deals with chat corpora, disregarding the discrepancies between spoken and written language. In this paper, we propose OLISIA, a cascade system which integrates an Automatic Speech Recognition (ASR) model and a DST model. We introduce several adaptations in the ASR and DST modules to improve integration and robustness to spoken conversations. With these adaptations, our system ranked first in DSTC11 Track 3, a benchmark to evaluate spoken DST. We conduct an in-depth analysis of the results and find that normalizing the ASR outputs, adapting the DST inputs through data augmentation, and increasing the size of the pre-trained models all play an important role in reducing the performance discrepancy between written and spoken conversations.
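
To make the cascade idea concrete, here is a minimal sketch of how an ASR step, a normalization step, and a DST step could be chained turn by turn. It is not the OLISIA implementation: the function names (`normalize_transcript`, `cascade_dst`, `toy_dst`), the number-word table, the slot names, and the rule-based tracker are all hypothetical stand-ins, and the ASR model is replaced by a pass-through stub so the example runs without audio.

```python
import re
from typing import Callable, Dict, List

# Hypothetical number-word table, used only for this illustration.
_NUMBER_WORDS = {
    "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}

State = Dict[str, str]


def normalize_transcript(text: str) -> str:
    """Toy normalization: lowercase, drop punctuation, map number words to digits."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)  # keep word characters, spaces, apostrophes
    tokens = [_NUMBER_WORDS.get(tok, tok) for tok in text.split()]
    return " ".join(tokens)


def cascade_dst(
    audio_turns: List[str],
    asr: Callable[[str], str],
    dst: Callable[[List[str], State], State],
) -> State:
    """Run ASR, normalize its output, then update the dialogue state, one turn at a time."""
    history: List[str] = []
    state: State = {}
    for audio in audio_turns:
        transcript = normalize_transcript(asr(audio))
        history.append(transcript)
        state = dst(history, state)
    return state


def toy_dst(history: List[str], state: State) -> State:
    """Rule-based stand-in for a learned DST model (assumption, not the paper's model)."""
    new_state = dict(state)
    last_turn = history[-1]
    match = re.search(r"for (\d+) people", last_turn)
    if match:
        new_state["restaurant-book people"] = match.group(1)  # hypothetical slot name
    if "centre" in last_turn.split():
        new_state["restaurant-area"] = "centre"  # hypothetical slot name
    return new_state


def fake_asr(audio: str) -> str:
    """Pass-through stub: the 'audio' is already text in this sketch."""
    return audio
```

Calling `cascade_dst(["I need a table for Four people.", "In the centre, please!"], fake_asr, toy_dst)` threads each normalized transcript into the tracker. In this toy setup the normalization step is what lets the digit-based rule match the spoken form "Four", which loosely mirrors why the abstract singles out ASR output normalization.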

Authors (6)
  1. Léo Jacqmin (3 papers)
  2. Lucas Druart (4 papers)
  3. Yannick Estève (45 papers)
  4. Lina Maria Rojas-Barahona (7 papers)
  5. Valentin Vielzeuf (17 papers)
  6. Benoît Favre (3 papers)
Citations (3)