
Semi-supervised Bootstrapping of Dialogue State Trackers for Task Oriented Modelling (1911.11672v1)

Published 26 Nov 2019 in cs.CL

Abstract: Dialogue systems benefit greatly from optimizing on detailed annotations, such as transcribed utterances, internal dialogue state representations and dialogue act labels. However, collecting these annotations is expensive and time-consuming, holding back development in the area of dialogue modelling. In this paper, we investigate semi-supervised learning methods that are able to reduce the amount of required intermediate labelling. We find that by leveraging un-annotated data instead, the amount of turn-level annotations of dialogue state can be significantly reduced when building a neural dialogue system. Our analysis on the MultiWOZ corpus, covering a range of domains and topics, finds that annotations can be reduced by up to 30\% while maintaining equivalent system performance. We also describe and evaluate the first end-to-end dialogue model created for the MultiWOZ corpus.

Authors (6)
  1. Bo-Hsiang Tseng (20 papers)
  2. Marek Rei (52 papers)
  3. Paweł Budzianowski (27 papers)
  4. Richard E. Turner (112 papers)
  5. Bill Byrne (57 papers)
  6. Anna Korhonen (90 papers)
Citations (4)