Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
41 tokens/sec
GPT-4o
60 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
8 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Data-Efficient Methods for Dialogue Systems (2012.02929v1)

Published 5 Dec 2020 in cs.CL

Abstract: Conversational User Interface (CUI) has become ubiquitous in everyday life, in consumer-focused products like Siri and Alexa or business-oriented solutions. Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts. Trained with smaller data, these methods end up severely lacking robustness (e.g. to disfluencies and out-of-domain input), and often just have too little generalisation power. In this thesis, we address the above issues by introducing a series of methods for training robust dialogue systems from minimal data. Firstly, we study two orthogonal approaches to dialogue: linguistically informed and machine learning-based - from the data efficiency perspective. We outline the steps to obtain data-efficient solutions with either approach. We then introduce two data-efficient models for dialogue response generation: the Dialogue Knowledge Transfer Network based on latent variable dialogue representations, and the hybrid Generative-Retrieval Transformer model (ranked first at the DSTC 8 Fast Domain Adaptation task). Next, we address the problem of robustness given minimal data. As such, propose a multitask LSTM-based model for domain-general disfluency detection. For the problem of out-of-domain input, we present Turn Dropout, a data augmentation technique for anomaly detection only using in-domain data, and introduce autoencoder-augmented models for efficient training with Turn Dropout. Finally, we focus on social dialogue and introduce a neural model for response ranking in social conversation used in Alana, the 3rd place winner in the Amazon Alexa Prize 2017 and 2018. We employ a novel technique of predicting the dialogue length as the main ranking objective and show that this approach improves upon the ratings-based counterpart in terms of data efficiency while matching it in performance.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (1)
  1. Igor Shalyminov (20 papers)