Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models (1910.03756v3)

Published 9 Oct 2019 in cs.CL and cs.AI

Abstract: Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Roles Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of large pre-trained language models. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating human-like responses to persuade people to donate to a charity.
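
Below is a minimal sketch of the alternating-roles idea described in the abstract, assuming GPT-2 loaded through the Hugging Face `transformers` library. The role names, loss masking, and toy dialog are illustrative assumptions, not the paper's exact training recipe: each speaker is scored by its own pre-trained language model conditioned on the raw dialog history, with no belief-state or dialog-act annotations.

```python
# Sketch of alternating-roles dialog modeling (assumptions: GPT-2 via
# Hugging Face transformers; role setup and masking are illustrative,
# not the paper's exact configuration).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# One pre-trained language model per speaker role (user / system).
speaker_models = {
    "user": GPT2LMHeadModel.from_pretrained("gpt2"),
    "system": GPT2LMHeadModel.from_pretrained("gpt2"),
}

def turn_loss(history_text, response_text, role):
    """Language-modeling loss for one turn, scored by that speaker's model.

    The model conditions on the raw dialog history; only the current
    response tokens contribute to the loss (history labels are masked).
    """
    model = speaker_models[role]
    history_ids = tokenizer.encode(history_text)
    response_ids = tokenizer.encode(response_text + tokenizer.eos_token)
    input_ids = torch.tensor([history_ids + response_ids])
    labels = torch.tensor([[-100] * len(history_ids) + response_ids])
    return model(input_ids, labels=labels).loss

# Toy dialog: alternate roles and accumulate each speaker's loss separately.
dialog = [
    ("user", "I need a cheap restaurant in the centre."),
    ("system", "The Dojo Noodle Bar is a cheap place in the centre."),
]
history = ""
for role, utterance in dialog:
    loss = turn_loss(history, utterance, role)
    loss.backward()  # in training, step each speaker's optimizer here
    history += utterance + " "
```

This keeps the framework annotation-free: the only supervision is the next-token objective over each speaker's own utterances, which is what lets the same setup transfer from task-oriented datasets like CamRest676 and MultiWOZ to non-collaborative settings such as persuasion.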

Authors (4)
  1. Qingyang Wu (29 papers)
  2. Yichi Zhang (184 papers)
  3. Yu Li (377 papers)
  4. Zhou Yu (206 papers)
Citations (61)