Alternating Recurrent Dialog Model with Large-scale Pre-trained Language Models (1910.03756v3)

Published 9 Oct 2019 in cs.CL and cs.AI

Abstract: Existing dialog system models require extensive human annotations and are difficult to generalize to different tasks. The recent success of large pre-trained language models such as BERT and GPT-2 (Devlin et al., 2019; Radford et al., 2019) has suggested the effectiveness of incorporating language priors in downstream NLP tasks. However, how much pre-trained language models can help dialog response generation is still under exploration. In this paper, we propose a simple, general, and effective framework: Alternating Roles Dialog Model (ARDM). ARDM models each speaker separately and takes advantage of large pre-trained language models. It requires no supervision from human annotations such as belief states or dialog acts to achieve effective conversations. ARDM outperforms or is on par with state-of-the-art methods on two popular task-oriented dialog datasets: CamRest676 and MultiWOZ. Moreover, we can generalize ARDM to more challenging, non-collaborative tasks such as persuasion. In persuasion tasks, ARDM is capable of generating human-like responses to persuade people to donate to a charity.
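
Below is a minimal sketch of the alternating-roles idea described in the abstract, assuming GPT-2 loaded through the Hugging Face `transformers` library. The role names, loss masking, and toy dialog are illustrative assumptions, not the paper's exact training recipe: each speaker is scored by its own pre-trained language model conditioned on the raw dialog history, with no belief-state or dialog-act annotations.

```python
# Sketch of alternating-roles dialog modeling (assumptions: GPT-2 via
# Hugging Face transformers; role setup and masking are illustrative,
# not the paper's exact configuration).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# One pre-trained language model per speaker role (user / system).
speaker_models = {
    "user": GPT2LMHeadModel.from_pretrained("gpt2"),
    "system": GPT2LMHeadModel.from_pretrained("gpt2"),
}

def turn_loss(history_text, response_text, role):
    """Language-modeling loss for one turn, scored by that speaker's model.

    The model conditions on the raw dialog history; only the current
    response tokens contribute to the loss (history labels are masked).
    """
    model = speaker_models[role]
    history_ids = tokenizer.encode(history_text)
    response_ids = tokenizer.encode(response_text + tokenizer.eos_token)
    input_ids = torch.tensor([history_ids + response_ids])
    labels = torch.tensor([[-100] * len(history_ids) + response_ids])
    return model(input_ids, labels=labels).loss

# Toy dialog: alternate roles and accumulate each speaker's loss separately.
dialog = [
    ("user", "I need a cheap restaurant in the centre."),
    ("system", "The Dojo Noodle Bar is a cheap place in the centre."),
]
history = ""
for role, utterance in dialog:
    loss = turn_loss(history, utterance, role)
    loss.backward()  # in training, step each speaker's optimizer here
    history += utterance + " "
```

This keeps the framework annotation-free: the only supervision is the next-token objective over each speaker's own utterances, which is what lets the same setup transfer from task-oriented datasets like CamRest676 and MultiWOZ to non-collaborative settings such as persuasion.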

Authors (4)
  1. Qingyang Wu (29 papers)
  2. Yichi Zhang (184 papers)
  3. Yu Li (377 papers)
  4. Zhou Yu (206 papers)
Citations (61)