DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation (2204.13031v2)

Published 27 Apr 2022 in cs.CL

Abstract: Dialog response generation in the open domain is an important research topic, where the main challenge is to generate relevant and diverse responses. In this paper, we propose a new dialog pre-training framework called DialogVED, which introduces continuous latent variables into an enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses. With the help of a large dialog corpus (Reddit), we pre-train the model using the following four tasks adopted in language models (LMs) and variational autoencoders (VAEs): 1) masked language modeling; 2) response generation; 3) bag-of-words prediction; and 4) KL divergence reduction. We also add parameters to model the turn structure in dialogs to improve the performance of the pre-trained model. We conduct experiments on the PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation. Experimental results show that our model achieves new state-of-the-art results on all of these datasets.
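
The four objectives above combine a standard encoder-decoder pre-training recipe with the evidence-lower-bound terms of a VAE. As a rough illustration only (not the paper's actual implementation), the sketch below shows how such a combined loss could be assembled in PyTorch; every function and tensor name here (`dialogved_pretraining_loss`, `z_mean`, `kl_weight`, and so on) is hypothetical, and the paper's real loss weighting and KL-annealing details may differ.

```python
import torch
import torch.nn.functional as F

def dialogved_pretraining_loss(mlm_logits, mlm_labels,
                               dec_logits, response_labels,
                               bow_logits, response_bow,
                               z_mean, z_logvar,
                               kl_weight=1.0):
    """Hypothetical combination of the four DialogVED-style objectives."""
    # 1) Masked language modeling on the encoder side
    #    (unmasked positions carry label -100 and are ignored).
    mlm_loss = F.cross_entropy(
        mlm_logits.view(-1, mlm_logits.size(-1)),
        mlm_labels.view(-1), ignore_index=-100)

    # 2) Response generation: token-level cross-entropy on the decoder.
    rg_loss = F.cross_entropy(
        dec_logits.view(-1, dec_logits.size(-1)),
        response_labels.view(-1), ignore_index=-100)

    # 3) Bag-of-words prediction: the latent variable predicts which
    #    vocabulary items appear in the response, ignoring word order.
    #    response_bow is a multi-hot (batch, vocab) indicator tensor.
    bow_log_probs = F.log_softmax(bow_logits, dim=-1)
    bow_loss = -(response_bow * bow_log_probs).sum(dim=-1).mean()

    # 4) KL divergence between the approximate posterior
    #    N(z_mean, exp(z_logvar)) and the standard normal prior N(0, I).
    kl_loss = -0.5 * (1 + z_logvar - z_mean.pow(2)
                      - z_logvar.exp()).sum(dim=-1).mean()

    return mlm_loss + rg_loss + bow_loss + kl_weight * kl_loss
```

Pairing a bag-of-words term with the KL term is the usual CVAE device for mitigating posterior collapse: it forces the latent variable to carry response-level content even when the decoder alone could fit the target tokens.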

Authors (12)
  1. Wei Chen (1288 papers)
  2. Yeyun Gong (78 papers)
  3. Song Wang (313 papers)
  4. Bolun Yao (4 papers)
  5. Weizhen Qi (15 papers)
  6. Zhongyu Wei (98 papers)
  7. Xiaowu Hu (2 papers)
  8. Bartuer Zhou (4 papers)
  9. Yi Mao (78 papers)
  10. Weizhu Chen (128 papers)
  11. Biao Cheng (4 papers)
  12. Nan Duan (172 papers)
Citations (47)