
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training (2203.09313v3)

Published 17 Mar 2022 in cs.CL and cs.AI

Abstract: Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on presenting and evaluating the conversational performance of the released dialogue model, ignoring key factors behind a powerful, human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investigate these under-explored factors, including data quality control, model architecture design, training approaches, and decoding strategies. We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and make our models and code publicly available. Automatic and human evaluations show that EVA2.0 significantly outperforms other open-source counterparts. We also discuss the limitations of this work by presenting failure cases, and we outline future research directions for large-scale Chinese open-domain dialogue systems.
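Since the abstract names decoding strategies as one of the investigated factors, the sketch below shows how a sampling-based decoding strategy (top-p/nucleus sampling) might be configured for a seq2seq dialogue model using the Hugging Face `transformers` API. This is a minimal illustration, not the paper's method: the checkpoint name is a hypothetical placeholder (not the official EVA2.0 release), and the decoding hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch: configuring top-p (nucleus) sampling for a
# seq2seq open-domain dialogue model with Hugging Face transformers.
# The checkpoint name is a placeholder, NOT the official EVA2.0 release.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "some-org/chinese-dialogue-2b8"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

context = "你好，最近在忙什么？"  # "Hi, what have you been up to lately?"
inputs = tokenizer(context, return_tensors="pt")

# Sampling-based decoding: top-p sampling is a common choice for
# open-domain chit-chat because it trades some determinism for
# response diversity; the values below are illustrative.
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_p=0.9,
    temperature=0.9,
    max_new_tokens=64,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy or beam-search decoding could be swapped in by setting `do_sample=False` (and `num_beams` for beam search); comparing such strategies is the kind of analysis the abstract describes.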

Authors (11)
  1. Yuxian Gu (21 papers)
  2. Jiaxin Wen (16 papers)
  3. Hao Sun (383 papers)
  4. Yi Song (34 papers)
  5. Pei Ke (37 papers)
  6. Chujie Zheng (35 papers)
  7. Zheng Zhang (486 papers)
  8. Jianzhu Yao (4 papers)
  9. Lei Liu (332 papers)
  10. Xiaoyan Zhu (54 papers)
  11. Minlie Huang (225 papers)
Citations (49)