
Building a Role Specified Open-Domain Dialogue System Leveraging Large-Scale Language Models (2205.00176v1)

Published 30 Apr 2022 in cs.CL

Abstract: Recent open-domain dialogue models have brought numerous breakthroughs. However, building a chat system is not scalable since it often requires a considerable volume of human-human dialogue data, especially when enforcing features such as persona, style, or safety. In this work, we study the challenge of imposing roles on open-domain dialogue systems, with the goal of making the systems maintain consistent roles while conversing naturally with humans. To accomplish this, the system must satisfy a role specification that includes certain conditions on the stated features as well as a system policy on whether or not certain types of utterances are allowed. For this, we propose an efficient data collection framework leveraging in-context few-shot learning of large-scale language models for building a role-satisfying dialogue dataset from scratch. We then compare various architectures for open-domain dialogue systems in terms of meeting role specifications while maintaining conversational abilities. Automatic and human evaluations show that our models return few out-of-bounds utterances, keeping competitive performance on general metrics. We release a Korean dialogue dataset we built for further research.
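The data collection framework described in the abstract relies on in-context few-shot learning: a role specification and a handful of example dialogues are placed in the prompt, and the language model generates new role-consistent dialogues. The paper does not publish its prompt format, so the following is a minimal illustrative sketch with hypothetical names and layout, not the authors' implementation:

```python
# Illustrative sketch (not the authors' code): assembling an in-context
# few-shot prompt that asks a large-scale language model to generate
# dialogues satisfying a role specification. The prompt layout, role
# text, and example dialogues below are all hypothetical assumptions.

def build_fewshot_prompt(role_spec, examples):
    """Assemble a prompt: role description, a few example dialogues,
    then an open slot for the model to continue a new dialogue."""
    parts = [f"Role specification: {role_spec}", ""]
    for i, dialogue in enumerate(examples, 1):
        parts.append(f"Example dialogue {i}:")
        for speaker, utterance in dialogue:
            parts.append(f"{speaker}: {utterance}")
        parts.append("")
    # Leave the final dialogue open so the model completes it.
    parts.append("New dialogue:")
    parts.append("User:")
    return "\n".join(parts)

# Hypothetical seed example in the spirit of the paper's care-call setting.
examples = [
    [("User", "Hello, who are you?"),
     ("Bot", "Hi! I'm a care-call assistant. How are you feeling today?")],
]
prompt = build_fewshot_prompt(
    "A polite care-call agent; never gives medical advice.", examples)
print(prompt)
```

The resulting string would be sent to the language model's completion endpoint; generated dialogues can then be filtered against the role specification's system policy before being added to the dataset.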

Authors (7)
  1. Sanghwan Bae (10 papers)
  2. Donghyun Kwak (12 papers)
  3. Sungdong Kim (30 papers)
  4. Donghoon Ham (4 papers)
  5. Soyoung Kang (7 papers)
  6. Sang-Woo Lee (34 papers)
  7. Woomyoung Park (7 papers)
Citations (29)