Reinforcement Learning of Multi-Domain Dialog Policies Via Action Embeddings (2207.00468v1)

Published 1 Jul 2022 in cs.CL and cs.LG

Abstract: Learning task-oriented dialog policies via reinforcement learning typically requires large amounts of interaction with users, which in practice renders such methods unusable for real-world applications. To reduce the data requirements, we propose to leverage data from across different dialog domains, thereby reducing the amount of data required from each given domain. In particular, we propose to learn domain-agnostic action embeddings, which capture general-purpose structure that informs the system how to act given the current dialog context, and are then specialized to a specific domain. On a set of simulated domains, we show that this approach learns with significantly less interaction with users (a 35% reduction in the number of dialogs required) and reaches a higher level of proficiency than training separate policies for each domain.
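The core idea, a shared state encoder paired with per-domain action embeddings, can be illustrated with a minimal sketch. All dimensions, domain names, and the dot-product scoring rule below are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
STATE_DIM, EMBED_DIM = 8, 4
N_ACTIONS = {"restaurant": 5, "hotel": 7}  # per-domain action counts

# Domain-agnostic parameters: project the dialog state into the
# shared action-embedding space (learned once, reused across domains).
W_shared = rng.normal(size=(EMBED_DIM, STATE_DIM))

# Domain-specific action embeddings: one row per concrete action,
# specializing the shared representation to each domain.
action_embeds = {d: rng.normal(size=(n, EMBED_DIM)) for d, n in N_ACTIONS.items()}

def policy(state, domain):
    """Score each of the domain's actions by the dot product between the
    shared state projection and that action's embedding, then softmax."""
    z = W_shared @ state                # domain-agnostic representation
    logits = action_embeds[domain] @ z  # per-action scores
    e = np.exp(logits - logits.max())
    return e / e.sum()

probs = policy(rng.normal(size=STATE_DIM), "hotel")
print(probs.shape)
```

In this sketch, only `action_embeds` would need to be learned for a new domain, while `W_shared` transfers across domains, which is what reduces the per-domain data requirement.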

Authors (4)
  1. Jorge A. Mendez (11 papers)
  2. Alborz Geramifard (22 papers)
  3. Mohammad Ghavamzadeh (97 papers)
  4. Bing Liu (211 papers)
Citations (6)