
Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests (2406.07794v2)

Published 12 Jun 2024 in cs.CL and cs.AI

Abstract: Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?", are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While LLMs can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. Moreover, existing task-oriented dialogue benchmarks lack sufficient examples of complex discourse phenomena such as indirectness. To address this, we propose a set of linguistic criteria along with an LLM-based pipeline for generating realistic IURs to test natural language understanding (NLU) and dialogue state tracking (DST) models before deployment in a new domain. We also release IndirectRequests, a dataset of IURs based on the Schema Guided Dialog (SGD) corpus, as a comparative testbed for evaluating the performance of smaller models in handling indirect requests.
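
To make the evaluation idea in the abstract concrete, here is a minimal sketch of how an indirect request could be paired with the slot value it implies and then used to score a model. It is not the authors' pipeline or the IndirectRequests data format: the `IndirectRequest` class, the `temperature_change` slot name, the `slot_accuracy` function, and the `keyword_baseline` predictor are all hypothetical names introduced only for illustration. The two utterances are the examples quoted in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class IndirectRequest:
    """One test case: an utterance and the slot value it implies.

    Hypothetical structure, not the schema of the released dataset.
    """
    utterance: str
    slot: str
    value: str


def slot_accuracy(
    examples: List[IndirectRequest],
    predict: Callable[[str], Dict[str, str]],
) -> float:
    """Fraction of examples whose target slot is predicted with the right value."""
    if not examples:
        return 0.0
    correct = sum(
        1 for ex in examples if predict(ex.utterance).get(ex.slot) == ex.value
    )
    return correct / len(examples)


def keyword_baseline(utterance: str) -> Dict[str, str]:
    """Toy stand-in for a small NLU model: it only matches the literal phrasing."""
    if "increase the temperature" in utterance.lower():
        return {"temperature_change": "increase"}
    return {}


# Slot name "temperature_change" is an assumption made for this sketch.
examples = [
    IndirectRequest(
        "Could you please increase the temperature?", "temperature_change", "increase"
    ),
    IndirectRequest("It's cold in here", "temperature_change", "increase"),
]

score = slot_accuracy(examples, keyword_baseline)
print(f"slot accuracy: {score:.2f}")
```

The keyword baseline resolves the direct request but misses the indirect one, which is the kind of gap a testbed of IURs is meant to expose.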

Authors (6)
  1. Amogh Mannekote (6 papers)
  2. Jinseok Nam (5 papers)
  3. Ziming Li (44 papers)
  4. Jian Gao (119 papers)
  5. Kristy Elizabeth Boyer (7 papers)
  6. Bonnie J. Dorr (20 papers)