
S2M: Converting Single-Turn to Multi-Turn Datasets for Conversational Question Answering (2312.16511v1)

Published 27 Dec 2023 in cs.CL and cs.AI

Abstract: Supplying data augmentation to conversational question answering (CQA) can effectively improve model performance. However, single-turn datasets yield limited improvement in CQA due to the distribution gap between single-turn and multi-turn datasets. Meanwhile, although numerous single-turn datasets are available, they have not been utilized effectively. To solve this problem, we propose a novel method to convert single-turn datasets into multi-turn datasets. The proposed method consists of three parts: a QA pair Generator, a QA pair Reassembler, and a question Rewriter. Given a sample consisting of a context and single-turn QA pairs, the Generator obtains candidate QA pairs and a knowledge graph based on the context. The Reassembler utilizes the knowledge graph to produce sequential QA pairs, and the Rewriter rewrites the questions from a conversational perspective to obtain a multi-turn dataset, S2M. Our experiments show that our method can synthesize effective training resources for CQA. Notably, S2M ranked 1st on the QuAC leaderboard at the time of submission (Aug 24th, 2022).
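The Generator / Reassembler / Rewriter pipeline from the abstract can be sketched as three composable stages. This is a hedged, minimal illustration only: every class, function, and heuristic below (e.g. `QAPair`, the sentence-per-pair generator, the pronoun-substitution rewriter) is an assumption for demonstration, not the authors' implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch of the three-stage S2M pipeline. All names and
# heuristics are illustrative assumptions, not the paper's actual method.

@dataclass
class QAPair:
    question: str
    answer: str
    entity: str  # knowledge-graph node this pair is grounded in

def generate(context: str):
    """Generator: derive candidate QA pairs and a knowledge graph
    (entity -> related entities) from the context. Toy stand-in:
    one QA pair per sentence, with a linear entity graph."""
    pairs, graph = [], {}
    sentences = [s.strip() for s in context.split(".") if s.strip()]
    for i, sent in enumerate(sentences):
        entity = f"e{i}"
        pairs.append(QAPair(f"What about {entity}?", sent, entity))
        graph[entity] = [f"e{i + 1}"] if i + 1 < len(sentences) else []
    return pairs, graph

def reassemble(pairs, graph):
    """Reassembler: order QA pairs by walking the knowledge graph so
    that consecutive turns involve related entities."""
    by_entity = {p.entity: p for p in pairs}
    ordered, node = [], next(iter(graph))
    while node in by_entity:
        ordered.append(by_entity[node])
        successors = graph.get(node, [])
        node = successors[0] if successors else None
    return ordered

def rewrite(ordered):
    """Rewriter: make later questions conversational, here with a toy
    heuristic that replaces the repeated entity with a pronoun."""
    turns = []
    for i, p in enumerate(ordered):
        q = p.question if i == 0 else p.question.replace(p.entity, "that")
        turns.append((q, p.answer))
    return turns

if __name__ == "__main__":
    context = "Alice founded the lab. The lab studies QA. QA needs data."
    pairs, graph = generate(context)
    conversation = rewrite(reassemble(pairs, graph))
    for q, a in conversation:
        print(q, "->", a)
```

The key design point is the separation of concerns: generation and reassembly operate on content (what to ask, in what order), while rewriting only touches surface form, so each stage can be improved independently.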

Authors (9)
  1. Baokui Li (2 papers)
  2. Sen Zhang (86 papers)
  3. Wangshu Zhang (3 papers)
  4. Yicheng Chen (24 papers)
  5. Changlin Yang (9 papers)
  6. Sen Hu (32 papers)
  7. Teng Xu (21 papers)
  8. Siye Liu (2 papers)
  9. Jiwei Li (137 papers)
Citations (1)
