Self-DC: When to retrieve and When to generate? Self Divide-and-Conquer for Compositional Unknown Questions (2402.13514v1)

Published 21 Feb 2024 in cs.CL and cs.AI

Abstract: Retrieve-then-read and generate-then-read are two typical solutions for handling unknown and known questions in open-domain question answering: the former retrieves the necessary external knowledge, while the latter prompts the LLM to generate knowledge already encoded in its parameters. However, few previous works consider compositional unknown questions, which consist of several known or unknown sub-questions. Simple binary classification (known or unknown) is therefore sub-optimal and inefficient, since it calls external retrieval excessively for each compositional unknown question. To this end, we propose the first Compositional unknown Question-Answering dataset (CuQA) and introduce a Self Divide-and-Conquer (Self-DC) framework that empowers LLMs to adaptively call different methods on demand, resulting in better performance and efficiency. Experimental results on two datasets (CuQA and FreshQA) demonstrate that Self-DC achieves comparable or even better performance with far fewer retrieval calls than several strong baselines.
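
The routing idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how a question is judged known or unknown, so the confidence score, the `low` and `high` thresholds, the depth limit, and every function name below (`self_dc`, `confidence`, `generate`, `retrieve`, `decompose`, `combine`) are hypothetical placeholders chosen for illustration.

```python
from typing import Callable, List


def self_dc(
    question: str,
    confidence: Callable[[str], float],      # hypothetical: model's self-estimated confidence in [0, 1]
    generate: Callable[[str], str],          # hypothetical: generate-then-read, no retrieval
    retrieve: Callable[[str], str],          # hypothetical: retrieve-then-read, one retrieval call
    decompose: Callable[[str], List[str]],   # hypothetical: split a question into sub-questions
    combine: Callable[[str, List[str]], str],  # hypothetical: merge sub-answers into one answer
    low: float = 0.3,                        # assumed threshold, not taken from the paper
    high: float = 0.7,                       # assumed threshold, not taken from the paper
    depth: int = 0,
    max_depth: int = 2,                      # assumed recursion limit
) -> str:
    """Answer a question by routing it to generation, retrieval, or decomposition."""
    score = confidence(question)

    # Treated as known: answer from the model's parameters.
    if score >= high:
        return generate(question)

    # Treated as unknown, or too deep to keep splitting: call retrieval.
    if score <= low or depth >= max_depth:
        return retrieve(question)

    # Uncertain: divide into sub-questions and conquer each one separately,
    # so that only the unknown parts trigger a retrieval call.
    sub_questions = decompose(question)
    if len(sub_questions) <= 1:
        # Nothing to divide, so fall back to retrieval.
        return retrieve(question)

    sub_answers = [
        self_dc(sub, confidence, generate, retrieve, decompose, combine,
                low, high, depth + 1, max_depth)
        for sub in sub_questions
    ]
    return combine(question, sub_answers)
```

Under these assumptions, a compositional question with one known and one unknown sub-question costs a single retrieval call, whereas a binary known/unknown classifier would send the whole question to retrieval.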

Authors (8)

Hongru Wang
Boyang Xue
Baohang Zhou
Tianhua Zhang
Cunxiang Wang
Guanhua Chen
Huimin Wang
Kam-Fai Wong

Citations (15)

Tweets

https://twitter.com/WangCarrey/status/1898252276306792745

https://twitter.com/WangCarrey/status/1767052916563878255

https://twitter.com/WangCarrey/status/1763945174332572029

https://twitter.com/_reachsumit/status/1760546682327245222
