
Self-DC: When to retrieve and When to generate? Self Divide-and-Conquer for Compositional Unknown Questions (2402.13514v1)

Published 21 Feb 2024 in cs.CL and cs.AI

Abstract: Retrieve-then-read and generate-then-read are two typical solutions for handling unknown and known questions in open-domain question answering: the former retrieves the necessary external knowledge, while the latter prompts the LLM to generate the internal knowledge encoded in its parameters. However, few previous works consider compositional unknown questions, which consist of several known or unknown sub-questions. Simple binary classification (known or unknown) therefore becomes sub-optimal and inefficient, since it calls external retrieval excessively for each compositional unknown question. To this end, we propose the first Compositional unknown Question-Answering dataset (CuQA) and introduce a Self Divide-and-Conquer (Self-DC) framework that empowers LLMs to adaptively call different methods on demand, resulting in better performance and efficiency. Experimental results on two datasets (CuQA and FreshQA) demonstrate that Self-DC can achieve comparable or even better performance with far fewer retrieval calls than several strong baselines.
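
The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how a question is judged known or unknown, so the confidence score, the `low`/`high` thresholds, the depth limit, and every function name here (`self_dc_answer`, `confidence`, `generate`, `retrieve`, `decompose`, `combine`) are hypothetical stand-ins. The sketch only shows the general shape: answer from parameters when the question looks known, retrieve when it looks unknown, and otherwise split the question and handle each sub-question separately.

```python
from typing import Callable, List


def self_dc_answer(
    question: str,
    confidence: Callable[[str], float],
    generate: Callable[[str], str],
    retrieve: Callable[[str], str],
    decompose: Callable[[str], List[str]],
    combine: Callable[[str, List[str]], str],
    low: float = 0.3,    # assumed threshold, not taken from the paper
    high: float = 0.7,   # assumed threshold, not taken from the paper
    max_depth: int = 2,  # assumed recursion limit
    _depth: int = 0,
) -> str:
    """Route a question to generation, retrieval, or decomposition."""
    score = confidence(question)
    if score >= high:
        # Looks known: answer from the model's own parameters.
        return generate(question)
    if score <= low or _depth >= max_depth:
        # Looks unknown (or recursion budget spent): use external retrieval.
        return retrieve(question)
    # Uncertain: treat as compositional and recurse on each sub-question.
    sub_answers = [
        self_dc_answer(sub, confidence, generate, retrieve, decompose,
                       combine, low, high, max_depth, _depth + 1)
        for sub in decompose(question)
    ]
    return combine(question, sub_answers)


# Toy usage with stub components (all values are made up for illustration).
scores = {"composite": 0.5, "known part": 0.9, "unknown part": 0.1}
retrieval_log: List[str] = []


def toy_retrieve(q: str) -> str:
    retrieval_log.append(q)  # record each retrieval call
    return f"retrieved({q})"


result = self_dc_answer(
    "composite",
    confidence=lambda q: scores[q],
    generate=lambda q: f"generated({q})",
    retrieve=toy_retrieve,
    decompose=lambda q: ["known part", "unknown part"],
    combine=lambda q, answers: " + ".join(answers),
)
print(result)         # generated(known part) + retrieved(unknown part)
print(retrieval_log)  # ['unknown part']
```

In this toy run only the unknown sub-question triggers retrieval, whereas a binary known/unknown classifier would have sent the whole composite question to the retriever.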

Authors (8)
  1. Hongru Wang (62 papers)
  2. Boyang Xue (23 papers)
  3. Baohang Zhou (5 papers)
  4. Tianhua Zhang (10 papers)
  5. Cunxiang Wang (30 papers)
  6. Guanhua Chen (71 papers)
  7. Huimin Wang (24 papers)
  8. Kam-Fai Wong (92 papers)
Citations (15)