
QASC: A Dataset for Question Answering via Sentence Composition (1910.11473v2)

Published 25 Oct 2019 in cs.CL

Abstract: Composing knowledge from multiple pieces of texts is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question. QASC is the first dataset to offer two desirable properties: (a) the facts to be composed are annotated in a large corpus, and (b) the decomposition into these facts is not evident from the question itself. The latter makes retrieval challenging as the system must introduce new concepts or relations in order to discover potential decompositions. Further, the reasoning model must then learn to identify valid compositions of these retrieved facts using common-sense reasoning. To help address these challenges, we provide annotation for supporting facts as well as their composition. Guided by these annotations, we present a two-step approach to mitigate the retrieval challenges. We use other multiple-choice datasets as additional training data to strengthen the reasoning model. Our proposed approach improves over current state-of-the-art LLMs by 11% (absolute). The reasoning and retrieval problems, however, remain unsolved as this model still lags by 20% behind human performance.

An Analytical Review of "QASC: A Dataset for Question Answering via Sentence Composition"

The paper "QASC: A Dataset for Question Answering via Sentence Composition" introduces a novel dataset designed to tackle multi-hop question-answering (QA) challenges, particularly focusing on the synthesis of knowledge from disparate text sources necessitated to answer multiple-choice questions. The authors aim to elevate existing multi-hop reasoning benchmarks by introducing the QASC (Question Answering via Sentence Composition) dataset. This dataset is characterized by its unique demands: it requires a system to not only retrieve relevant facts from a large corpus but also to compose these facts in a way that is not immediately apparent or suggested by the question context itself.

Key Characteristics of QASC

  1. Annotated Fact Composition: QASC explicitly annotates, within a large corpus, the pair of sentences from which each question is derived. The composed fact that these two sentences yield, and that supports the correct answer, is also provided, giving direct supervision for building and evaluating compositional reasoning models.
  2. Implicit Decomposition Challenge: Unlike many datasets where the decomposition of a multi-hop question into its constituent parts is syntactically evident, QASC requires models to infer these decompositions in a less direct manner. This necessitates the introduction of new concepts or connections not immediately suggested by the question text alone, thereby challenging models to engage in greater levels of inference.
  3. Two-step Retrieval Approach: The authors propose a retrieval strategy designed to strengthen a system's ability to surface the relevant supporting facts from the large corpus (a minimal sketch follows this list). This two-step strategy substantially improves the recall of supporting facts and, in turn, question-answering accuracy, yielding a 43-point gain in recall of the gold facts and a 14-point gain in QA accuracy over simpler single-step retrieval baselines.
  4. Performance Metrics: The QASC dataset serves as a stringent test for state-of-the-art language models. Even after fine-tuning on extensive external data, the best-performing model still lags 20% behind human performance, underscoring the complexity and rigor that QASC brings to the multi-hop QA landscape.
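
To make the two-step retrieval idea above concrete, the sketch below shows one way such a strategy can be realized: a first query built from the question and candidate answer, and a second query built from the question words the first retrieved fact leaves uncovered plus the new concepts that fact introduces. The corpus, stopword list, lexical-overlap scoring, and example question are illustrative stand-ins, not the authors' actual pipeline (which runs a full IR system over the multi-million-sentence QASC corpus).

```python
# Minimal two-step retrieval sketch (illustrative only; a toy corpus and
# lexical-overlap scoring stand in for the paper's IR-based retrieval).

STOPWORDS = {"a", "an", "the", "of", "to", "is", "are", "what", "for", "can", "be"}

def content_words(text):
    """Lowercase, tokenize on whitespace, strip punctuation, drop stopwords."""
    return {w.strip("?.,") for w in text.lower().split()} - STOPWORDS - {""}

def overlap(query_words, fact):
    """Score a corpus fact by lexical overlap with the query."""
    return len(query_words & content_words(fact))

def two_step_retrieve(question, answer, corpus, k=3):
    q_words = content_words(question + " " + answer)

    # Step 1: retrieve facts that overlap with the question + candidate answer.
    step1 = sorted(corpus, key=lambda f: overlap(q_words, f), reverse=True)[:k]

    pairs = []
    for f1 in step1:
        f1_words = content_words(f1)
        # Step 2: query with the question words f1 does NOT cover, plus the
        # new concepts f1 introduces, to find a composable partner fact.
        bridge_query = (q_words - f1_words) | (f1_words - q_words)
        step2 = sorted(corpus, key=lambda f: overlap(bridge_query, f), reverse=True)[:k]
        for f2 in step2:
            if f2 == f1:
                continue
            # Rank pairs by how much of the question + answer they jointly cover.
            covered = len(q_words & (f1_words | content_words(f2)))
            pairs.append((covered, f1, f2))

    pairs.sort(key=lambda p: p[0], reverse=True)
    return pairs

corpus = [
    "differential heating of air produces wind",
    "wind is used for producing electricity",
    "the sun causes differential heating of air",
    "solar panels convert sunlight into electricity",
]
question = "Differential heating of air can be harnessed for what?"
answer = "electricity production"

for score, f1, f2 in two_step_retrieve(question, answer, corpus)[:2]:
    print(score, "|", f1, "+", f2)
```

The second query is the key move: because the decomposition is not evident from the question, the bridging concept (here, "wind") only becomes available once the first fact has been retrieved, which is why single-step retrieval underperforms on QASC.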

Implications for Research

The implications of QASC for AI research are multifaceted. The dataset addresses, and by extension challenges, the current paradigms in natural language understanding and reasoning. Systems developed with QASC have to efficiently navigate large-scale corpora to extract, synthesize, and reason with multiple interrelated facts. This pushes the boundary beyond single-fact retrieval and simple inference, marking a significant leap towards achieving nuanced and context-rich language comprehension.

Furthermore, QASC encourages the development of algorithms that can handle the implicit nature of multi-hop questions, which is often required in real-world information synthesis and decision-making scenarios. The dataset's explicit emphasis on knowledge composition denotes a move towards building systems capable of abstracting compositional rules, thus driving advances in generalized N-hop reasoning capabilities.

Future Directions

Moving forward, AI research grounded in datasets like QASC is likely to explore more sophisticated neural retrieval methods that blend symbolic reasoning with neural representations. Moreover, the knowledge composition task could inspire novel architectures that combine retrieval-centric models with generative models capable of inferential reasoning.

In summary, QASC is a critical contribution to the question-answering domain, setting a sophisticated benchmark that not only elevates the requirements for existing models but also opens new avenues for research into compositional intelligence and the synthesis of disparate information sources. The dataset presents a stepping stone for achieving robust and semantically rich AI capable of mirroring human-level reasoning in its capacity to inform, infer, and compose.

Authors (5)
  1. Tushar Khot (53 papers)
  2. Peter Clark (108 papers)
  3. Michal Guerquin (4 papers)
  4. Peter Jansen (22 papers)
  5. Ashish Sabharwal (84 papers)
Citations (297)