The NarrativeQA Reading Comprehension Challenge
The paper "The NarrativeQA Reading Comprehension Challenge" introduces a new dataset aimed at advancing the field of natural language understanding, specifically targeting the deeper comprehension of narratives such as books and movie scripts. Unlike prior datasets, which often depend on localized context, NarrativeQA requires models to synthesize information distributed across lengthy documents.
Overview of the Dataset
NarrativeQA consists of 1,572 stories drawn from books and movie scripts, accompanied by human-written summaries, questions, and answers. The questions are designed to require comprehension of high-level narrative structure and of the relationships among characters and events, exposing the limitations of models that rely chiefly on shallow pattern matching. Standard reading comprehension models struggle here because accurate answers demand integrating information from across an entire narrative.
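To make the task setup concrete, here is a minimal sketch in Python of how a single NarrativeQA record might be represented, with the two reading settings (summary vs. full story) selected explicitly. The field names and helper function are illustrative assumptions, not the schema of the official release.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class NarrativeQAExample:
    """One question-answer record, roughly as described in the paper.

    Field names are illustrative assumptions, not the official release schema.
    """
    story_id: str          # identifies the underlying book or movie script
    story_text: str        # full narrative, used in the full-story setting
    summary: str           # human-written plot summary, used in the summary setting
    question: str          # question written from the summary alone
    references: List[str]  # human-written reference answer(s) used for scoring

def reading_context(example: NarrativeQAExample, setting: str) -> str:
    """Return the text a model is allowed to read in a given task setting."""
    if setting == "summary":
        return example.summary
    if setting == "full_story":
        return example.story_text
    raise ValueError(f"unknown setting: {setting!r}")
```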
Key Features
The dataset has notable features distinguishing it from existing datasets:
- Document Length and Complexity: The stories in NarrativeQA contain complex plot structures and richly detailed settings, demanding an integrative understanding beyond sentence-level comprehension.
- Nature of Questions and Answers: Questions and answers were written from the summaries rather than the full texts, so answers are rarely verbatim spans of the story; models reading the full narrative must locate and integrate dispersed, differently worded passages.
- Task Variability: Tasks involve reading comprehension from the summaries alone and from the full stories, with models either generating free-form answers or selecting answer spans that are scored against human-written references, posing a diverse range of challenges.
- Scalability and Applicability: The full-story setting is intended to spur the development of neural architectures that can scale to documents far longer than current models handle in a single pass.
Evaluation and Challenges
Human performance on the dataset surpasses that of existing models by a wide margin, indicating both the difficulty of NarrativeQA and its potential as a tool for advancing research in deep language comprehension. Baseline models, including sequence-to-sequence networks and variants of attention-based readers, show limited success, especially when applied to full narratives, highlighting the gap between computational models and human-like understanding.
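The paper scores generated answers against the human-written references with metrics such as BLEU, METEOR, and ROUGE-L. The snippet below is a minimal, self-contained sketch of ROUGE-L F1 with naive whitespace tokenization; it is a simplification of the official evaluation, shown only to illustrate how a free-form answer is compared against multiple references.

```python
from typing import List

def _lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists (dynamic programming)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, tok_a in enumerate(a, 1):
        for j, tok_b in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if tok_a == tok_b else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, references: List[str]) -> float:
    """ROUGE-L F1 of a generated answer, taking the best score over the reference answers."""
    cand = candidate.lower().split()
    best = 0.0
    for ref_text in references:
        ref = ref_text.lower().split()
        lcs = _lcs_length(cand, ref)
        if lcs == 0:
            continue
        precision, recall = lcs / len(cand), lcs / len(ref)
        best = max(best, 2 * precision * recall / (precision + recall))
    return best

# Example: a short generated answer scored against two human reference answers.
print(rouge_l_f1("he sails to the island",
                 ["He sails to the island alone.", "He travels there by boat."]))
```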
Implications and Future Directions
NarrativeQA establishes a new benchmark for developing models that emulate sophisticated human reading strategies, with practical and theoretical implications for AI research. It underscores the need for:
- Innovative Retrieval Mechanisms: Effective retrieval of contextually relevant passages from long narratives is essential for improving comprehension (a simple retrieval sketch follows this list).
- Advanced Integrative Models: Development of architectures capable of synthesizing information over extensive spans of text, enhancing their ability to understand narratives as humans do.
- Adaptive Learning Techniques: Models that adaptively learn from varying narrative structures and complexities without extensive reliance on annotated data will be crucial.
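As a rough illustration of the retrieval point above, the sketch below ranks fixed-size word windows of a story by TF-IDF cosine similarity to the question, loosely mirroring the IR step the paper's baselines use to shortlist story chunks for a neural reader. The windowing, scikit-learn usage, and parameter values are illustrative choices, not the paper's exact setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def retrieve_passages(story: str, question: str, window: int = 200, top_k: int = 5):
    """Rank fixed-size word windows of a story by TF-IDF cosine similarity to a question."""
    tokens = story.split()
    passages = [" ".join(tokens[i:i + window]) for i in range(0, len(tokens), window)]
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(passages + [question])
    # The last row is the question; score every passage against it.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    top = scores.argsort()[::-1][:top_k]
    return [(float(scores[i]), passages[i]) for i in top]
```

In practice, the top-ranked passages would be concatenated and handed to a reader model in place of the full story, trading coverage of the narrative for a context short enough to process.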
In conclusion, the NarrativeQA dataset represents a significant step toward understanding natural language at the narrative level, offering a valuable resource for future research. It challenges existing paradigms and drives the innovation needed to achieve meaningful progress in machine text comprehension.