Overview of "AmbigQA: Answering Ambiguous Open-domain Questions"
The paper "AmbigQA: Answering Ambiguous Open-domain Questions" (Min et al., 2020) introduces a new task in natural language processing that addresses the challenges posed by ambiguity in open-domain question answering (ODQA). AmbigQA requires systems to identify ambiguous questions and resolve them by generating every plausible answer, each paired with a disambiguated rewrite of the original question. Supporting the task is a new dataset, AmbigNQ, consisting of 14,042 questions derived from the NQ-open dataset.
Ambiguity in Open-domain Question Answering
The core problem addressed in this work is the ambiguity inherent in many natural language questions, especially in open-domain settings where a question can refer to multiple events, entities, or temporal contexts. Such ambiguity often admits multiple valid answers, posing a challenge for ODQA systems that typically return a single answer. The AmbigQA task targets this issue directly by requiring systems to output a set of plausible answers and, for each answer, a reformulation of the question that removes its ambiguity.
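To make the task concrete, here is a small sketch of what a single ambiguous-question instance might look like; the field names and the example question are illustrative, not the dataset's actual schema:

```python
# Hypothetical sketch of one ambiguous question with its disambiguated
# rewrites and answers (field names are illustrative, not AmbigNQ's schema).
ambiguous_example = {
    "question": "When did The Simpsons first air on television?",
    "qa_pairs": [
        {
            "rewrite": "When did The Simpsons first air as shorts on The Tracey Ullman Show?",
            "answer": "April 19, 1987",
        },
        {
            "rewrite": "When did The Simpsons first air as a half-hour prime-time series?",
            "answer": "December 17, 1989",
        },
    ],
}

# A system tackling AmbigQA must produce every plausible answer,
# each paired with a rewrite that makes the question unambiguous.
for pair in ambiguous_example["qa_pairs"]:
    print(pair["rewrite"], "->", pair["answer"])
```

The key structural point is that the output is a set of (rewrite, answer) pairs rather than a single answer string, so one ambiguous surface question maps to several fully specified question-answer pairs.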
Dataset and Task Formulation
AmbigNQ, the dataset developed for this paper, is curated from the NQ-open dataset and annotated to capture the multiple potential answers of ambiguous questions. The authors find that over half of the NQ-open questions they sampled exhibit some form of ambiguity, underscoring the need for a systematic approach to ambiguity resolution. The task itself involves two steps: predicting the full set of plausible answers, and producing a specific question rewrite that clarifies the context for each answer.
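Because a prediction is now a set of answers rather than a single string, evaluation naturally compares predicted and gold answer sets. Below is a minimal sketch of a set-level F1; it is a simplification of the paper's actual metrics (which also score the question rewrites) and matches answers only by exact string equality after lowercasing:

```python
def answer_set_f1(predicted: set[str], gold: set[str]) -> float:
    """Set-level F1 between predicted and gold answer strings.

    Simplified stand-in for AmbigQA's evaluation: matches on exact
    lowercase string equality and ignores the question rewrites.
    """
    pred = {a.lower() for a in predicted}
    ref = {a.lower() for a in gold}
    if not pred or not ref:
        return 0.0
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Example: two of three predicted answers match a gold set of two.
score = answer_set_f1({"1987", "1989", "1990"}, {"1987", "1989"})
print(round(score, 2))  # 0.8
```

This framing rewards systems for covering all valid answers (recall) without over-generating spurious ones (precision), which is exactly the tension the multi-answer task introduces.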
Baseline Models and Experiments
To establish a foundation for future research, the paper introduces baseline models adapted for the AmbigQA task. Because AmbigNQ is much smaller than NQ-open, these baselines use weakly supervised learning to exploit NQ-open's abundant single-answer data as additional training signal. Experimental results show that this weak supervision measurably improves performance, while the task itself remains far from solved.
Implications and Future Directions
The implications of this research extend both practically and theoretically. Practically, resolving ambiguity in ODQA can lead to the development of more robust question-answering systems capable of handling the nuances of human communication more effectively. Theoretically, the work challenges the community to reevaluate traditional approaches to ODQA, encouraging the development of models that not only predict answers but also engage in deeper linguistic analysis to provide context-specific answers.
Looking forward, several avenues emerge for future work: models that jointly integrate answer generation and question rewriting, and transfer learning techniques that leverage AmbigNQ annotations to improve other NLP tasks. The dataset and task proposed by Min et al. are well positioned to drive research on overcoming ambiguity, advancing the field toward more nuanced understanding and generation of human language.
The authors provide access to the AmbigQA dataset and associated baselines, fostering an open research environment and inviting further advances in this burgeoning area. For researchers looking to contribute, AmbigQA offers a rich ground for experimentation and innovation in tackling ambiguity in question answering systems.