Unsupervised Question Decomposition for Question Answering (2002.09758v3)

Published 22 Feb 2020 in cs.CL, cs.AI, and cs.LG

Abstract: We aim to improve question answering (QA) by decomposing hard questions into simpler sub-questions that existing QA systems are capable of answering. Since labeling questions with decompositions is cumbersome, we take an unsupervised approach to produce sub-questions, also enabling us to leverage millions of questions from the internet. Specifically, we propose an algorithm for One-to-N Unsupervised Sequence transduction (ONUS) that learns to map one hard, multi-hop question to many simpler, single-hop sub-questions. We answer sub-questions with an off-the-shelf QA model and give the resulting answers to a recomposition model that combines them into a final answer. We show large QA improvements on HotpotQA over a strong baseline on the original, out-of-domain, and multi-hop dev sets. ONUS automatically learns to decompose different kinds of questions, while matching the utility of supervised and heuristic decomposition methods for QA and exceeding those methods in fluency. Qualitatively, we find that using sub-questions is promising for shedding light on why a QA system makes a prediction.

Summary

The paper "Unsupervised Question Decomposition for Question Answering" improves question-answering (QA) systems by decomposing complex, multi-hop questions into simpler, single-hop sub-questions that existing QA models can already answer. The approach targets questions that require reasoning over multiple pieces of evidence, where QA systems tend to struggle.

Methodology

The authors introduce an algorithm called One-to-N Unsupervised Sequence transduction (ONUS), which learns to map one hard, multi-hop question to several simpler, single-hop sub-questions. Because the method is unsupervised, it avoids the cumbersome work of labeling questions with decompositions, and it can instead draw on millions of questions mined from the internet.

The paper employs a three-step divide-and-conquer strategy:

Decomposition: ONUS decomposes a complex question into simpler sub-questions. The decomposition model is trained with unsupervised sequence-to-sequence learning, so no labeled decompositions are needed.
Sub-Question Answering: Each sub-question is answered by an existing off-the-shelf single-hop QA model.
Recomposition: A recomposition model combines the sub-question answers into a final answer to the original complex question.
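
The three steps above can be sketched as a small pipeline. This is only an illustrative outline, not the authors' implementation: the function `answer_by_decomposition`, the toy stand-ins, and the `TOY_FACTS` lookup table are all hypothetical names invented here, and the real system would plug in a trained ONUS decomposition model, a single-hop QA model, and a learned recomposition model in their place.

```python
from typing import Callable, List


def answer_by_decomposition(
    question: str,
    decompose: Callable[[str], List[str]],
    answer_single_hop: Callable[[str], str],
    recompose: Callable[[str, List[str], List[str]], str],
) -> str:
    """Run the decompose -> answer -> recompose pipeline (hypothetical interface)."""
    # Step 1: split the hard question into simpler sub-questions.
    sub_questions = decompose(question)
    # Step 2: answer each sub-question with a single-hop QA component.
    sub_answers = [answer_single_hop(sq) for sq in sub_questions]
    # Step 3: combine the sub-answers into a final answer.
    return recompose(question, sub_questions, sub_answers)


# Toy stand-ins below are assumptions for illustration only (made-up data).
TOY_FACTS = {
    "How tall is Tower A?": "120 m",
    "How tall is Tower B?": "95 m",
}


def toy_decompose(question: str) -> List[str]:
    # A trained decomposition model would generate these; here they are fixed.
    return list(TOY_FACTS)


def toy_single_hop_qa(sub_question: str) -> str:
    # An off-the-shelf QA model would read evidence; here we use a lookup.
    return TOY_FACTS.get(sub_question, "unknown")


def toy_recompose(question: str, sub_questions: List[str], sub_answers: List[str]) -> str:
    # A learned recomposition model would go here; this toy rule picks the
    # entity whose sub-answer has the largest numeric value.
    heights = [float(answer.split()[0]) for answer in sub_answers]
    best = heights.index(max(heights))
    # "How tall is Tower A?" -> "Tower A"
    return sub_questions[best][len("How tall is "):-1]


final = answer_by_decomposition(
    "Which is taller, Tower A or Tower B?",
    toy_decompose,
    toy_single_hop_qa,
    toy_recompose,
)
print(final)
```

Passing the three stages in as callables keeps the sketch independent of any particular model, which mirrors the point that the sub-question answering component can be an existing, unmodified QA system.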

Importantly, ONUS learns to decompose different kinds of questions automatically. Its decompositions match supervised and heuristic decomposition methods in their usefulness for QA and exceed those methods in fluency.

Results

The proposed approach yields large QA improvements on HotpotQA over a strong baseline. Specifically, the paper reports F1 gains of 3.1 points on the original dev set, 11 points on a multi-hop challenge set, and 10 points on an out-of-domain dev set. The ONUS approach performs comparably to methods that benefit from detailed annotation of relevant sentences, which supports the potential of unsupervised methods in complex QA tasks.
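
For readers unfamiliar with the metric, answer F1 in extractive QA is usually computed from the token overlap between the predicted and gold answers, and is reported on a 0 to 100 scale (so a gain of 3.1 points is 0.031 on the 0 to 1 scale). The sketch below is a simplified version under stated assumptions: the function name `token_f1` is hypothetical, and it only lowercases and splits on whitespace, whereas official evaluation scripts typically apply further normalization such as stripping punctuation and articles.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Simplified token-level F1 between a predicted and a gold answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # If either answer is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


exact = token_f1("Tower A", "tower a")
partial = token_f1("the tall Tower A", "Tower A")
miss = token_f1("Tower B", "unknown")
print(exact, partial, miss)
```

A dataset-level score is then the average of this per-question value over all questions, multiplied by 100.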

Implications and Future Directions

ONUS offers a promising alternative to supervised decomposition. It also has implications for interpretability: the generated sub-questions and their answers can shed light on why a QA system makes a particular prediction. Practically, the method provides an automated way to use large-scale, naturally occurring questions without costly annotations.

The paper also opens avenues for the application of unsupervised question decomposition across other domains beyond text-based QA, such as visual question answering and fact verification, indicating that the approach could be generalized to various multi-modal and cross-domain QA tasks.

In conclusion, the work shows that unsupervised learning can be used to decompose complex questions, and that doing so improves the ability of QA systems to answer multi-hop questions.