Unsupervised Multi-hop Question Answering by Question Generation (2010.12623v2)

Published 23 Oct 2020 in cs.CL and cs.AI

Abstract: Obtaining training data for multi-hop question answering (QA) is time-consuming and resource-intensive. We explore the possibility of training a well-performing multi-hop QA model without referencing any human-labeled multi-hop question-answer pairs, i.e., unsupervised multi-hop QA. We propose MQA-QG, an unsupervised framework that can generate human-like multi-hop training data from both homogeneous and heterogeneous data sources. MQA-QG generates questions by first selecting/generating relevant information from each data source and then integrating the multiple pieces of information to form a multi-hop question. Using only generated training data, we can train a competent multi-hop QA model that achieves 61% and 83% of the supervised learning performance on the HybridQA and HotpotQA datasets, respectively. We also show that pretraining the QA system with the generated data greatly reduces the demand for human-annotated training data. Our code is publicly available at https://github.com/teacherpeterpan/Unsupervised-Multi-hop-QA.

Unsupervised Multi-hop Question Answering by Question Generation

The paper introduces MQA-QG, a novel unsupervised framework for training multi-hop question answering models without human-labeled multi-hop question-answer pairs. Recognizing that annotating multi-hop QA datasets is resource-intensive owing to their complexity, the authors propose a method to generate the training data automatically.

Framework Overview

MQA-QG operates over both homogeneous and heterogeneous data sources to synthesize training data. The process follows a two-step approach: first selecting or generating relevant information from each data source, and then integrating these pieces into coherent multi-hop questions. To do so, MQA-QG employs a set of basic operators that handle tasks such as bridge-entity selection (FindBridge), entity description generation (DescribeEnt), and single-hop question generation conditioned on an answer or an entity (QGwithAns and QGwithEnt). Multi-hop questions are then synthesized by composition operators such as BridgeBlend and CompBlend, which blend single-hop questions into composite multi-hop forms, as illustrated in the sketch below.
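To make the composition concrete, here is a minimal, illustrative sketch of the pipeline on a table+text example. The operator names follow the paper, but the bodies are toy, rule-based stand-ins: in MQA-QG, the generation operators wrap trained neural models (a question-generation model for QGwithEnt/QGwithAns, a table-to-text model for DescribeEnt), and the example row, passage, and templates below are invented for illustration.

```python
# Toy sketch of composing a table+text bridge question with MQA-QG-style
# operators. Bodies are simplified stand-ins, not the paper's models.

def find_bridge(table_row: dict, passage: str) -> str:
    """FindBridge: pick a table-cell entity that the passage also mentions."""
    return next(cell for cell in table_row.values() if cell in passage)

def describe_ent(table_row: dict, type_word: str, attr: str) -> str:
    """DescribeEnt: verbalize one table attribute of the bridge entity
    as a noun phrase (a table-to-text model in the paper)."""
    return f"the {type_word} whose '{attr}' is {table_row[attr]}"

def bridge_blend(single_hop_question: str, bridge: str, description: str) -> str:
    """BridgeBlend: replace the bridge-entity mention with its
    table-derived description, forcing a second reasoning hop."""
    return single_hop_question.replace(bridge, description)

row = {"Team": "Arsenal", "Founded": "1886", "City": "London"}
passage = "Arsenal won the Premier League unbeaten in 2004."

bridge = find_bridge(row, passage)  # -> "Arsenal"
# Single-hop question over the text; QGwithEnt would generate this.
hop1 = "In which year did Arsenal win the Premier League unbeaten?"
multi_hop = bridge_blend(hop1, bridge, describe_ent(row, "team", "Founded"))
print(multi_hop)
# -> In which year did the team whose 'Founded' is 1886 win the
#    Premier League unbeaten?
```

The answer to the blended question is unchanged (2004), but answering it now requires first resolving the table description to "Arsenal" and then reading the passage, which is exactly the two-hop structure the generated training data is meant to teach.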

Experimental Evaluation

The framework was evaluated on two distinct multi-hop QA datasets: HotpotQA, which involves text-only reasoning, and HybridQA, which combines table and text data sources. The paper demonstrates that, using only generated data, MQA-QG achieves 61% and 83% of the fully supervised performance on HybridQA and HotpotQA, respectively, indicating that synthetic data can effectively pretrain models and reduce reliance on human annotations.

Additionally, the framework is found to be beneficial in few-shot learning scenarios, significantly boosting model performance when only a handful of labeled samples are available. For example, combining MQA-QG pretraining with 50 labeled examples on the HotpotQA dataset raised the F1 score from 21.6 to 64.6, showing a substantial reduction in data requirements.
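One way to realize this two-stage recipe is sketched below with Hugging Face Transformers. This is a minimal sketch under stated assumptions: the SpanBERT checkpoint matches the reader the paper uses for HotpotQA, but the hyperparameters are illustrative rather than the paper's exact settings, and `synthetic_ds` / `gold_50_ds` are hypothetical, already-tokenized datasets with answer-span labels.

```python
# Pretrain-then-fine-tune sketch: stage 1 trains on MQA-QG's generated
# pairs, stage 2 continues on the small human-labeled set, reusing the
# same model weights throughout.
from transformers import (AutoModelForQuestionAnswering, Trainer,
                          TrainingArguments)

model = AutoModelForQuestionAnswering.from_pretrained(
    "SpanBERT/spanbert-base-cased")

def run_stage(dataset, output_dir: str, epochs: int) -> None:
    """Train `model` in place for one stage; hyperparameters are
    illustrative, not the paper's exact settings."""
    args = TrainingArguments(
        output_dir=output_dir,
        num_train_epochs=epochs,
        per_device_train_batch_size=16,
        learning_rate=3e-5,
    )
    Trainer(model=model, args=args, train_dataset=dataset).train()

# Hypothetical preprocessed datasets with input_ids, attention_mask,
# start_positions, and end_positions columns.
run_stage(synthetic_ds, "qa-synthetic-pretrain", epochs=2)
run_stage(gold_50_ds, "qa-fewshot-finetune", epochs=10)
```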

Implications and Future Research

The implications of MQA-QG are significant for the development of QA systems, especially in scenarios where data labeling is prohibitively expensive. By assembling robust training datasets with minimal human intervention, the framework paves the way for deploying QA systems in low-resource domains or over new document types.

Future research could expand the framework to modalities beyond text and tables, such as integrating visual data for richer reasoning tasks. Moreover, refining the question generation process to improve the semantic coherence and naturalness of the generated questions would bring the synthetic data closer to human-written questions and further enhance its utility.

In summary, the research suggests a promising direction towards reducing the bottleneck of labeled data in multi-hop QA through automated, unsupervised methodologies.

Authors (5)
  1. Liangming Pan (59 papers)
  2. Wenhu Chen (134 papers)
  3. Wenhan Xiong (47 papers)
  4. Min-Yen Kan (92 papers)
  5. William Yang Wang (254 papers)
Citations (54)