Improving Passage Retrieval with Zero-Shot Question Generation
This paper presents an approach for enhancing passage retrieval in open-domain question answering: a re-ranking method based on zero-shot question generation. The core idea is to use a pre-trained language model (PLM) to compute the likelihood of generating the input question conditioned on each retrieved passage. This scoring benefits from the expressive cross-attention between question and passage tokens and, importantly, operates in a zero-shot setting, requiring no domain-specific training data and promising better generalization across datasets.
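Concretely, for a question q and a retrieved passage z, the relevance score is the average conditional log-likelihood of the question tokens under a PLM with parameters Θ, following the formulation described in the paper:

```latex
\log p(q \mid z) = \frac{1}{|q|} \sum_{t} \log p(q_t \mid q_{<t}, z; \Theta)
```

Averaging over the |q| question tokens keeps scores comparable across questions of different lengths.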
Key Contributions and Results
- Re-ranking Strategy: The paper proposes the Unsupervised Passage Re-ranker (UPR), which re-scores retrieved passages by estimating the probability of generating the question conditioned on each passage. Because the method is task-independent, it can be applied on top of any first-stage retriever, neural or keyword-based, which makes it broadly practical.
- Significant Performance Gains: UPR improves top-20 passage retrieval accuracy by 6% to 18% for unsupervised retrievers and by up to 12% for supervised ones across multiple open-domain QA datasets, such as SQuAD-Open, TriviaQA, and Natural Questions. It achieves state-of-the-art results when combined with existing retrievers, outperforming strong baselines such as Dense Passage Retrieval (DPR).
- PLM Utilization and Scalability: Leveraging large-scale PLMs like T5 and GPT, the method benefits from their zero-shot capabilities that are further enhanced by instruction-tuned models such as T0. Given the rapid evolution of PLM technology, UPR's efficacy could improve alongside advancements in these models without necessitating additional training or fine-tuning.
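The re-ranking step itself is simple enough to sketch. In this minimal sketch, `upr_rerank` and its `log_likelihood` argument are illustrative names: the scorer stands in for a real PLM (such as T0) that would return the mean per-token log-probability of the question given the passage.

```python
from typing import Callable, List, Tuple

def upr_rerank(
    question: str,
    passages: List[str],
    log_likelihood: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Re-rank retrieved passages by log p(question | passage).

    `log_likelihood(question, passage)` should return the mean
    per-token log-probability of generating `question` conditioned
    on `passage` under a pre-trained language model.
    """
    scored = [(p, log_likelihood(question, p)) for p in passages]
    # Higher (less negative) log-likelihood means a more relevant passage.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a real PLM, the scorer would run one forward pass per question–passage pair, which is where the cross-attention between question and passage tokens comes into play.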
Theoretical and Practical Implications
The re-ranking approach introduced by UPR has significant implications for both theoretical and practical aspects of AI:
- Task-Independent Passage Re-ranking: UPR shows that the powerful cross-attention inherent in PLMs can be harnessed to significantly improve the ranking of contextually relevant passages without bespoke model training, offering a generalized solution across diverse domains.
- Efficiency and Application: The method performs well even on datasets that challenge dense retrievers, supporting its use in varied retrieval contexts such as entity-centric questions and heterogeneous retrieval benchmarks like BEIR. Re-ranking without domain-specific fine-tuning simplifies deployment in real-world applications where annotated data is sparse or costly to obtain.
Future Directions
Given the capabilities demonstrated by UPR on various datasets without domain-specific adjustments, future research could explore:
- Adapting to Domain-Specific Retrieval Tasks: Investigating prompt engineering and instruction tuning further could refine UPR’s adaptability and accuracy for specialized queries and domain-centric datasets.
- Exploring Efficient PLM Execution: Improving PLM inference through techniques like quantization, distillation, and parallelization could address latency, making UPR more viable for deployment in settings that demand high throughput.
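The prompt-adaptation direction above starts from UPR's simple input template. As a hedged sketch (the default instruction mirrors the question-generation prompt described in the paper; the function name and the idea of swapping in a domain-specific instruction are illustrative):

```python
def build_upr_input(
    passage: str,
    instruction: str = "Please write a question based on this passage.",
) -> str:
    """Construct the PLM input for question-likelihood scoring.

    The default `instruction` follows UPR's zero-shot prompt; passing a
    different instruction is one simple lever for adapting the re-ranker
    to a specialized domain without any fine-tuning.
    """
    return f"Passage: {passage}. {instruction}"
```

The question tokens are then scored conditioned on this input, so changing only the instruction string changes the scorer's behavior at zero training cost.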
Through an unsupervised approach built on the robust capabilities of PLMs, UPR offers a compelling direction for improving passage retrieval in open-domain question answering. Its scalability and efficacy highlight the transformative potential of zero-shot methods in AI-driven language understanding and retrieval.