Dense Passage Retrieval for Open-Domain Question Answering
The paper "Dense Passage Retrieval for Open-Domain Question Answering" introduces the Dense Passage Retriever (DPR), a method designed to improve retrieval accuracy in open-domain question answering (QA) systems. Retrieval in QA has traditionally relied on sparse vector space models such as TF-IDF or BM25, which match terms exactly and therefore struggle with lexical variation and semantic relationships (e.g., a question about "bad guys" failing to match a passage about "villains"). The paper argues for replacing these sparse methods with dense representations learned via a dual-encoder framework trained on pairs of questions and relevant passages.
Introduction
The authors begin by situating efficient passage retrieval within the broader open-domain QA pipeline. Traditional QA systems such as DrQA rely heavily on sparse retrieval (TF-IDF or BM25), which often fails to capture semantic nuances. Overall QA performance hinges critically on the retrieval stage: the downstream reader cannot recover an answer from a passage that was never retrieved, so suboptimal retrieval caps end-to-end accuracy.
Dense Passage Retriever (DPR)
The core contribution of the paper is DPR, a model employing dense vector representations for both questions and passages. DPR implements a dual-encoder framework, where dense vectors are generated using separate BERT encoders for passages and questions. The passage retrieval task is then framed as a Maximum Inner Product Search (MIPS) problem, differing fundamentally from traditional term-matching techniques.
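The dual-encoder pipeline can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `embed` function below is a toy hash-based stand-in for DPR's two fine-tuned BERT encoders (E_Q for questions, E_P for passages), and the brute-force dot product stands in for a MIPS index such as FAISS, which the authors use at scale.

```python
import zlib
import numpy as np

def embed(text, dim=64):
    """Toy stand-in for a BERT encoder: sum of hash-seeded token vectors.

    In DPR, separately fine-tuned BERT encoders map questions and
    passages to dense vectors; this deterministic random projection
    merely illustrates the interface (text in, fixed-size vector out).
    """
    vec = np.zeros(dim)
    for tok in text.lower().split():
        rng = np.random.default_rng(zlib.crc32(tok.encode()))
        vec += rng.standard_normal(dim)
    return vec

# Passage embeddings are computed once, offline, and stored in an index;
# at query time, retrieval reduces to Maximum Inner Product Search (MIPS).
passages = [
    "the capital of france is paris",
    "bm25 ranks documents by term frequency statistics",
]
index = np.stack([embed(p) for p in passages])   # (num_passages, dim)

q = embed("what is the capital of france")
scores = index @ q                               # one inner product per passage
best = passages[int(np.argmax(scores))]
```

At realistic scale (the paper indexes ~21 million Wikipedia passages), the `index @ q` step is served by an approximate or exact MIPS library rather than a dense matrix product.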
Training Methodology
A notable aspect of this work is the meticulous approach toward training the dense retrieval model:
- In-batch Negatives: The gold passages paired with the other questions in a batch are reused as negatives for each question, making training efficient; the authors find that additionally including a single BM25-retrieved "hard" negative per question improves results further.
- Loss Function: The model minimizes the negative log-likelihood of the positive passage, where the likelihood is a softmax over the inner-product similarities of the question with the positive and negative passages, pushing relevant pairs together and irrelevant ones apart.
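Concretely, for a batch of B question-passage pairs the score matrix is B x B and the gold passages sit on its diagonal, so the objective reduces to a softmax cross-entropy with diagonal targets. A minimal NumPy sketch (batch contents and dimensions are illustrative, not from the paper):

```python
import numpy as np

def in_batch_nll(Q, P):
    """Negative log-likelihood of the gold passage with in-batch negatives.

    Q: (B, d) question embeddings; P: (B, d) passage embeddings, where
    P[i] is the gold passage for question i and the other B-1 rows act
    as that question's negatives. Positives therefore lie on the
    diagonal of the B x B score matrix.
    """
    S = Q @ P.T                                    # (B, B) inner-product scores
    S = S - S.max(axis=1, keepdims=True)           # stabilize the softmax
    log_probs = S - np.log(np.exp(S).sum(axis=1, keepdims=True))
    return float(-np.diagonal(log_probs).mean())   # positives on the diagonal
```

Reusing the batch this way yields B-1 negatives per question essentially for free, which is what makes the scheme computationally attractive.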
This combination of a dual-encoder setup with in-batch negatives allows DPR to significantly outperform traditional sparse methods such as BM25. The authors also emphasize the practicality of their system: strong results come from fine-tuning on question-passage pairs alone, without the additional retrieval-specific pre-training (such as ORQA's inverse cloze task) employed by alternative dense retrieval techniques.
Experimental Results
The paper provides empirical evidence of DPR’s superiority through rigorous evaluation on several widely adopted open-domain QA datasets, such as Natural Questions (NQ), TriviaQA, WebQuestions (WQ), CuratedTREC (TREC), and SQuAD v1.1. Key findings include:
- Top-20 Retrieval Accuracy: DPR improves over BM25 by 9% to 19% absolute on most datasets (SQuAD v1.1 is the notable exception, where BM25 remains competitive owing to the high lexical overlap between its questions and passages).
- End-to-End QA Performance: Systems utilizing DPR achieve new state-of-the-art results on multiple benchmarks, such as a 41.5% Exact Match (EM) on NQ compared to ORQA’s 33.3%.
Ablation Studies and Qualitative Analysis
An insightful aspect of the paper is the comprehensive ablation and qualitative analyses. These include exploring different types of negative passages, the impact of dataset size, alternative similarity functions, and the training loss. Results affirm the robustness of DPR, demonstrating that even with a reduced number of training examples, DPR outperforms BM25. Additionally, qualitative examples illustrate how DPR excels in semantic representation, often retrieving the correct context where sparse methods fail.
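For reference, the candidate similarity functions compared in such ablations are typically the inner (dot) product, cosine similarity, and negative Euclidean (L2) distance; DPR adopts the plain dot product for simplicity. A minimal sketch of the three:

```python
import numpy as np

def dot(q, p):
    # Unnormalized inner product, as used by DPR.
    return float(q @ p)

def cosine(q, p):
    # Inner product of the length-normalized vectors.
    return float(q @ p) / (np.linalg.norm(q) * np.linalg.norm(p))

def neg_l2(q, p):
    # Negated distance, so that larger still means more similar.
    return -float(np.linalg.norm(q - p))
```

On unit-normalized vectors all three induce the same ranking (since -||q - p||^2 = 2 q.p - ||q||^2 - ||p||^2); with unnormalized embeddings, as in DPR, they can differ.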
Implications and Future Work
The findings have significant implications for both practical and theoretical advancements in QA:
- Practical: Enhanced retrieval models like DPR can be integrated into existing QA systems to improve response accuracy and efficiency.
- Theoretical: Dense retrieval frameworks challenge the conventional reliance on sparse models, opening possibilities for further exploration in embedding-based retrieval systems.
Future developments may include extending the dual-encoder framework to other state-of-the-art encoders, or devising efficient mechanisms for updating the passage index and retraining retrieval models online as new data arrives.
Conclusion
In summary, the paper "Dense Passage Retrieval for Open-Domain Question Answering" convincingly argues for and demonstrates the efficacy of dense passage retrieval in QA systems. By introducing the DPR model, the authors provide a compelling case for replacing traditional sparse methods with dense, trainable representations, revealing new possibilities for improved QA systems. The empirical results, detailed analyses, and thorough experimentation set a strong foundation for future research in dense retrieval for open-domain question answering.