
Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey (2208.03197v1)

Published 5 Aug 2022 in cs.CL

Abstract: Dense retrieval (DR) approaches based on powerful pre-trained language models (PLMs) have achieved significant advances and have become a key component of modern open-domain question-answering systems. However, they require large amounts of manual annotation to perform competitively, which is infeasible to scale. To address this, a growing body of research has recently focused on improving DR performance under low-resource scenarios. These works differ in what resources they require for training and employ a diverse set of techniques. Understanding such differences is crucial for choosing the right technique under a specific low-resource scenario. To facilitate this understanding, we provide a thorough structured overview of mainstream techniques for low-resource DR. Based on their required resources, we divide the techniques into three main categories: (1) only documents are needed; (2) documents and questions are needed; and (3) documents and question-answer pairs are needed. For every technique, we introduce its general-form algorithm and highlight its open issues, pros, and cons. Promising directions are outlined for future research.
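The dense retrieval setup the abstract refers to can be illustrated with a minimal dual-encoder sketch: questions and documents are embedded into the same vector space and documents are ranked by inner-product similarity. The toy hashed bag-of-words `embed` function below is purely illustrative (a stand-in for a PLM encoder, so the example stays self-contained); the function names and corpus are hypothetical, not from the paper.

```python
import hashlib

import numpy as np


def embed(text, dim=64):
    # Toy deterministic "encoder": a hashed, L2-normalized bag of words.
    # In a real DR system this would be a PLM-based dual encoder.
    vec = np.zeros(dim)
    for token in text.lower().split():
        token = token.strip(".,?!")
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve(question, documents, top_k=2):
    # Score each document by the inner product of its embedding with the
    # question embedding, then return the top-k (score, document) pairs.
    q = embed(question)
    scored = [(float(q @ embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


docs = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "France borders Spain and Germany.",
]
print(retrieve("What is the capital of France?", docs))
```

Given its heavy token overlap with the question, the Paris sentence should rank first here. The annotation bottleneck the survey targets arises because training such encoders competitively normally requires many labeled question-document pairs.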

Authors (6)
  1. Xiaoyu Shen (73 papers)
  2. Svitlana Vakulenko (31 papers)
  3. Marco del Tredici (13 papers)
  4. Gianni Barlacchi (10 papers)
  5. Bill Byrne (57 papers)
  6. Adrià de Gispert (16 papers)
Citations (18)
