Contextualized Query Embeddings for Conversational Search (2104.08707v2)

Published 18 Apr 2021 in cs.IR

Abstract: This paper describes a compact and effective model for low-latency passage retrieval in conversational search based on learned dense representations. Prior to our work, the state-of-the-art approach used a multi-stage pipeline comprising conversational query reformulation and information retrieval modules. Despite its effectiveness, such a pipeline often includes multiple neural models that require long inference times. In addition, independently optimizing each module ignores dependencies among them. To address these shortcomings, we propose to integrate conversational query reformulation directly into a dense retrieval model. To aid in this goal, we create a dataset with pseudo-relevance labels for conversational search to overcome the lack of training data and to explore different training strategies. We demonstrate that our model effectively rewrites conversational queries as dense representations in conversational search and open-domain question answering datasets. Finally, after observing that our model learns to adjust the $L_2$ norm of query token embeddings, we leverage this property for hybrid retrieval and to support error analysis.

Authors (3)

Sheng-Chieh Lin (31 papers)
Jheng-Hong Yang (14 papers)
Jimmy Lin (208 papers)

Citations (52)
