Efficient Nearest Neighbor Search for Cross-Encoder Models using Matrix Factorization (2210.12579v1)

Published 23 Oct 2022 in cs.CL, cs.IR, and cs.LG

Abstract: Efficient k-nearest neighbor search is a fundamental task underlying many problems in NLP. When similarity is measured by the dot product between dual-encoder vectors or by $\ell_2$-distance, many scalable and efficient search methods already exist. This is not the case when similarity is measured by more accurate and expensive black-box neural similarity models, such as cross-encoders, which jointly encode the query and candidate neighbor. The cross-encoders' high computational cost typically limits their use to reranking candidates retrieved by a cheaper model, such as a dual encoder or TF-IDF. However, the accuracy of such a two-stage approach is upper-bounded by the recall of the initial candidate set, and it potentially requires additional training to align the auxiliary retrieval model with the cross-encoder model. In this paper, we present an approach that avoids the use of a dual-encoder for retrieval, relying solely on the cross-encoder. Retrieval is made efficient with CUR decomposition, a matrix decomposition approach that approximates all pairwise cross-encoder distances from a small subset of rows and columns of the distance matrix. Indexing items using our approach is computationally cheaper than training an auxiliary dual-encoder model through distillation. Empirically, for $k > 10$, our approach provides test-time recall-vs-computational cost trade-offs superior to current widely used methods that re-rank items retrieved using a dual-encoder or TF-IDF.
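
The CUR idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, uses a synthetic, exactly low-rank matrix in place of real cross-encoder scores, and the function name `cur_approx_scores`, the anchor counts, and the random anchor selection are all hypothetical choices made for illustration. The sketch approximates one query's scores against every item from (a) that query's scores against a few anchor items and (b) the precomputed scores of a few anchor queries against all items.

```python
import numpy as np


def cur_approx_scores(query_to_anchor_items, anchor_rows, anchor_item_idx):
    """Approximate one query's scores against all items via a CUR-style product.

    query_to_anchor_items: (n_anchor_items,) exact scores of the new query
        against the anchor items only.
    anchor_rows: (n_anchor_queries, n_items) exact scores of the anchor
        queries against every item (computed once, at indexing time).
    anchor_item_idx: (n_anchor_items,) column indices of the anchor items.
    """
    # Intersection of the chosen rows and columns of the score matrix.
    intersection = anchor_rows[:, anchor_item_idx]
    # Pseudo-inverse of the intersection links the columns to the rows.
    # rcond is set explicitly so near-zero singular values are truncated.
    u = np.linalg.pinv(intersection, rcond=1e-8)
    return query_to_anchor_items @ u @ anchor_rows


# Synthetic stand-in for a cross-encoder score matrix (assumption: rank 5).
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 200))

anchor_query_idx = np.arange(10)                          # hypothetical anchors
anchor_item_idx = rng.choice(200, size=10, replace=False)
anchor_rows = scores[anchor_query_idx]                    # indexing-time cost

# At test time only 10 exact scores are computed for the new query (row 49).
test_scores_on_anchors = scores[49, anchor_item_idx]
approx = cur_approx_scores(test_scores_on_anchors, anchor_rows, anchor_item_idx)

# Retrieve the 10 items with the highest approximate score.
top_k = np.argsort(-approx)[:10]
```

In this toy setting the matrix is exactly low-rank, so the approximation matches the true row up to floating-point error. Real cross-encoder score matrices would presumably be only approximately low-rank, so the retrieved items would typically be re-scored with the cross-encoder itself.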


Authors (5)

Nishant Yadav
Nicholas Monath
Rico Angell
Manzil Zaheer
Andrew McCallum



