BM25 Query Augmentation Learned End-to-End (2305.14087v1)
Published 23 May 2023 in cs.CL and cs.IR
Abstract: Given BM25's enduring competitiveness as an information retrieval baseline, we investigate to what extent it can be even further improved by augmenting and re-weighting its sparse query-vector representation. We propose an approach to learning an augmentation and a re-weighting end-to-end, and we find that our approach improves performance over BM25 while retaining its speed. We furthermore find that the learned augmentations and re-weightings transfer well to unseen datasets.
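One way to read "augmenting and re-weighting its sparse query-vector representation" is to treat the BM25 score as a sparse dot product between a query vector and a BM25-weighted document vector. The minimal sketch below illustrates that reading only and is not the paper's method: the toy corpus, the `k1` and `b` values, the Lucene-style IDF variant, and the hand-set query weights are all assumptions made for illustration, whereas the paper learns the augmentation and weights end-to-end.

```python
import math
from collections import Counter


def bm25_doc_vector(doc_tokens, df, n_docs, avg_len, k1=1.2, b=0.75):
    """Build a sparse BM25 document vector mapping term -> idf * saturated tf.

    Uses the Lucene-style IDF, log(1 + (N - df + 0.5) / (df + 0.5)),
    which is one common variant (an assumption, not taken from the paper).
    """
    tf = Counter(doc_tokens)
    # Length normalisation term shared by every term in this document.
    norm = k1 * (1.0 - b + b * len(doc_tokens) / avg_len)
    vec = {}
    for term, f in tf.items():
        idf = math.log(1.0 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
        vec[term] = idf * f * (k1 + 1.0) / (f + norm)
    return vec


def score(query_weights, doc_vec):
    """Sparse dot product between a weighted query and a document vector."""
    return sum(w * doc_vec.get(t, 0.0) for t, w in query_weights.items())


# Toy corpus (hypothetical data, for illustration only).
docs = [
    "the cat sat on the mat".split(),
    "a feline rested on the rug".split(),
    "stock markets fell sharply today".split(),
]
df = Counter(t for d in docs for t in set(d))  # document frequencies
avg_len = sum(len(d) for d in docs) / len(docs)
doc_vecs = [bm25_doc_vector(d, df, len(docs), avg_len) for d in docs]

# Plain BM25: every query term has weight 1.
plain_query = {"cat": 1.0}
# Augmented and re-weighted query: one extra term with a hand-set weight
# standing in for a learned one (hypothetical value).
augmented_query = {"cat": 1.0, "feline": 0.6}

plain_scores = [score(plain_query, v) for v in doc_vecs]
augmented_scores = [score(augmented_query, v) for v in doc_vecs]
```

In this sketch the document vectors are never touched, so an existing index could be reused as-is and only the query side changes. The augmented query gives the second document a non-zero score even though it shares no term with the original query.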