Incorporating Query Term Independence Assumption for Efficient Retrieval and Ranking using Deep Neural Networks (1907.03693v1)

Published 8 Jul 2019 in cs.IR and cs.LG

Abstract: Classical information retrieval (IR) methods, such as query likelihood and BM25, score documents independently w.r.t. each query term, and then accumulate the scores. Assuming query term independence allows precomputing term-document scores using these models, which can be combined with specialized data structures, such as an inverted index, for efficient retrieval. Deep neural IR models, in contrast, compare the whole query to the document and are, therefore, typically employed only for late-stage re-ranking. We incorporate the query term independence assumption into three state-of-the-art neural IR models (BERT, Duet, and CKNRM) and evaluate their performance on a passage ranking task. Surprisingly, we observe no significant loss in result quality for Duet and CKNRM, and a small degradation in the case of BERT. However, by operating on each query term independently, these otherwise computationally intensive models become amenable to offline precomputation, dramatically reducing the cost of query evaluations employing state-of-the-art neural ranking models. This strategy makes it practical to use deep models for retrieval from large collections, and not restrict their usage to late-stage re-ranking.
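The precompute-then-accumulate idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_term_score` is a hypothetical stand-in for a neural term-document scorer (here a simple normalized term frequency), and `build_index` and `retrieve` are illustrative names. The only property it relies on is that the query-document score is a sum of per-term scores, so each term-document score can be computed offline and stored in an inverted index.

```python
from collections import defaultdict


def toy_term_score(term, doc_tokens):
    # Hypothetical stand-in for a neural term-document scorer:
    # term frequency normalized by document length.
    if not doc_tokens:
        return 0.0
    return doc_tokens.count(term) / len(doc_tokens)


def build_index(docs, vocabulary, score_fn=toy_term_score):
    # Offline step: score every (term, document) pair once and keep
    # the non-zero scores in an inverted index: term -> {doc_id: score}.
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for term in vocabulary:
            score = score_fn(term, tokens)
            if score > 0:
                index[term][doc_id] = score
    return index


def retrieve(query, index, k=10):
    # Online step: under query term independence, the document score
    # is the sum of the precomputed per-term scores.
    totals = defaultdict(float)
    for term in query.lower().split():
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    # Sort by descending score, breaking ties by document id.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


docs = {
    "d1": "neural ranking models",
    "d2": "bm25 ranking",
    "d3": "cooking pasta",
}
vocabulary = {"neural", "ranking", "bm25"}
index = build_index(docs, vocabulary)
results = retrieve("neural ranking", index)
```

At query time no scorer is invoked; only index lookups and additions are performed, which is what makes an expensive scorer usable for first-stage retrieval under this assumption.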

Authors (6)
  1. Bhaskar Mitra (78 papers)
  2. Corby Rosset (21 papers)
  3. David Hawking (2 papers)
  4. Nick Craswell (51 papers)
  5. Fernando Diaz (52 papers)
  6. Emine Yilmaz (66 papers)
Citations (29)