Faster Learned Sparse Retrieval with Block-Max Pruning (2405.01117v1)
Abstract: Learned sparse retrieval systems aim to combine the effectiveness of contextualized language models with the scalability of conventional data structures such as inverted indexes. Nevertheless, the indexes generated by these systems differ significantly from those built with traditional retrieval models, degrading the performance of existing query optimizations that were developed specifically for the traditional structures. These disparities arise from structural variations in query and document statistics, including sub-word tokenization, which leads to longer queries, smaller vocabularies, and different score distributions within posting lists. This paper introduces Block-Max Pruning (BMP), an innovative dynamic pruning strategy tailored to the indexes arising in learned sparse retrieval environments. BMP employs a block filtering mechanism that divides the document space into small, consecutive document ranges, which are then aggregated and sorted on the fly and fully processed only as necessary, guided by a defined safe early termination criterion or by approximate retrieval requirements. Through rigorous experimentation, we show that BMP substantially outperforms existing dynamic pruning strategies, offering unparalleled efficiency in safe retrieval contexts and improved tradeoffs between precision and efficiency in approximate retrieval tasks.
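The pruning idea the abstract describes can be illustrated with a minimal sketch: partition documents into consecutive fixed-size blocks, compute a per-block score upper bound from the maximum term impacts inside each block, visit blocks in descending order of bound, and stop (safely) once no remaining block's bound can exceed the current top-k threshold. The toy index, block size, and function names below are hypothetical illustrations, not the paper's actual data structures or implementation (which operate over compressed block-max metadata).

```python
import heapq

BLOCK_SIZE = 2  # documents per consecutive range (tiny, for illustration only)

# Hypothetical toy impact index: term -> {doc_id: impact}
index = {
    "neural": {0: 3.0, 2: 1.0, 5: 4.0},
    "sparse": {1: 2.0, 2: 5.0, 4: 1.0},
}
NUM_DOCS = 6
NUM_BLOCKS = (NUM_DOCS + BLOCK_SIZE - 1) // BLOCK_SIZE


def block_upper_bounds(query):
    """Per-block bound: weighted sum over query terms of each term's max impact in the block."""
    bounds = [0.0] * NUM_BLOCKS
    for term, weight in query.items():
        block_max = [0.0] * NUM_BLOCKS
        for doc, impact in index.get(term, {}).items():
            b = doc // BLOCK_SIZE
            block_max[b] = max(block_max[b], impact)
        for b in range(NUM_BLOCKS):
            bounds[b] += weight * block_max[b]
    return bounds


def bmp_topk(query, k):
    """Best-first block processing with safe early termination."""
    bounds = block_upper_bounds(query)
    order = sorted(range(NUM_BLOCKS), key=lambda b: -bounds[b])  # sort blocks on the fly
    heap = []  # min-heap of (score, doc_id) holding the current top-k
    for b in order:
        # Safe termination: no remaining block can beat the k-th best score.
        if len(heap) == k and bounds[b] <= heap[0][0]:
            break
        # Fully score every document in this block.
        for doc in range(b * BLOCK_SIZE, min((b + 1) * BLOCK_SIZE, NUM_DOCS)):
            score = sum(w * index.get(t, {}).get(doc, 0.0) for t, w in query.items())
            if len(heap) < k:
                heapq.heappush(heap, (score, doc))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc))
    return sorted(heap, reverse=True)  # (score, doc_id) pairs, best first


print(bmp_topk({"neural": 1.0, "sparse": 1.0}, k=2))  # → [(6.0, 2), (4.0, 5)]
```

Relaxing the termination test (e.g., stopping once a block's bound falls below a fraction of the threshold) yields the approximate mode the abstract mentions, trading precision for speed.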
Authors: Antonio Mallia, Torsten Suel, Nicola Tonellotto