Corpus-Steered Query Expansion with Large Language Models (2402.18031v1)
Abstract: Recent studies demonstrate that query expansions generated by large language models (LLMs) can considerably enhance information retrieval systems: the LLM writes hypothetical documents that answer the query, and these serve as expansions. However, misalignment between such expansions and the retrieval corpus causes problems such as hallucinated or outdated content, owing to the limited intrinsic knowledge of LLMs. Inspired by Pseudo Relevance Feedback (PRF), we introduce Corpus-Steered Query Expansion (CSQE), which promotes the incorporation of knowledge embedded within the corpus. CSQE uses the relevance-assessing capability of LLMs to systematically identify pivotal sentences in the initially retrieved documents. These corpus-originated texts are then used to expand the query together with LLM-knowledge-empowered expansions, improving relevance prediction between the query and the target documents. Extensive experiments show that CSQE performs strongly without any training, especially on queries for which the LLM lacks knowledge.
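The pipeline the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: the retriever, the two LLM calls (`llm_extract_sentences` for relevance assessment and `llm_write_hypothetical` for the knowledge-empowered expansion), and the query-repetition weight are all assumed stand-ins supplied by the caller.

```python
def csqe_expand(query, retrieve, llm_extract_sentences,
                llm_write_hypothetical, top_k=3, query_weight=3):
    """Build an expanded query from corpus evidence plus LLM knowledge.

    retrieve(query, k)              -> list of document texts (e.g. BM25 top-k)
    llm_extract_sentences(q, doc)   -> pivotal sentences the LLM deems relevant
    llm_write_hypothetical(q)       -> a hypothetical document answering q
    All three are hypothetical callables standing in for the paper's components.
    """
    # 1. First-pass retrieval over the target corpus.
    initial_docs = retrieve(query, k=top_k)

    # 2. Corpus-steered part: the LLM identifies pivotal sentences
    #    in the initially retrieved documents.
    corpus_sentences = []
    for doc in initial_docs:
        corpus_sentences.extend(llm_extract_sentences(query, doc))

    # 3. Knowledge-empowered part: a hypothetical answer document.
    hypothetical = llm_write_hypothetical(query)

    # 4. Concatenate everything; repeating the query up-weights its
    #    terms when the expanded string is fed to a lexical retriever.
    return " ".join([query] * query_weight + corpus_sentences + [hypothetical])
```

With stub functions in place of the retriever and LLM, the expanded query is simply the original query (repeated), the extracted corpus sentences, and the hypothetical document joined into one string, ready for a second retrieval pass.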