ReSLLM: Large Language Models are Strong Resource Selectors for Federated Search (2401.17645v1)

Published 31 Jan 2024 in cs.IR and cs.AI

Abstract: Federated search, which integrates results from multiple independent search engines, will become increasingly pivotal in the context of Retrieval-Augmented Generation pipelines empowering LLM-based applications such as chatbots. These systems often distribute queries among various search engines, ranging from specialized (e.g., PubMed) to general (e.g., Google), based on the nature of user utterances. A critical aspect of federated search is resource selection: selecting the appropriate resources before issuing the query, so as to ensure high-quality and rapid responses and to contain the costs associated with calling the external search engines. However, current SOTA resource selection methodologies rely primarily on feature-based learning approaches, which often involve the labour-intensive and expensive creation of training labels for each resource. In contrast, LLMs have exhibited strong effectiveness as zero-shot methods across NLP and IR tasks. We hypothesise that, in the context of federated search, LLMs can assess the relevance of resources without the need for extensive predefined labels or features. In this paper, we propose ReSLLM, a method that exploits LLMs to drive the selection of resources in federated search in a zero-shot setting. In addition, we devise an unsupervised fine-tuning protocol, Synthetic Label Augmentation Tuning (SLAT), in which the relevance of previously logged queries and snippets from resources is predicted using an off-the-shelf LLM and then used to fine-tune ReSLLM for resource selection. Our empirical evaluation and analysis detail the factors influencing the effectiveness of LLMs in this context. The results showcase the merits of ReSLLM for resource selection: not only competitive effectiveness in the zero-shot setting, but also large effectiveness gains when fine-tuned using the SLAT protocol.
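The abstract describes ReSLLM only at a high level: prompt an LLM, zero-shot, to judge whether each candidate resource (characterised by its name, a textual description, and sample result snippets) is relevant to a query, and keep the top-ranked resources. The sketch below is a minimal, hypothetical illustration of that idea in Python; the prompt wording, the binary yes/no judgement, the `llm` callable, and the toy resources are assumptions made for illustration, not the authors' actual prompts or implementation.

```python
from typing import Callable, Dict, List, Tuple


def build_prompt(query: str, resource_name: str, description: str, snippets: List[str]) -> str:
    """Assemble a zero-shot relevance prompt for one search resource (hypothetical wording)."""
    sample = "\n".join(f"- {s}" for s in snippets[:3])  # a few logged result snippets as evidence
    return (
        "You judge whether a search engine is relevant to a user query.\n"
        f"Search engine: {resource_name}\n"
        f"Description: {description}\n"
        f"Example result snippets:\n{sample}\n"
        f"Query: {query}\n"
        "Answer with a single word: yes or no."
    )


def select_resources(
    query: str,
    resources: Dict[str, Tuple[str, List[str]]],   # name -> (description, snippets)
    llm: Callable[[str], str],                     # any text-in/text-out LLM interface
    top_k: int = 3,
) -> List[str]:
    """Rank resources by the LLM's yes/no judgement and keep the top-k (no training labels needed)."""
    scored = []
    for name, (description, snippets) in resources.items():
        answer = llm(build_prompt(query, name, description, snippets)).strip().lower()
        scored.append((1.0 if answer.startswith("yes") else 0.0, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    # Stub LLM for demonstration only; a real system would call a hosted or local model.
    def toy_llm(prompt: str) -> str:
        return "yes" if "PubMed" in prompt and "trial" in prompt else "no"

    resources = {
        "PubMed": ("Biomedical literature search.", ["Randomised trial of ...", "Meta-analysis of ..."]),
        "Google": ("General web search.", ["News article ...", "Wikipedia entry ..."]),
    }
    print(select_resources("latest clinical trial results for statins", resources, toy_llm))
```

The SLAT protocol mentioned in the abstract would apply the same kind of LLM judgement to previously logged queries and resource snippets to produce synthetic relevance labels, which are then used to fine-tune the selector; that fine-tuning step is not shown here.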

Authors (4)
  1. Shuai Wang (466 papers)
  2. Shengyao Zhuang (42 papers)
  3. Bevan Koopman (37 papers)
  4. Guido Zuccon (73 papers)
Citations (1)