Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration (2405.16546v2)
Abstract: The proliferation of Large Language Models (LLMs) has led to an influx of AI-generated content (AIGC) on the internet, transforming the corpora of Information Retrieval (IR) systems from solely human-written text to a mixture of human-written and LLM-generated content. The impact of this surge in AIGC on IR systems remains an open question, with the primary challenge being the lack of a dedicated benchmark for researchers. In this paper, we introduce Cocktail, a comprehensive benchmark tailored for evaluating IR models in this mixed-source data landscape of the LLM era. Cocktail consists of 16 diverse datasets with mixed human-written and LLM-generated corpora across various text retrieval tasks and domains. Additionally, to avoid potential bias from dataset information already memorized by LLMs, we introduce an up-to-date dataset, named NQ-UTD, whose queries are derived from recent events. Through over 1,000 experiments assessing state-of-the-art retrieval models on the benchmarked datasets in Cocktail, we uncover a clear trade-off between ranking performance and source bias in neural retrieval models, highlighting the necessity of a balanced approach in designing future IR systems. We hope Cocktail can serve as a foundational resource for IR research in the LLM era, with all data and code publicly available at https://github.com/KID-22/Cocktail.
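To make the abstract's evaluation setup concrete, below is a minimal sketch (not the authors' released code) of the kind of measurement Cocktail enables: scoring a single retrieval run against relevance judgments for the human-written documents and again against judgments for their LLM-generated counterparts, then computing a relative source-bias delta between the two NDCG scores. The document IDs, qrels layout, and the relative-difference bias formula are illustrative assumptions modeled on prior source-bias work; consult the Cocktail repository for the official protocol.

```python
import math
from typing import Dict, List

def dcg_at_k(gains: List[int], k: int) -> float:
    """Discounted cumulative gain over the top-k ranked gains."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_doc_ids: List[str], qrels: Dict[str, int], k: int = 10) -> float:
    """NDCG@k for one query given graded relevance judgments (qrels)."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal = sorted(qrels.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0

def mean_ndcg(run: Dict[str, List[str]], qrels: Dict[str, Dict[str, int]], k: int = 10) -> float:
    """Average NDCG@k over all judged queries in a retrieval run."""
    scores = [ndcg_at_k(run[q], qrels[q], k) for q in qrels if q in run]
    return sum(scores) / len(scores) if scores else 0.0

def relative_bias_delta(metric_human: float, metric_llm: float) -> float:
    """Relative difference (%) between scores on human-written and LLM-generated
    targets; positive values indicate the retriever favors human-written text.
    (Assumed formula, following relative-difference definitions in bias studies.)"""
    mean = (metric_human + metric_llm) / 2
    return 100.0 * (metric_human - metric_llm) / mean if mean else 0.0

# Toy example: one query; the same ranked list is judged against the
# human-written relevant document and against its LLM-rewritten counterpart.
run = {"q1": ["h_doc3", "l_doc3", "h_doc7"]}       # hypothetical doc IDs
qrels_human = {"q1": {"h_doc3": 1}}                # human-written target
qrels_llm = {"q1": {"l_doc3": 1}}                  # LLM-generated target

ndcg_h = mean_ndcg(run, qrels_human)
ndcg_l = mean_ndcg(run, qrels_llm)
print(f"NDCG@10 human={ndcg_h:.3f}  llm={ndcg_l:.3f}  "
      f"relative delta={relative_bias_delta(ndcg_h, ndcg_l):+.1f}%")
```

A negative delta on such paired judgments would signal the source bias toward LLM-generated content that the paper reports trading off against raw ranking performance.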
Authors: Sunhao Dai, Weihao Liu, Yuqi Zhou, Liang Pang, Rongju Ruan, Gang Wang, Zhenhua Dong, Jun Xu, Ji-Rong Wen