
KuaiRec: A Fully-observed Dataset and Insights for Evaluating Recommender Systems (2202.10842v3)

Published 22 Feb 2022 in cs.IR and cs.HC

Abstract: The progress of recommender systems is hampered mainly by evaluation as it requires real-time interactions between humans and systems, which is too laborious and expensive. This issue is usually approached by utilizing the interaction history to conduct offline evaluation. However, existing datasets of user-item interactions are partially observed, leaving it unclear how and to what extent the missing interactions will influence the evaluation. To answer this question, we collect a fully-observed dataset from Kuaishou's online environment, where almost all 1,411 users have been exposed to all 3,327 items. To the best of our knowledge, this is the first real-world fully-observed data with millions of user-item interactions. With this unique dataset, we conduct a preliminary analysis of how the two factors - data density and exposure bias - affect the evaluation results of multi-round conversational recommendation. Our main discoveries are that the performance ranking of different methods varies with the two factors, and this effect can only be alleviated in certain cases by estimating missing interactions for user simulation. This demonstrates the necessity of the fully-observed dataset. We release the dataset and the pipeline implementation for evaluation at https://kuairec.com
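The two factors named in the abstract, data density and exposure bias, can be illustrated on a toy matrix. The sketch below is not the authors' released pipeline: the function names `density` and `subsample`, the toy feedback values, and the weighted-sampling scheme are all assumptions made for illustration. It starts from a fully-observed user-item matrix and keeps only part of it, either uniformly (varying density alone) or with per-item weights (imitating exposure bias toward some items).

```python
import random

def density(observed, n_users, n_items):
    """Fraction of user-item pairs that have an observed interaction."""
    return len(observed) / (n_users * n_items)

def subsample(full, target_density, item_weight=None, seed=0):
    """Keep a subset of a fully-observed matrix (hypothetical helper).

    full: dict mapping (user, item) -> feedback value.
    item_weight: optional dict item -> positive sampling weight.
        None means uniform exposure; unequal weights imitate exposure bias.
    """
    rng = random.Random(seed)
    pairs = sorted(full)
    k = round(target_density * len(pairs))
    if item_weight is None:
        kept = rng.sample(pairs, k)
    else:
        # Weighted sampling without replacement: draw a key u ** (1 / w)
        # per pair and keep the k largest keys (Efraimidis-Spirakis).
        keyed = [(rng.random() ** (1.0 / item_weight[i]), (u, i))
                 for (u, i) in pairs]
        kept = [p for _, p in sorted(keyed, reverse=True)[:k]]
    return {p: full[p] for p in kept}

# Toy fully-observed matrix: 4 users x 5 items, made-up binary feedback.
N_USERS, N_ITEMS = 4, 5
full = {(u, i): (u + i) % 2 for u in range(N_USERS) for i in range(N_ITEMS)}

uniform_half = subsample(full, 0.5)

# Item 0 is exposed far more often than the others (assumed weights).
weights = {i: (1e6 if i == 0 else 1e-6) for i in range(N_ITEMS)}
biased_fifth = subsample(full, 0.2, item_weight=weights)

print(density(full, N_USERS, N_ITEMS))          # fully observed
print(density(uniform_half, N_USERS, N_ITEMS))  # half of the pairs kept
print(sorted(biased_fifth))                     # pairs kept under biased exposure
```

Evaluating the same recommender on `full`, `uniform_half`, and `biased_fifth` is the kind of comparison the abstract describes: any change in the ranking of methods across the three can then be attributed to the missing interactions rather than to the methods themselves.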

Authors (9)
  1. Chongming Gao (28 papers)
  2. Shijun Li (11 papers)
  3. Wenqiang Lei (66 papers)
  4. Jiawei Chen (161 papers)
  5. Biao Li (41 papers)
  6. Peng Jiang (274 papers)
  7. Xiangnan He (200 papers)
  8. Jiaxin Mao (47 papers)
  9. Tat-Seng Chua (360 papers)
Citations (108)
