How Reliable is Your Simulator? Analysis on the Limitations of Current LLM-based User Simulators for Conversational Recommendation (2403.16416v1)

Published 25 Mar 2024 in cs.AI

Abstract: A Conversational Recommender System (CRS) interacts with users in natural language to elicit their preferences and provide personalized recommendations in real time. Because CRS has shown significant potential, building more realistic and reliable user simulators has become a key research focus, and recent work constructs such simulators on top of LLMs. While these works are innovative, they also have limitations that deserve attention. In this work, we analyze the limitations of LLM-based user simulators for CRS to guide future research, conducting an analytical validation of the notable system iEvaLM. Through multiple experiments on two widely used conversational recommendation datasets, we highlight several issues with current evaluation methods for LLM-based user simulators: (1) data leakage in the conversational history and in the user simulator's replies inflates evaluation results; (2) the success of CRS recommendations depends more on the availability and quality of the conversational history than on the simulator's responses; and (3) controlling the simulator's output through a single prompt template is difficult. To overcome these limitations, we propose SimpleUserSim, which uses a straightforward strategy to steer the conversation toward the target items. Our study validates that CRS models can exploit this interaction information, significantly improving recommendation results.
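The core idea behind SimpleUserSim, as described in the abstract, is to steer the dialogue toward the target item's attributes while withholding anything that would leak the item itself. Below is a minimal sketch of that idea, not the paper's implementation: the OpenAI-compatible API, the model name, the prompt wording, and the leakage check are all illustrative assumptions.

```python
# Minimal sketch of a SimpleUserSim-style user simulator.
# Assumptions (not from the paper): an OpenAI-compatible chat API,
# the TARGET_ITEM structure, and the SYSTEM_TEMPLATE wording.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TARGET_ITEM = {
    "title": "Inception (2010)",                       # withheld from replies
    "attributes": ["sci-fi", "heist", "dream layers"]  # revealed gradually
}

SYSTEM_TEMPLATE = (
    "You are a user seeking a movie recommendation. You want the movie "
    "described by these attributes: {attrs}. Steer the conversation toward "
    "these attributes, one at a time. Never mention the movie's title. "
    "If the system recommends the exact target, accept; otherwise decline "
    "and hint at the next unrevealed attribute."
)

def simulate_turn(dialogue_history: list[dict]) -> str:
    """Generate the simulated user's next utterance from the dialogue so far."""
    messages = [{
        "role": "system",
        "content": SYSTEM_TEMPLATE.format(
            attrs=", ".join(TARGET_ITEM["attributes"])
        ),
    }] + dialogue_history
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=messages,
    )
    reply = response.choices[0].message.content
    # Guard against the data-leakage issue the paper identifies:
    # the simulator must never reveal the target item's title.
    title_stem = TARGET_ITEM["title"].split(" (")[0].lower()
    assert title_stem not in reply.lower(), "leaked target item title"
    return reply
```

Under this setup, the simulator discloses preference signals (attributes) rather than the answer (the title), so a CRS that succeeds must actually use the interaction information instead of pattern-matching leaked ground truth.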

Authors (3)
  1. Lixi Zhu (4 papers)
  2. Xiaowen Huang (12 papers)
  3. Jitao Sang (71 papers)
Citations (3)