USimAgent: Large Language Models for Simulating Search Users (2403.09142v2)
Abstract: Owing to its cost-efficiency and reproducibility, user simulation has become a promising approach to the user-centric evaluation of information retrieval systems. Nonetheless, accurately simulating user search behavior has long been a challenge, because users' actions during search are highly complex and driven by intricate cognitive processes such as learning, reasoning, and planning. Recently, LLMs have demonstrated remarkable potential in simulating human-level intelligence and have been used to build autonomous agents for various tasks. However, the potential of LLMs for simulating search behavior has not yet been fully explored. In this paper, we introduce an LLM-based user search behavior simulator, USimAgent. The proposed simulator can simulate users' querying, clicking, and stopping behaviors during search and is thus capable of generating complete search sessions for specific search tasks. An empirical investigation on a real user behavior dataset shows that the proposed simulator outperforms existing methods in query generation and is comparable to traditional methods in predicting user clicks and stopping behaviors. These results not only validate the effectiveness of using LLMs for user simulation but also shed light on the development of more robust and generic user simulators. The code and data are accessible at https://github.com/Meow-E/USimAgent.
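To make the querying, clicking, and stopping loop described in the abstract concrete, here is a minimal Python sketch of how a complete search session could be assembled from three behavior policies. It is not the authors' implementation: every name (`simulate_session`, `Step`, `Session`, the `toy_*` functions, `max_queries`) is hypothetical, and the toy policies are deterministic stand-ins for what would presumably be LLM calls in an LLM-based simulator.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    """One query round: the issued query and the results clicked for it."""
    query: str
    clicked: List[str]


@dataclass
class Session:
    """A simulated search session for one task (hypothetical structure)."""
    task: str
    steps: List[Step] = field(default_factory=list)


def simulate_session(
    task: str,
    generate_query: Callable[[Session], str],
    search: Callable[[str], List[str]],
    choose_clicks: Callable[[Session, str, List[str]], List[str]],
    should_stop: Callable[[Session], bool],
    max_queries: int = 5,
) -> Session:
    """Run the query -> click -> stop loop until the stop policy fires
    or the (assumed) query budget is exhausted."""
    session = Session(task)
    for _ in range(max_queries):
        query = generate_query(session)            # querying behavior
        results = search(query)                    # search engine call
        clicked = choose_clicks(session, query, results)  # clicking behavior
        session.steps.append(Step(query, clicked))
        if should_stop(session):                   # stopping behavior
            break
    return session


# Toy, deterministic policies used only for illustration.
def toy_generate_query(session: Session) -> str:
    return f"{session.task} {len(session.steps) + 1}"


def toy_search(query: str) -> List[str]:
    return [f"{query} / doc{i}" for i in range(3)]


def toy_choose_clicks(session: Session, query: str, results: List[str]) -> List[str]:
    return results[:1]  # always click the top result


def toy_should_stop(session: Session) -> bool:
    # Stop once two documents have been clicked in total.
    return sum(len(step.clicked) for step in session.steps) >= 2


demo = simulate_session(
    "python tutorials",
    toy_generate_query,
    toy_search,
    toy_choose_clicks,
    toy_should_stop,
)
```

Passing the policies in as callables keeps the loop independent of how each behavior is produced, so a rule-based baseline and an LLM-backed policy could be swapped in without changing the session logic.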
Authors: Erhan Zhang, Xingzhu Wang, Peiyuan Gong, Yankai Lin, Jiaxin Mao