Character is Destiny: Can Role-Playing Language Agents Make Persona-Driven Decisions? (2404.12138v2)

Published 18 Apr 2024 in cs.AI

Abstract: Can LLMs simulate humans in making important decisions? Recent research has unveiled the potential of using LLMs to develop role-playing language agents (RPLAs), which mainly mimic the knowledge and tones of various characters. However, imitative decision-making necessitates a more nuanced understanding of personas. In this paper, we benchmark the ability of LLMs in persona-driven decision-making. Specifically, we investigate whether LLMs can predict characters' decisions given the preceding stories in high-quality novels. Leveraging character analyses written by literary experts, we construct a dataset, LIFECHOICE, comprising 1,462 characters' decision points from 388 books. We then conduct comprehensive experiments on LIFECHOICE with various LLMs and RPLA methodologies. The results demonstrate that state-of-the-art LLMs exhibit promising capabilities in this task, yet substantial room for improvement remains. Hence, we further propose the CHARMAP method, which adopts persona-based memory retrieval and significantly advances RPLAs on this task, achieving a 5.03% increase in accuracy.
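
The abstract names two ideas that a small example can make concrete: retrieving story passages that are relevant to a character's persona, and scoring decision predictions by accuracy. The sketch below is only an illustration under stated assumptions and is not the authors' CHARMAP implementation. The function names (`retrieve_by_persona`, `accuracy`), the word-overlap scoring, and the toy passages are all hypothetical; the paper does not specify its retriever at this level of detail.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def retrieve_by_persona(persona, passages, k=2):
    """Return the k passages sharing the most words with the persona text.

    Assumption: plain word overlap stands in for whatever relevance
    model a real system would use (for example BM25 or embeddings).
    """
    persona_counts = Counter(tokenize(persona))

    def overlap(passage):
        passage_counts = Counter(tokenize(passage))
        return sum(min(n, persona_counts[w]) for w, n in passage_counts.items())

    # sorted() is stable, so ties keep their original story order.
    return sorted(passages, key=overlap, reverse=True)[:k]


def accuracy(predictions, gold):
    """Fraction of predicted choices that match the gold choices."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical toy data, not taken from LIFECHOICE.
persona = "proud stubborn loyal"
passages = [
    "She was too proud to accept help.",
    "The weather was mild that spring.",
    "He remained loyal and stubborn to the end.",
]
top_passages = retrieve_by_persona(persona, passages, k=2)
score = accuracy(["A", "B", "C"], ["A", "B", "D"])
```

In a full pipeline, the retrieved passages would be placed in the model's prompt before it picks among the candidate decisions, and `accuracy` would be computed over all decision points.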

Authors (9)
  1. Rui Xu (198 papers)
  2. Xintao Wang (132 papers)
  3. Jiangjie Chen (46 papers)
  4. Siyu Yuan (46 papers)
  5. Xinfeng Yuan (6 papers)
  6. Jiaqing Liang (62 papers)
  7. Zulong Chen (19 papers)
  8. Xiaoqing Dong (2 papers)
  9. Yanghua Xiao (151 papers)