Human Simulacra: Benchmarking the Personification of Large Language Models (2402.18180v5)

Published 28 Feb 2024 in cs.CY

Abstract: LLMs are recognized as systems that closely mimic aspects of human intelligence. This capability has attracted attention from the social science community, which sees potential in leveraging LLMs to replace human participants in experiments, thereby reducing research costs and complexity. In this paper, we introduce a framework for LLM personification, including a strategy for constructing virtual characters' life stories from the ground up, a Multi-Agent Cognitive Mechanism capable of simulating human cognitive processes, and a psychology-guided evaluation method to assess human simulations from both self and observational perspectives. Experimental results demonstrate that our constructed simulacra can produce personified responses that align with their target characters. Our work is a preliminary exploration that offers great potential for practical applications. All the code and datasets will be released, with the hope of inspiring further investigations.
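
To make the idea of a Multi-Agent Cognitive Mechanism more concrete, the sketch below is a minimal, illustrative Python mock-up, not the authors' implementation. It assumes a hypothetical character profile plus three toy agents (memory retrieval, emotional appraisal, response synthesis) coordinated by a simple pipeline; in a real system the response step would prompt an LLM rather than use string templating.

```python
from dataclasses import dataclass, field

@dataclass
class CharacterProfile:
    """Life-story attributes for one virtual character (illustrative fields only)."""
    name: str
    biography: str
    personality: str
    memories: list[str] = field(default_factory=list)

def memory_agent(profile: CharacterProfile, stimulus: str) -> list[str]:
    """Retrieve memories that share words with the stimulus (toy keyword retrieval)."""
    words = set(stimulus.lower().split())
    return [m for m in profile.memories if words & set(m.lower().split())]

def emotion_agent(stimulus: str) -> str:
    """Assign a coarse emotional appraisal from simple keyword cues."""
    if any(w in stimulus.lower() for w in ("loss", "fail", "alone")):
        return "distressed"
    return "calm"

def response_agent(profile: CharacterProfile, stimulus: str,
                   recalled: list[str], emotion: str) -> str:
    """Fuse retrieved memories and the appraised emotion into a persona-grounded reply.
    A real implementation would prompt an LLM here; this is plain templating."""
    memory_note = recalled[0] if recalled else "nothing specific comes to mind"
    return (f"As {profile.name} ({profile.personality}), feeling {emotion}: "
            f"asked '{stimulus}', I recall that {memory_note}.")

def simulate(profile: CharacterProfile, stimulus: str) -> str:
    """Top-level cognitive loop: perception -> memory -> emotion -> response."""
    recalled = memory_agent(profile, stimulus)
    emotion = emotion_agent(stimulus)
    return response_agent(profile, stimulus, recalled, emotion)

if __name__ == "__main__":
    alice = CharacterProfile(
        name="Alice",
        biography="Grew up in a small coastal town; studied marine biology.",
        personality="introverted, conscientious",
        memories=["the town harbour froze over one winter",
                  "failing her first diving exam taught her patience"],
    )
    print(simulate(alice, "What did failing your diving exam teach you?"))
```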

Authors (10)
  1. Qiuejie Xie (1 paper)
  2. Qiming Feng (1 paper)
  3. Tianqi Zhang (17 papers)
  4. Qingqiu Li (11 papers)
  5. Yuejie Zhang (31 papers)
  6. Rui Feng (67 papers)
  7. Shang Gao (74 papers)
  8. Linyi Yang (52 papers)
  9. Liang He (202 papers)
  10. Yue Zhang (618 papers)