Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models (2310.06983v1)
Abstract: Recent research shows that LLMs exhibit a compelling level of proficiency in Theory of Mind (ToM) tasks. This ability to impute unobservable mental states to others is vital to human social cognition and may prove equally important in principal-agent relations between individual humans and Artificial Intelligences (AIs). In this paper, we explore how a mechanism studied in developmental psychology known as Violation of Expectation (VoE) can be implemented to reduce errors in LLM predictions about users by leveraging emergent ToM affordances, and we introduce a *metacognitive prompting* framework to apply VoE in the context of an AI tutor. By storing and retrieving facts derived in cases where the LLM's expectation about the user was violated, we find that LLMs are able to learn about users in ways that echo theories of human learning. Finally, we discuss latent hazards and augmentative opportunities associated with modeling user psychology, and we propose ways to mitigate risk along with possible directions for future inquiry.
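To make the VoE loop described above concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `llm()` completion function, the list-based fact store, and the prompt wording are hypothetical stand-ins, not the paper's actual metacognitive prompting framework or memory architecture.

```python
# Minimal sketch of a Violation-of-Expectation (VoE) loop for an AI tutor.
# Assumptions (not from the paper): a generic `llm(prompt) -> str` completion
# function and a plain list as the fact store stand in for whatever model,
# retrieval layer, and memory the authors actually use.

def llm(prompt: str) -> str:
    """Placeholder for a chat-completion call (e.g., an API client)."""
    raise NotImplementedError

fact_store: list[str] = []  # facts learned when expectations were violated


def voe_step(conversation: str, user_reply: str) -> None:
    # 1. Metacognitive prompt: ask the model to predict the user's reply.
    prediction = llm(
        "Given the conversation so far, predict how the user will respond.\n"
        f"Conversation:\n{conversation}"
    )

    # 2. Compare the prediction with what the user actually said.
    verdict = llm(
        "Did the actual reply violate the predicted reply? Answer YES or NO, "
        f"then explain.\nPredicted: {prediction}\nActual: {user_reply}"
    )

    # 3. On violation, derive and store a fact about the user for later retrieval.
    if verdict.strip().upper().startswith("YES"):
        fact = llm(
            "State one concise fact about the user that explains why the "
            f"prediction failed.\nPredicted: {prediction}\nActual: {user_reply}"
        )
        fact_store.append(fact)


def answer_with_memory(conversation: str) -> str:
    # Retrieved facts are prepended so future tutoring turns reflect what
    # was learned from earlier expectation violations.
    memory = "\n".join(fact_store)
    return llm(f"Known facts about the user:\n{memory}\n\n{conversation}")
```

In this sketch, each tutoring turn calls `voe_step` to update the fact store and `answer_with_memory` to condition the next response on the accumulated facts; a production system would likely replace the list with a retrieval-augmented store.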
Authors: Courtland Leer, Vincent Trost, Vineeth Voruganti