Decompose-ToM: Enhancing Theory of Mind Reasoning in Large Language Models through Simulation and Task Decomposition (2501.09056v1)

Published 15 Jan 2025 in cs.CL and cs.AI

Abstract: Theory of Mind (ToM) is the ability to understand and reflect on the mental states of others. Although this capability is crucial for human interaction, testing on LLMs reveals that they possess only a rudimentary understanding of it. Although the most capable closed-source LLMs have come close to human performance on some ToM tasks, they still perform poorly on complex variations of the task that involve more structured reasoning. In this work, we utilize the concept of "pretend-play", or "Simulation Theory", from cognitive psychology to propose "Decompose-ToM": an LLM-based inference algorithm that improves model performance on complex ToM tasks. We recursively simulate user perspectives and decompose the ToM task into a simpler set of functions: subject identification, question-reframing, world model updation, and knowledge availability. We test the algorithm on higher-order ToM tasks and a task testing for ToM capabilities in a conversational setting, demonstrating that our approach shows significant improvement across models compared to baseline methods while requiring minimal prompt tuning across tasks and no additional model training.

Enhancing Theory of Mind in LLMs via Task Decomposition

The paper "Decompose-ToM: Enhancing Theory of Mind Reasoning in LLMs through Simulation and Task Decomposition" addresses a critical aspect of natural language processing and artificial intelligence: the capability of LLMs to perform Theory of Mind (ToM) reasoning. Theory of Mind pertains to the ability to attribute and infer the mental states of others. Despite recent advances, LLMs display limitations in higher-order ToM tasks, particularly those requiring intricate structured reasoning. The authors propose a novel inference algorithm, Decompose-ToM, which leverages cognitive psychology concepts like pretend play and knowledge-access to improve LLM performance on complex ToM tasks.

Methodology and Evaluation

The proposed algorithm, Decompose-ToM, deconstructs the ToM reasoning challenge into two streamlined subtasks: recursive simulation of an agent's perspective, and a granular knowledge-availability problem that determines which pieces of information each agent has access to. This method builds on developmental psychology insights wherein pretend-play serves as a precursor to ToM development in children, and incorporates computational strategies inspired by Rational Speech Act models. By recursively simulating agent perspectives and reframing ToM questions as factual inquiries, Decompose-ToM significantly enhances the capability of LLMs to reason about complex mental states.
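The core idea can be illustrated with a deterministic toy model. Here the story is a list of events, each tagged with the agents who observed it; a belief question is answered by recursively filtering the world model to what each nested agent saw, until only a factual question remains. The event structure and function names below are our own illustrative assumptions; the paper performs each of these steps with LLM prompts rather than hand-coded filters.

```python
from dataclasses import dataclass

@dataclass
class Event:
    fact: tuple           # e.g. ("ball", "basket") meaning "the ball is in the basket"
    observers: frozenset  # agents who witnessed this event

def visible_to(events, agent):
    """Knowledge-availability step: keep only events the agent observed."""
    return [e for e in events if agent in e.observers]

def belief(events, agent_chain, obj):
    """Recursive perspective simulation for questions like
    'Where does A think (B thinks ...) obj is?'"""
    if not agent_chain:
        # Base case: the reframed question is factual --
        # return the last known location of obj.
        locs = [e.fact[1] for e in events if e.fact[0] == obj]
        return locs[-1] if locs else None
    head, *rest = agent_chain
    # Simulate head's world model, then recurse one nesting level down.
    return belief(visible_to(events, head), rest, obj)

# Classic Sally-Anne setup: Sally puts the ball in the basket (both see it),
# then Anne moves it to the box while Sally is away.
story = [
    Event(("ball", "basket"), frozenset({"Sally", "Anne"})),
    Event(("ball", "box"),    frozenset({"Anne"})),
]

print(belief(story, ["Sally"], "ball"))          # basket (first-order ToM)
print(belief(story, ["Anne", "Sally"], "ball"))  # basket (second-order ToM)
print(belief(story, [], "ball"))                 # box (ground truth)
```

The recursion mirrors the paper's decomposition: each level strips one "X thinks" layer from the question (question reframing) while restricting the world model to X's observations (knowledge availability), so the final call is an ordinary factual lookup.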

Empirical evaluations were conducted on the Hi-ToM and FANToM datasets, which probe LLMs' understanding of ToM through higher-order reasoning tests and dialogue-based interactions, respectively. Performance metrics, such as accuracy on multiple-choice questions, indicate marked improvements in task performance across several models when using Decompose-ToM compared to baseline methods such as zero-shot prompting and Chain-of-Thought reasoning.

Results and Implications

Notably, applying Decompose-ToM led to a substantial improvement in accuracy for larger LLMs such as GPT-4o and Llama-3-70B in higher-order ToM tasks, with modest gains also observed for smaller models. Importantly, the method effectively mitigates performance degradation typically observed with increased context length, delivering almost uniform accuracy for tasks across varying complexities.

This research extends theoretical implications by elucidating how cognitive developmental methodologies can fortify LLM ToM capabilities without necessitating new model architectures or intensive prompt calibrations. Practically, this approach paves the way toward building advanced, socially-aware AI systems capable of nuanced human-like social interactions, supporting applications in domains such as human-computer interaction, empathic communication systems, and advanced virtual assistants.

Future Directions

Future work could refine Decompose-ToM's computational efficiency and extend its adaptability to broader, more diverse tasks. Additionally, integrating this framework into agentic LLM paradigms could enable models to perform the task decomposition autonomously, further enhancing the applicability and robustness of the method.

In conclusion, Decompose-ToM represents a significant step forward in aligning LLM capabilities with human social reasoning, functioning as a scalable and versatile approach to tackling higher-order and naturalistic ToM tasks in artificial systems.

Authors (3)
  1. Sneheel Sarangi (2 papers)
  2. Maha Elgarf (1 paper)
  3. Hanan Salam (10 papers)