EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States

Published 5 Mar 2025 in cs.CL | (2503.03340v2)

Abstract: Theory-of-Mind (ToM), the ability to infer others' perceptions and mental states, is fundamental to human interaction but remains challenging for LLMs. While existing ToM reasoning methods show promise with reasoning via perceptual perspective-taking, they often rely excessively on off-the-shelf LLMs, reducing their efficiency and limiting their applicability to high-order ToM reasoning. To address these issues, we present EnigmaToM, a novel neuro-symbolic framework that enhances ToM reasoning by integrating a Neural Knowledge Base of entity states (Enigma) for (1) a psychology-inspired iterative masking mechanism that facilitates accurate perspective-taking and (2) knowledge injection that elicits key entity information. Enigma generates structured knowledge of entity states to build spatial scene graphs for belief tracking across various ToM orders and enrich events with fine-grained entity state details. Experimental results on ToMi, HiToM, and FANToM benchmarks show that EnigmaToM significantly improves ToM reasoning across LLMs of varying sizes, particularly excelling in high-order reasoning scenarios.

Summary

  • The paper introduces a neuro-symbolic framework that integrates a Neural Knowledge Base to represent entity states for enhanced Theory-of-Mind reasoning.
  • The paper employs an iterative masking mechanism to enable efficient perspective-taking and scalable multi-hop reasoning by reducing the complexity of unobservable events.
  • The paper demonstrates up to a 16% improvement in higher-order ToM accuracy over baselines, highlighting its practical benefits for complex social inference tasks.

EnigmaToM: Enhance LLMs' Theory-of-Mind with Entity-Based Knowledge

Introduction

The paper "EnigmaToM: Improve LLMs' Theory-of-Mind Reasoning Capabilities with Neural Knowledge Base of Entity States" addresses the limitations of LLMs in performing Theory-of-Mind (ToM) reasoning, particularly in higher-order scenarios that require multi-hop reasoning. The approach integrates a neuro-symbolic framework leveraging a Neural Knowledge Base (NKB) for improved ToM capabilities.

Framework Overview

EnigmaToM advances ToM reasoning via a neuro-symbolic framework that includes:

  1. Neural Knowledge Base (NKB): A repository of structured representations of entity states, essential for constructing spatial scene graphs for belief tracking.
  2. Iterative Masking Mechanism: This psychology-inspired technique facilitates efficient perspective-taking by masking unobservable events specific to a given character's viewpoint.
  3. Knowledge Injection: Enriches events with fine-grained entity state details, essential for accurate ToM inference.

This multi-layered approach allows LLMs to offload complex reasoning to a symbolic component, improving efficiency and scalability in ToM tasks.

Figure 1: Example use-case of framework in fourth-order ToM reasoning.
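
As a rough illustration of the knowledge-injection step listed above, the sketch below appends entity-state details to each event before the story is handed to an LLM. The `Event` class, the `entity_states` field, and the bracketed output format are assumptions made for this example, not the paper's actual data structures or prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """One story event plus entity-state details attached to it."""
    text: str
    # Hypothetical field: state facts an entity-state model might return.
    entity_states: list[str] = field(default_factory=list)


def inject_knowledge(events: list[Event]) -> str:
    """Render events as numbered lines, appending state details in brackets."""
    lines = []
    for i, ev in enumerate(events, start=1):
        line = f"{i}. {ev.text}"
        if ev.entity_states:
            line += " [" + "; ".join(ev.entity_states) + "]"
        lines.append(line)
    return "\n".join(lines)


# Toy story; the state strings are hand-written stand-ins for model output.
story = [
    Event("Anna put the apple in the basket.", ["apple location: basket"]),
    Event("Anna left the kitchen.", ["Anna location: outside kitchen"]),
    Event("Bob moved the apple to the drawer.", ["apple location: drawer"]),
]
enriched = inject_knowledge(story)
print(enriched)
```

The enriched text makes implicit state changes explicit, which is the role the summary attributes to knowledge injection.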

Implementation Details

Neural Knowledge Base

The NKB is implemented as a language model trained in a sequence-to-sequence manner on the OpenPI2.0 dataset, which provides human-annotated entity state changes. It generates the entity-state information needed to construct spatial scene graphs and to reason effectively about events.
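
To make the link between entity-state output and scene graphs more concrete, here is a minimal sketch that turns a list of state-change records into one location snapshot per story step. The `StateChange` record and `build_scene_graphs` function are hypothetical; the records are hand-written stand-ins for what a trained model might produce, and no model is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateChange:
    """Hypothetical structured record for one entity-state change."""
    step: int        # index of the event that caused the change
    entity: str      # e.g. "apple" or "Anna"
    attribute: str   # e.g. "location"
    after: str       # value of the attribute once the event has happened


def build_scene_graphs(changes: list[StateChange], num_steps: int) -> list[dict[str, str]]:
    """Return one {entity: location} snapshot per story step."""
    snapshots = []
    current: dict[str, str] = {}
    for step in range(num_steps):
        for ch in changes:
            if ch.step == step and ch.attribute == "location":
                current[ch.entity] = ch.after
        snapshots.append(dict(current))  # copy so later steps do not mutate it
    return snapshots


changes = [
    StateChange(0, "Anna", "location", "kitchen"),
    StateChange(0, "Bob", "location", "kitchen"),
    StateChange(0, "apple", "location", "basket"),
    StateChange(1, "Anna", "location", "garden"),
    StateChange(2, "apple", "location", "drawer"),
]
graphs = build_scene_graphs(changes, num_steps=3)
print(graphs[-1])
```

Each snapshot plays the part of an omniscient scene graph at that step; a fuller version would also store edges such as containment between objects and rooms.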

Perspective-Taking Mechanism

Perspective-taking is executed by constructing both omniscient and character-centric spatial scene graphs. These graphs are then masked iteratively, one character's perspective at a time, to exclude events that the character could not observe. This iterative masking reduces the LLM's reasoning burden by leaving only the relevant, observable events.
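
The iterative masking idea can be sketched as a chain of filters over events. This is a simplification under stated assumptions: each event carries an explicit `witnesses` set (in the framework this would be derived from scene graphs), and the names `Event` and `mask_events` are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    text: str
    witnesses: frozenset[str]  # characters present when the event happened


def mask_events(events: list[Event], chain: list[str]) -> list[Event]:
    """Keep only events visible along a perspective chain.

    chain=["Anna", "Bob"] models "what Anna thinks Bob knows": first keep
    what Anna observed, then keep what Bob observed among those.
    """
    visible = events
    for character in chain:
        visible = [ev for ev in visible if character in ev.witnesses]
    return visible


events = [
    Event("Anna put the apple in the basket.", frozenset({"Anna", "Bob"})),
    Event("Anna left the kitchen.", frozenset({"Anna", "Bob"})),
    Event("Bob moved the apple to the drawer.", frozenset({"Bob"})),
]

first_order = mask_events(events, ["Anna"])          # Anna's own view
second_order = mask_events(events, ["Bob", "Anna"])  # Bob's view of Anna's view
print([ev.text for ev in first_order])
```

Because Anna did not witness the move, the masked story supports the answer that Anna believes the apple is still in the basket. Longer chains reuse the same filter, so higher-order questions add no new machinery in this sketch.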

Efficiency Considerations

The design of the EnigmaToM framework ensures scalability with respect to both the number of entities and the order of ToM reasoning. The masking mechanism allows the computational complexity to scale linearly with the number of entities, independent of the required depth of reasoning.

Experimental Results

EnigmaToM was tested on various datasets, revealing substantial improvements over baseline methods across all ToM reasoning orders, especially in scenarios requiring higher-order reasoning (up to the fourth order).

Figure 2: Relative advantage of EnigmaToM on the HiToM dataset with respect to ToM order.

  • Higher-Order Reasoning: Demonstrated increased advantages in more complex ToM tasks, with up to 16% improvement in third-order reasoning accuracy.
  • Baseline Comparison: Achieved superior results compared to existing methods such as SimulatedToM and TimeToM, particularly with smaller model architectures.

Analysis

Ablation Study

Removing the iterative masking mechanism resulted in significant performance drops, emphasizing the necessity of perspective-taking. By contrast, the impact of knowledge injection varied: it mattered substantially on dialogue-based datasets (e.g., FANToM) but less on event-based ones.

Model Scaling Impact

Further analysis showed that the size of the base LLM affects EnigmaToM's efficacy: its gains diminish on ToMi but grow on FANToM as models get larger. This underscores the essential role of entity state knowledge compression in narrative interpretation.

Conclusion

EnigmaToM effectively enhances LLMs' ToM reasoning by combining neural representations and symbolic reasoning techniques. Its efficiency and scalability make it a promising approach for deploying AI systems requiring advanced social intelligence capabilities.

Future Scope

Further research could explore non-perceptual ToM reasoning, expanding the NKB's scope to encompass emotional and causal state information. Additionally, better control of error propagation within the masking mechanism could improve robustness.
