ToM Categorizations: Taxonomies & Computational Models
- Theory-of-mind categorizations are formal frameworks that delineate and systematize the attribution of beliefs, desires, and intentions across human, AI, and hybrid systems.
- They integrate multiple belief orders with neural and symbolic representations, creating scalable taxonomies that support interactive and adaptive AI.
- These categorizations enable rigorous evaluation and optimization of recursive reasoning in multi-agent setups, enhancing narrative understanding and social task performance.
Theory-of-mind (ToM) categorizations refer to formal frameworks, empirical taxonomies, and algorithmic implementations that systematize the types, levels, and components of ToM reasoning in both humans and artificial agents. These categorizations distinguish the abilities required to attribute mental states (such as beliefs, desires, intentions, and knowledge) to self and others, examine the order and structure of recursive belief modeling, and clarify the context-dependent ontologies spanning human-centric, silico-centric (AI-AI), and hybrid interactive scenarios. Recent research extends traditional psychological constructs to explicit computational representations, neural embeddings, and behavioral protocols, yielding a landscape of ToM categories essential for evaluating, interpreting, and engineering ToM in artificial systems.
1. Orders and Modalities of Theory-of-Mind Representation
Central to ToM categorizations is the formal analysis of belief order, i.e., how deeply an agent can model nested attributions of mental states:
- Zero-order belief (B or B₀): The direct representation of reality, independent of agent perspective.
- First-order belief (Bₚ): What agent p thinks about the world.
- Second-order belief (Bₚ,ₓ): What agent p thinks agent x thinks.
- General k-th order: Bₚ₁,ₚ₂,…,ₚₖ denotes, “what p₁ thinks p₂ thinks … pₖ thinks” about the world.
Such nesting supplies the structural backbone for both symbolic systems and neural models (Sclar et al., 2023, Zhu et al., 28 Feb 2024). In practical computational pipelines (e.g., SYMBOLICTOM), each such belief is maintained as an explicit graph or subspace, with inference mechanisms and knowledge propagation tied to witnessed events and agent participation.
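To make the nesting concrete, the sketch below keys each belief chain Bₚ₁,…,ₚₖ by the tuple of agents, with the empty tuple holding the zero-order world state; only chains whose agents witnessed an event get updated. This is a minimal illustration of witnessed-event belief tracking, not the SYMBOLICTOM implementation, and all names are illustrative.

```python
from itertools import product

# Minimal sketch: k-th order beliefs B_{p1,...,pk} are keyed by the
# tuple (p1, ..., pk); the empty tuple () holds the zero-order state B0.
# Each belief maps an entity to its believed property (here, a location).
class BeliefStore:
    def __init__(self, max_order=2):
        self.max_order = max_order
        self.beliefs = {(): {}}  # tuple of agents -> {entity: value}

    def observe(self, entity, value, witnesses):
        """Record an event in reality and in every belief chain whose
        agents all witnessed it; non-witnesses keep stale beliefs."""
        self.beliefs[()][entity] = value
        for k in range(1, self.max_order + 1):
            for chain in product(witnesses, repeat=k):
                self.beliefs.setdefault(chain, {})[entity] = value

    def query(self, *chain):
        """What chain[0] thinks chain[1] thinks ... about the world."""
        return self.beliefs.get(chain, {})

# Sally-Anne false-belief scenario:
store = BeliefStore()
store.observe("ball", "basket", witnesses=["sally", "anne"])
store.observe("ball", "box", witnesses=["anne"])  # Sally has left the room
assert store.query()["ball"] == "box"             # B0: reality
assert store.query("sally")["ball"] == "basket"   # B_sally: false belief
assert store.query("anne", "sally")["ball"] == "basket"  # B_anne,sally
```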
2. Human-Centric, Silico-Centric, and Hybrid ToM Taxonomies
Recent work distinguishes between distinct ontological categories based on the nature of the agents and the information asymmetries present:
Human-Centric ToM
- Attributes beliefs, intentions, and deception to human characters described in narratives.
- Evaluated using developmental psychology protocols (e.g., Strange Stories test, Sally-Anne task), which introduce explicit information asymmetries—false beliefs, hidden facts—to probe first- and higher-order belief attribution (Mukherjee et al., 14 Mar 2024).
Silico-Centric ToM
- Focuses on attributing epistemic and procedural states to artificial agents (e.g., architectural clones of LLMs).
- In the silico-centric paradigm, agents operate under perfect informational symmetry (e.g., identical weights and knowledge), so advice passed from one clone to another is redundant.
- Experimental results demonstrate that current LLMs over-attribute uncertainty and produce unnecessary guidance in silico-centric settings, failing to recognize this symmetry (Mukherjee et al., 14 Mar 2024).
Hybrid and Interactive ToM
- Encompasses multi-agent human-AI systems in interactive contexts (e.g., recommendation, games), where nested, mutual modeling and adaptation emerge.
- Taxonomies (see Section 4) formalize different levels of agent agency and recursive meta-modeling (Çelikok et al., 2019).
3. Dimensional and Factorial Decompositions of ToM Abilities
ToM abilities can be decomposed along psychological, computational, and latent-factor axes:
Five-Dimensional ToM in Narrative Understanding
Yu et al. (2022) introduce a schema explicitly distinguishing:
- Personality (long-term, recurrent)
- Emotion (current affect)
- Belief (current epistemic state)
- Desire (medium-term goals)
- Intention (immediate goal/action)
Formally, a character's mental state can be summarized as the tuple M = (P, E, B, D, I), where the personality component P is held fixed across the narrative while emotion E, belief B, desire D, and intention I are re-estimated as the story unfolds.
Ablation studies show intention and belief are most critical for few-shot character inference tasks.
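As a small illustration of the decomposition (the record type and field names are hypothetical, not the authors' code), the schema separates one time-invariant trait from four state components inferred per scene:

```python
from dataclasses import dataclass

# Hypothetical sketch of the five-dimensional schema; field names are
# illustrative, not taken from Yu et al.'s implementation.
@dataclass(frozen=True)
class CharacterState:
    personality: str  # long-term, recurrent trait (time-invariant)
    emotion: str      # current affect
    belief: str       # current epistemic state
    desire: str       # medium-term goal
    intention: str    # immediate goal/action

scene_t = CharacterState(
    personality="cautious",
    emotion="anxious",
    belief="the ball is in the basket",
    desire="recover the ball",
    intention="search the basket",
)
```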
Two-Factor ToM in Human Cognitive Performance
Nguyen et al. (7 Nov 2025) use exploratory factor analysis (EFA) and structural equation modeling (SEM) to extract two latent ToM factors:
- Factor 1 (“Cognitive–Spatial–Emotional”): High recursive thinking, emotional perceptiveness, spatial reasoning.
- Factor 2 (“Interpersonal–Rational”): Deliberative–analytic style, trait empathy.
Their measurement model takes the standard SEM form x = Λη + ε, with observed ToM measures x loading via Λ on the latent factor vector η = (F₁, F₂) and residual terms ε.
Factor 1 enhances adversarial game performance, while Factor 2 impedes it under time pressure.
Taxonomy Table
| Dimension/Factor | Description/Constituents | Reference |
|---|---|---|
| Belief Order | 0th, 1st, 2nd, ..., k-th order nested belief states | (Sclar et al., 2023) |
| Five Dimensions | Personality, Emotion, Belief, Desire, Intention | (Yu et al., 2022) |
| Factor 1 | Recursive, emotional, spatial reasoning integration | (Nguyen et al., 7 Nov 2025) |
| Factor 2 | Rational, interpersonal skills | (Nguyen et al., 7 Nov 2025) |
| ToM Agency Level | Fixed, Passive, User-planning, Mutual adaptation | (Çelikok et al., 2019) |
4. Levels of Agency, Recursion, and Mutual Modeling
“Interactive AI with a Theory of Mind” (Çelikok et al., 2019) formalizes ToM categorization in user modeling for interactive systems, leading to a four-level taxonomy:
- Level 1: Fixed user model, no adaptation, ToM level 0.
- Level 2: Passive, reactive user; the system learns user parameters, but there is no mutual adaptation.
- Level 3: User as bounded rational planner with anticipation, but static system; ToM level 2.
- Level 4: Both user and AI model each other recursively (I-POMDP framework), supporting mutual adaptation; ToM level 3+.
Each level entails increasing degrees of recursive belief reasoning and supports more sophisticated interactive performance (as evidenced by bandit-based proof-of-concept studies).
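The recursion these levels add can be made concrete with a toy level-k reasoner; the sketch below is a didactic stand-in for the taxonomy's recursive structure, not the I-POMDP machinery of Çelikok et al. (2019). In matching pennies, the matcher wins on equal coins and the mismatcher on unequal ones, and a level-k agent best-responds to a simulated level-(k-1) opponent:

```python
# Toy level-k recursion in matching pennies (illustrative only).
def flip(m: str) -> str:
    return "tails" if m == "heads" else "heads"

def move(role: str, k: int) -> str:
    """Level 0 plays a fixed policy (no ToM); level k best-responds
    to a level-(k-1) simulation of the other agent."""
    if k == 0:
        return "heads"
    other = "mismatcher" if role == "matcher" else "matcher"
    predicted = move(other, k - 1)  # simulate the opponent one level down
    return predicted if role == "matcher" else flip(predicted)

for k in range(4):
    print(k, move("matcher", k), move("mismatcher", k))
# The advantage oscillates as recursion deepens, which is one reason the
# taxonomy's top level targets mutual adaptation rather than ever-deeper
# one-sided simulation.
```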
5. Neural and Symbolic Encodings of ToM Subspaces
Neural ToM representations can be linearly decoded from LLM activations, with distinct subspaces corresponding to self (“oracle”) and other (“protagonist”) agent beliefs (Zhu et al., 28 Feb 2024):
- Linear probes identify per-head directions aligned with belief type.
- Manipulating these directions at inference time can causally influence model ToM performance, as evidenced by increased accuracy on false-belief tasks when the w_T+ direction (“protagonist-true vs. oracle-true”) is amplified.
- The generalization of belief subspaces across tasks suggests a nascent taxonomy of “mental-state directions,” with the possibility of analogously defining and stimulating subspaces for desires, intentions, or emotions.
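A hedged sketch of this probe-and-steer recipe, with synthetic arrays standing in for cached per-head activations (the shapes, the steering coefficient, and all names are illustrative assumptions, not Zhu et al.'s code):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic placeholders for cached per-head activations on stories
# labeled by belief type (1 = protagonist-true, 0 = oracle-true).
rng = np.random.default_rng(0)
acts = rng.normal(size=(512, 64))
labels = rng.integers(0, 2, size=512)

# Linear probe: its weight vector is the candidate belief direction.
probe = LogisticRegression(max_iter=1000).fit(acts, labels)
w_T = probe.coef_[0] / np.linalg.norm(probe.coef_[0])

def steer(activation: np.ndarray, alpha: float = 4.0) -> np.ndarray:
    """Add the protagonist-true direction at inference time; in the
    setting described above this causally shifts false-belief behavior."""
    return activation + alpha * w_T

steered = steer(acts[0])
```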
Hybrid approaches such as SYMBOLICTOM (Sclar et al., 2023) employ explicit symbolic graphs for each order of belief (across all agents), granting interpretability and modularity. Each Bₚ₁,…,ₚₖ graph is updated upon witnessed events, while reality is tracked in a global context node. These representations dramatically boost zero-shot performance on higher-order ToM tasks and generalize robustly to out-of-distribution linguistic variation and longer narratives.
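The sketch below mimics that structure with one directed graph per belief chain plus a global reality graph; it is an illustration in the spirit of SYMBOLICTOM, not the parser-driven pipeline from the paper (events arrive as explicit triples rather than narrative sentences):

```python
import networkx as nx
from itertools import product

graphs = {(): nx.DiGraph()}  # () holds the global reality/context graph

def witness(subject, relation, obj, witnesses, max_order=2):
    chains = [()] + [c for k in range(1, max_order + 1)
                     for c in product(witnesses, repeat=k)]
    for chain in chains:
        g = graphs.setdefault(chain, nx.DiGraph())
        if subject in g:
            # A new fact supersedes stale edges for the same relation in
            # this belief graph only; non-witnesses keep old edges.
            stale = [(subject, t) for t in list(g.successors(subject))
                     if g.edges[subject, t]["relation"] == relation]
            g.remove_edges_from(stale)
        g.add_edge(subject, obj, relation=relation)

witness("key", "located_in", "drawer", witnesses=["alice", "bob"])
witness("key", "located_in", "cabinet", witnesses=["bob"])  # Alice absent

print(list(graphs[()].successors("key")))                # ['cabinet']
print(list(graphs[("alice",)].successors("key")))        # ['drawer']
print(list(graphs[("bob", "alice")].successors("key")))  # ['drawer']
```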
6. Applications, Limitations, and Open Questions
Contemporary ToM categorizations underpin:
- Evaluation of AI and LLMs on human-like social tasks, including narrative understanding, character inference, and interactive control.
- Design of multi-agent systems and human-computer interaction frameworks that require modeling of nested and dynamic belief states.
- Development of plug-and-play enhancements that explicitly encode ToM reasoning without architectural changes.
However, experimental evidence shows systematic failures:
- LLMs excel at human-centric ToM but fail to recognize informational symmetry in silico-centric settings (Mukherjee et al., 14 Mar 2024).
- Without explicit belief representations, existing neural models may latch onto spurious patterns and lack out-of-distribution robustness (Sclar et al., 2023, Yu et al., 2022).
- Most ToM metrics, aside from recursive thinking, are insufficient predictors of effective adaptive behavior in real adversarial settings (Nguyen et al., 7 Nov 2025).
Future work aims at:
- Formalizing belief-operator frameworks for AI-AI and multi-agent systems
- Extending ToM tests to heterogeneous agents and dynamic epistemic states
- Developing more granular, modular, and explainable representations—symbolic and neural—capable of supporting dynamic, real-world multi-agent scenarios (Çelikok et al., 2019, Mukherjee et al., 14 Mar 2024).
7. Synthesis and Future Research Directions
Theory-of-mind categorizations are converging toward multi-dimensional, multi-level frameworks that separate (a) agent ontology (human, AI, hybrid), (b) mental-state dimensions and factors, (c) belief-order depth, and (d) neural, symbolic, and interactive computational instantiations. Advances in explicit belief tracking, neural subspace identification, and interactive agency modeling provide foundational tools for benchmarking and engineering socially aware, collaborative AI. Limitations in generalization, symmetry reasoning, and mutual modeling persist, motivating the development of rigorous formal frameworks and robust evaluation protocols, especially across heterogeneous multi-agent, dynamically evolving systems. This taxonomy is central to the next generation of explainable, adaptive, and context-sensitive AI systems.