Satisficing Mentalizing in Bayesian ToM
- Satisficing mentalizing is a framework that simplifies Bayesian inference by using computationally efficient heuristic models to infer mental states under bounded rationality.
- The approach reduces complexity by clamping certain latent variables, with specialized models and an adaptive switching strategy that dynamically responds to prediction errors.
- Empirical studies show that switching models achieve near-optimal predictive performance at significantly reduced runtimes compared to full Bayesian methods.
Satisficing mentalizing refers to an inference regime within Bayesian Theory of Mind (BToM) that prioritizes computational efficiency by balancing accuracy and runtime requirements when attributing mental states to agents based on observed behavior. This approach is motivated by the recognition that both humans and artificial systems often operate under bounded rationality, rendering fully Bayesian inference over all possible mental-state configurations intractable. Instead, satisficing mentalizing employs simplified or heuristic models that are computationally tractable and empirically sufficient for decision-making, particularly in real-time or resource-constrained settings (Pöppel et al., 2019).
1. Conceptual Foundations and Motivation
Satisficing mentalizing builds directly on the concept of bounded rationality as articulated by Simon, emphasizing the need to reconcile ideal inference with practical tractability. In BToM, the goal is to infer latent mental-state variables—such as an agent’s goal, belief about the world, and belief about locations or assignments—by observing their actions. Because the number of possible combinations of these latent states grows exponentially, exact Bayesian inference rapidly becomes infeasible. Satisficing mentalizing therefore comprises strategies that deliver inference quality “good enough” to guide behavior effectively, at a fraction of the computational cost required for full Bayesian updating. This is analogous to the use of heuristics in human cognition and practical artificial reasoning systems.
2. The Full Bayesian ToM Model
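The full BToM model formalizes mentalizing inference as a joint posterior over the set of possible goals $G$, goal-beliefs $B_G$, and world-beliefs $B_W$, conditioned on the observed action history:
$$P(g, b_g, b_w \mid a_{1:t}), \quad g \in G,\; b_g \in B_G,\; b_w \in B_W$$
- $G$: possible goals (e.g., four colored exits)
- $B_G$: all possible assignments of colors to exit locations (24 configurations)
- $B_W$: world-beliefs, either the true layout or an unknown layout (freespace assumption)
- $a_t$: action at time $t$; $a_{1:t}$ denotes the action history up to time $t$
The model employs uniform priors and a Boltzmann policy for action likelihood:
$$P(a_t \mid g, b_g, b_w) = \frac{\exp\big(\beta\, U(a_t; g, b_g, b_w)\big)}{\sum_{a'} \exp\big(\beta\, U(a'; g, b_g, b_w)\big)}$$
where $U$ is the negative distance-to-goal (using the agent’s current belief about the environment), and $\beta$ is an inverse temperature parameter controlling rationality.
Posterior inference follows by Bayes’ rule:
$$P(g, b_g, b_w \mid a_{1:t}) \propto P(g, b_g, b_w) \prod_{\tau=1}^{t} P(a_\tau \mid g, b_g, b_w)$$
and marginalization yields the next-action prediction:
$$P(a_{t+1} \mid a_{1:t}) = \sum_{g \in G} \sum_{b_g \in B_G} \sum_{b_w \in B_W} P(a_{t+1} \mid g, b_g, b_w)\, P(g, b_g, b_w \mid a_{1:t})$$
Although statistically optimal, this workflow has a cost that scales with the number of latent-variable combinations, which grows exponentially with the number of latent variables (Pöppel et al., 2019, Section 2).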
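The inference steps of the full model (Boltzmann likelihood, Bayes update, marginal next-action prediction) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the one-dimensional corridor, the two goals and two goal-beliefs, the function names `boltzmann_policy`, `toy_utility`, `posterior`, and `predict_next`, and the choice of `beta = 1.0`. In this toy world the world-belief does not change distances, so it only enlarges the hypothesis space.

```python
import math
from itertools import product


def boltzmann_policy(utilities, beta):
    """Softmax over per-action utilities; beta is the inverse temperature."""
    peak = max(utilities.values())  # shift by the max for numerical stability
    weights = {a: math.exp(beta * (u - peak)) for a, u in utilities.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def toy_utility(state, hypothesis):
    """Negative distance to the believed goal position after each action.

    Hypothetical toy world: the agent stands at an integer position in a
    corridor and can step left or right.
    """
    goal, goal_belief, _world_belief = hypothesis
    target = dict(goal_belief)[goal]
    return {"left": -abs(state - 1 - target), "right": -abs(state + 1 - target)}


def posterior(hypotheses, history, beta):
    """Posterior over (goal, goal-belief, world-belief) under a uniform prior."""
    log_post = {
        h: sum(math.log(boltzmann_policy(toy_utility(s, h), beta)[a])
               for s, a in history)
        for h in hypotheses
    }
    peak = max(log_post.values())
    unnorm = {h: math.exp(lp - peak) for h, lp in log_post.items()}
    total = sum(unnorm.values())
    return {h: p / total for h, p in unnorm.items()}


def predict_next(hypotheses, history, state, beta):
    """Marginal next-action distribution: sum over all joint hypotheses."""
    post = posterior(hypotheses, history, beta)
    prediction = {"left": 0.0, "right": 0.0}
    for h, p in post.items():
        for action, q in boltzmann_policy(toy_utility(state, h), beta).items():
            prediction[action] += p * q
    return prediction


goals = ("red", "blue")
goal_beliefs = (
    (("red", 0), ("blue", 10)),   # believed exit positions, variant 1
    (("red", 10), ("blue", 0)),   # believed exit positions, variant 2
)
world_beliefs = ("true_layout", "freespace")
hypotheses = list(product(goals, goal_beliefs, world_beliefs))

# Observed (state, action) pairs: the agent walked left twice.
history = [(5, "left"), (4, "left")]
prediction = predict_next(hypotheses, history, state=3, beta=1.0)
print(prediction)
```

The loop over `hypotheses` is the expensive part: every prediction touches every joint combination of goal, goal-belief, and world-belief, which is the cost the simplified models are designed to avoid.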
3. Specialized Simplified Bayesian Models
To achieve tractability, specialized models clamp one or more mental-state variables to their true or default values, dramatically reducing inference complexity:
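| Model | Clamped Variable(s) | Complexity |
|---|---|---|
| True World & Goal (TWG) | world-belief $B_W$, goal-belief $B_G$ | $O(\lvert G \rvert)$ per action |
| True World (TW) | world-belief $B_W$ | $O(\lvert G \rvert \cdot \lvert B_G \rvert)$ |
| True Goal (TG) | goal-belief $B_G$ | $O(\lvert G \rvert \cdot \lvert B_W \rvert)$ |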
- TWG: Assumes agent holds true beliefs about both world and goal; sums only over possible goals.
- TW: Agent’s belief about world is accurate, but goal-belief is variable.
- TG: Agent’s goal-belief is accurate, but world-belief may be inaccurate; in path uncertainty conditions, always assumes a freespace world-belief model.
Under these restrictions, inference and action prediction entail only linear or bilinear summations, as opposed to full joint enumeration (Pöppel et al., 2019, Section 3).
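To make the reduction concrete, the short Python sketch below counts the hypotheses each model has to sum over, using the set sizes stated earlier (four goals, 24 color-to-exit assignments, two world-beliefs). The color names and the reading of TG as summing over both world-beliefs are assumptions for illustration.

```python
from itertools import permutations

goals = ["red", "green", "blue", "yellow"]        # color names are hypothetical
goal_beliefs = list(permutations(goals))          # assignments of colors to exits
world_beliefs = ["true_layout", "freespace"]

# Number of latent combinations each model enumerates per prediction.
sizes = {
    "Full BToM": len(goals) * len(goal_beliefs) * len(world_beliefs),
    "TW": len(goals) * len(goal_beliefs),         # world-belief clamped
    "TG": len(goals) * len(world_beliefs),        # goal-belief clamped
    "TWG": len(goals),                            # both beliefs clamped
}

for name, size in sizes.items():
    print(f"{name}: {size} hypotheses")
# Full BToM: 192, TW: 96, TG: 8, TWG: 4
```

Even in this small maze setting, clamping both beliefs shrinks the sum from 192 terms to 4, and the gap widens quickly as more exits or belief variables are added.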
4. The Switching Approach and Surprise-Driven Adaptation
The switching approach dynamically selects between specialized models by monitoring prediction error (“surprise”) over observed actions:
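- Maintains a current model $M$.
- After each action $a_t$, computes a surprise value $S_t$, with two options:
  - $S_t = -\log P(a_t \mid a_{1:t-1}, M)$ (self-information)
  - $S_t = D_{\mathrm{KL}}\big(P(\cdot \mid a_{1:t}, M) \,\|\, P(\cdot \mid a_{1:t-1}, M)\big)$, the divergence between the beliefs over mental states after and before observing $a_t$ (Itti–Baldi style)
- The cumulative surprise $\sum_t S_t$ is compared to a threshold $\theta$; exceeding $\theta$ triggers reevaluation of all specialized models’ cumulative surprise on the full action history, switching to the one with minimal total surprise.
- The threshold $\theta$ is increased after each switch to prevent flip-flopping.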
Pseudocode for the switching protocol is given in Pöppel et al. (2019, Section 4). This mechanism avoids full Bayesian inversion while maintaining responsiveness to changes in uncertainty by adapting model complexity to observed behavior.
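The following Python sketch shows one way the surprise-driven switching loop could look. It is not the pseudocode from the paper. The model interface (a callable mapping the action history to a next-action distribution), the class name `SwitchingPredictor`, the multiplicative threshold growth, and the reset of the running surprise after a reevaluation are all assumptions made for illustration; only the self-information variant of surprise is shown.

```python
import math


def self_information(model, history, action):
    """Surprise of `action` under `model`, given the previous actions."""
    return -math.log(model(history)[action])


def cumulative_surprise(model, actions):
    """Total self-information of a whole action sequence under `model`."""
    return sum(self_information(model, actions[:t], a)
               for t, a in enumerate(actions))


class SwitchingPredictor:
    def __init__(self, models, start, threshold, growth):
        self.models = models        # name -> callable(history) -> {action: prob}
        self.current = start        # name of the model currently in use
        self.threshold = threshold
        self.growth = growth        # assumed: multiplicative increase per switch
        self.history = []
        self.running = 0.0          # surprise accumulated since last reevaluation

    def observe(self, action):
        model = self.models[self.current]
        self.running += self_information(model, self.history, action)
        self.history.append(action)
        if self.running > self.threshold:
            # Reevaluate every specialized model on the full action history.
            totals = {name: cumulative_surprise(m, self.history)
                      for name, m in self.models.items()}
            best = min(totals, key=totals.get)
            if best != self.current:
                self.current = best
                self.threshold *= self.growth   # damp flip-flopping
            self.running = 0.0                  # assumed reset after reevaluation
        return self.current


# Two hypothetical stand-ins for specialized models with fixed predictions.
models = {
    "prefers_left": lambda history: {"left": 0.9, "right": 0.1},
    "prefers_right": lambda history: {"left": 0.1, "right": 0.9},
}
predictor = SwitchingPredictor(models, start="prefers_left",
                               threshold=3.0, growth=2.0)
trace = [predictor.observe(a) for a in ["right", "right", "right"]]
print(trace, predictor.threshold)
```

In this toy run the first unexpected action is tolerated, the second pushes the running surprise past the threshold, and the predictor switches to the model that explains the whole history best. Full reevaluation happens only when the cheap running check fails, which is where the runtime savings come from.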
5. Computational Complexity and Predictive Performance
Empirical analysis demonstrates substantial computational gains for satisficing models, particularly the switching approach, with corresponding performance:
| Model | Mean Runtime (ms/action) | Rel. Runtime to TWG | Avg. Neg. Log-Likelihood 0 |
|---|---|---|---|
| TWG | 1 | 2 | 3 |
| TG | 4 | 5 | 6 |
| TW | 7 | 8 | 9 |
| Switching | 0 | 1 | 2 |
| Full BToM | 3 | 4 | 5 |
The switching approach achieves the lowest average negative log-likelihood of 6 across all 687 observed trajectories, outperforming the full model, with runtime an order of magnitude faster than the full Bayesian ToM. In head-to-head comparisons, the switching model outperforms the full model on 7 of trajectories, TG and TW on 8 each, and TWG on 9 ((Pöppel et al., 2019), Section 8).
6. Uncertainty Scenarios and Human Behavioral Correlates
Empirical validation relies on a human study with 110 participants solving maze tasks under three uncertainty conditions:
- No Uncertainty (NU): Agent knows the maze layout and correct exit location; true beliefs about goal and world.
- Destination Uncertainty (DU): Maze layout known, but exit color-location unknown until sighted; uncertainty over the goal-belief $B_G$.
- Path Uncertainty (PU): Exit known, but layout beyond a local radius hidden; uncertainty over the world-belief $B_W$.
Model fit (average negative log-likelihood; lower is better) in each scenario:
| Condition | TWG | TW | TG | Switching | Full |
|---|---|---|---|---|---|
| NU | 0.59 | 0.62 | 0.63 | 0.59 | 0.60 |
| DU | 0.80 | 0.65 | 0.61 | 0.61 | 0.68 |
| PU | 1.12 | 0.91 | 0.74 | 0.73 | 1.08 |
The switching model consistently achieves best or near-best fit, while maintaining efficient runtime.
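The "best or near-best" claim can be checked directly against the per-condition table. The Python sketch below does so, assuming the scores are average negative log-likelihoods so that lower values indicate better fit; the variable names are illustrative.

```python
# Per-condition model fit, copied from the table above (lower is better).
fit = {
    "NU": {"TWG": 0.59, "TW": 0.62, "TG": 0.63, "Switching": 0.59, "Full": 0.60},
    "DU": {"TWG": 0.80, "TW": 0.65, "TG": 0.61, "Switching": 0.61, "Full": 0.68},
    "PU": {"TWG": 1.12, "TW": 0.91, "TG": 0.74, "Switching": 0.73, "Full": 1.08},
}

for condition, scores in fit.items():
    best_score = min(scores.values())
    best_models = [m for m, s in scores.items() if s == best_score]
    gap = scores["Switching"] - best_score
    print(f"{condition}: best = {best_models}, switching gap = {gap:.2f}")

# Gap between the switching model and the best score in each condition.
gaps = {c: s["Switching"] - min(s.values()) for c, s in fit.items()}
```

At the two-decimal precision reported, the switching model ties TWG under NU, ties TG under DU, and is the single best model under PU.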
Participants’ trajectories exhibited path optimality within 20% of the shortest possible path at rates of 3 (NU), 4 (DU), and 5 (PU), with step count over optimal at 6, 7, and 8 excess, respectively ((Pöppel et al., 2019), Section 7).
7. Satisficing Outcome and Implications
The empirical results indicate that satisficing mentalizing via the switching approach provides a compromise solution: it yields predictive accuracy closely matching that of the most specialized models for each uncertainty regime, adapts automatically to shifting uncertainty, and avoids the exponential computational burden of full-state Bayesian inference. This suggests that both human mentalizing and practical artificial systems may benefit from modular, adaptive inference architectures that leverage situation-dependent model selection without sacrificing responsiveness. The strong performance of the switching approach underpins its potential as a satisficing strategy for efficient, context-sensitive social reasoning (Pöppel et al., 2019).