Wanting to be Understood (2504.06611v2)

Published 9 Apr 2025 in cs.LG, cs.AI, and cs.CL

Abstract: This paper explores an intrinsic motivation for mutual awareness, hypothesizing that humans possess a fundamental drive to understand and to be understood even in the absence of extrinsic rewards. Through simulations of the perceptual crossing paradigm, we explore the effect of various internal reward functions in reinforcement learning agents. The drive to understand is implemented as an active inference type artificial curiosity reward, whereas the drive to be understood is implemented through intrinsic rewards for imitation, influence/impressionability, and sub-reaction time anticipation of the other. Results indicate that while artificial curiosity alone does not lead to a preference for social interaction, rewards emphasizing reciprocal understanding successfully drive agents to prioritize interaction. We demonstrate that this intrinsic motivation can facilitate cooperation in tasks where only one agent receives extrinsic reward for the behaviour of the other.

Summary

  • The paper demonstrates that intrinsic reward mechanisms based on imitation and mutual influence markedly enhance coordinated agent behavior compared to artificial curiosity alone.
  • The study uses reinforcement learning (PPO with LSTM-based policies) to simulate social interaction in the Perceptual Crossing Paradigm under a range of intrinsic reward functions.
  • Results show that these intrinsic social drives enable effective cooperation even in asymmetric reward scenarios, suggesting a key role in bootstrapping human-like interaction.

This paper investigates the hypothesis that humans possess an intrinsic motivation not only to understand others but also to be understood by them, even without external rewards. This drive is proposed as a key factor in developing primary intersubjectivity (basic dyadic interaction) and, eventually, secondary intersubjectivity (interaction involving shared objects/context) and shared intentionality. The authors argue that existing intrinsic motivations like artificial curiosity (a drive to understand) are insufficient to explain the strong human preference for social interaction.

To test this, the researchers simulate the Perceptual Crossing Paradigm (PCP) using reinforcement learning (RL) agents. In the PCP, two agents move on a 1D line and perceive a "buzz" when crossing each other, a private stationary object, or the other agent's "shadow" (which only gives a buzz to the crossing agent). The agents use Proximal Policy Optimization (PPO) with LSTM-based policy networks. Each agent also has an auxiliary LSTM predictor trained offline to predict upcoming crossing events.
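
To make the setup concrete, the following is a minimal sketch of a PCP-style environment. It assumes a ring-shaped 1D space, binary "buzz" sensing, and illustrative sizes and offsets; the class and parameter names (`PerceptualCrossingEnv`, `sensor_radius`, `shadow_offset`) are ours, not the paper's.

```python
import numpy as np

class PerceptualCrossingEnv:
    """Minimal sketch of the Perceptual Crossing Paradigm (PCP).

    Two agents move in a 1D space (modelled here as a ring). Each agent
    senses a binary "buzz" when it overlaps the other agent, its own
    private stationary object, or the other agent's shadow; the shadow
    is felt only by the crossing agent, never by the shadow's owner.
    The space length, sensor radius, and shadow offset are illustrative
    assumptions, not values from the paper.
    """

    def __init__(self, length=600.0, sensor_radius=4.0, shadow_offset=150.0):
        self.length = length
        self.radius = sensor_radius
        self.shadow_offset = shadow_offset
        self.reset()

    def reset(self):
        self.pos = np.random.uniform(0, self.length, size=2)  # agent positions
        self.objects = np.array([100.0, 400.0])               # each agent's private static object
        return self._observe()

    def _near(self, a, b):
        d = abs(a - b) % self.length
        return min(d, self.length - d) < self.radius

    def _observe(self):
        obs = np.zeros(2)
        for i in (0, 1):
            j = 1 - i
            shadow_j = (self.pos[j] + self.shadow_offset) % self.length
            buzz = (self._near(self.pos[i], self.pos[j])         # crossing the other agent
                    or self._near(self.pos[i], self.objects[i])  # crossing own private object
                    or self._near(self.pos[i], shadow_j))        # crossing the other's shadow
            obs[i] = float(buzz)
        return obs

    def step(self, velocities):
        # velocities: two signed speeds chosen by the agents' policies
        self.pos = (self.pos + np.asarray(velocities)) % self.length
        return self._observe()
```

In such a setup, each agent's policy would receive its own buzz stream (with history handled by the LSTM) and output a velocity, while the auxiliary predictor would be trained on the same stream to forecast upcoming crossings.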

Several intrinsic reward functions are compared:

  1. Drive to Understand (Artificial Curiosity): Rewards agents for making correct predictions when uncertain and incorrect predictions (surprises) when certain. The goal is to seek out informative, surprising situations (a minimal sketch of this reward follows the list).
  2. Drive to Be Understood (Implemented in three ways):
    • Imitation and Being Imitated: Agents are rewarded for observing a pattern of crossings passively and then actively replicating it (Imitation Reward), or for actively creating a pattern that is subsequently observed being replicated by the other (Promoting Imitation Reward). Patterns are compared over specific windows (5 steps against 5 steps within a 10-step history); see the second sketch after this list.
    • Influence and Impressionability: Rewards maximizing the mutual information (MI) between an agent's recent actions and the other's subsequent observations (Influence), plus the MI between the agent's observations of the other and its own subsequent actions (Impressionability). MI is calculated over rolling buffers of recent history chunks, using Dirichlet priors and focusing on the change in MI over time (see the third sketch after this list). This encourages agents to both affect and be affected by the other. A variant testing only the "Influence" drive is also explored.
    • Sub-Reaction Time Anticipation: The Influence and Impressionability reward is used, but a 2-timestep delay is introduced between observation and action for both agents. Successful coordination implies anticipation faster than simple reaction.
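
The artificial curiosity reward can be sketched as follows, assuming a binary crossing predictor with output probability `p_crossing`. The exact functional form is an illustrative assumption, chosen only to reward correct predictions made under uncertainty and surprising (incorrect) predictions made under certainty.

```python
def curiosity_reward(p_crossing, crossed):
    """Sketch of the 'drive to understand' (artificial curiosity) reward.

    p_crossing: predictor's probability that a crossing is about to occur.
    crossed:    1 if the crossing actually happened, else 0.
    This scoring rule is an assumption, not the paper's exact formula.
    """
    confidence = 2.0 * abs(p_crossing - 0.5)  # 0 = maximally uncertain, 1 = fully certain
    correct = float((p_crossing > 0.5) == bool(crossed))
    # Reward being right while uncertain, or being surprised while certain.
    return correct * (1.0 - confidence) + (1.0 - correct) * confidence
```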
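
A heavily simplified reading of the imitation reward is sketched below: the last ten per-step crossing indicators are split into an older, passively observed half and a newer, actively produced half, and the reward is their overlap. The matching rule and the meaning of "pattern" are our assumptions; the promoting-imitation variant would swap which half is credited to self versus other.

```python
import numpy as np

def imitation_reward(crossing_history, window=5):
    """Sketch of the imitation reward over a 10-step history (2 * window).

    crossing_history: per-step crossing indicators, oldest first; the older
    half is treated as the passively observed pattern, the newer half as the
    agent's own actively produced pattern. Illustrative assumption only.
    """
    h = np.asarray(crossing_history[-2 * window:])
    observed, replicated = h[:window], h[window:]
    return float(np.mean(observed == replicated))
```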
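
The influence/impressionability drive can be sketched as a change-in-mutual-information tracker over discretized history chunks. The symbol alphabet, buffer length, and Dirichlet pseudo-count below are illustrative assumptions; the paper's exact estimator may differ.

```python
import numpy as np
from collections import deque

class MutualInfoDrive:
    """Sketch of one MI term (influence OR impressionability).

    For influence, x is a chunk of the agent's recent actions and y the
    other's subsequent observations; for impressionability, x is the
    agent's observations of the other and y its own subsequent actions.
    The reward is the change in the MI estimate as new chunks arrive.
    """

    def __init__(self, n_symbols=4, buffer_len=200, alpha=1.0):
        self.n = n_symbols
        self.alpha = alpha                      # symmetric Dirichlet pseudo-counts
        self.buffer = deque(maxlen=buffer_len)  # rolling buffer of (x, y) chunk pairs
        self.prev_mi = 0.0

    def _mi(self):
        # Posterior-mean joint distribution under the Dirichlet prior.
        counts = np.full((self.n, self.n), self.alpha)
        for x, y in self.buffer:
            counts[x, y] += 1.0
        p = counts / counts.sum()
        px = p.sum(axis=1, keepdims=True)
        py = p.sum(axis=0, keepdims=True)
        return float(np.sum(p * np.log(p / (px * py))))

    def reward(self, x_chunk, y_chunk):
        """x_chunk, y_chunk: discretized symbols (0..n_symbols-1) for one chunk."""
        self.buffer.append((x_chunk, y_chunk))
        mi = self._mi()
        r = mi - self.prev_mi                   # reward the change in MI over time
        self.prev_mi = mi
        return r
```

Under this sketch, each agent would run one tracker for influence and one for impressionability and sum the two change-in-MI rewards; the sub-reaction-time variant keeps the same reward but delays each agent's actions by two timesteps relative to its observations.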

An extrinsic task is also set up where only Agent0 receives an external reward based on Agent1's location (top or bottom half of the space, depending on a private signal given to Agent0). Agent1 receives no external reward. This tests if the intrinsic drives can facilitate cooperation in asymmetric reward scenarios.
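
A minimal sketch of this asymmetric extrinsic reward is shown below. The binary payoff, space length, and signal encoding are illustrative assumptions; only the asymmetry (Agent0 paid, Agent1 not) and the location criterion come from the paper.

```python
def extrinsic_rewards(agent1_pos, signal_to_agent0, length=600.0):
    """Only Agent0 is rewarded, and only when Agent1 sits in the half of the
    space named by Agent0's private signal (0 -> bottom half, 1 -> top half).
    Returns (reward_agent0, reward_agent1)."""
    in_top_half = agent1_pos >= length / 2.0
    target_met = in_top_half if signal_to_agent0 == 1 else not in_top_half
    return (1.0 if target_met else 0.0), 0.0
```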

Results:

  • Artificial Curiosity: Agents did not consistently prefer interacting with each other. They found sufficient surprise interacting with their own object or the other's shadow.
  • Imitation/Being Imitated: Agents strongly preferred interacting with each other, developing stereotyped turn-taking or dance-like patterns.
  • Influence/Impressionability: Agents strongly preferred self-other interaction. The patterns were less stereotyped than pure imitation. Removing the "Impressionability" reward did not reduce social interaction preference in the purely intrinsic setting, as agents learned to take turns influencing/being influenced.
  • Sub-Reaction Time Anticipation: Agents still preferred self-other interaction despite the action delay, learning coordinated movements.
  • Extrinsic Task: Agents with the combined Influence and Impressionability reward successfully cooperated. Agent0 learned to use intrinsic interaction rewards to guide Agent1 to the correct location. Without intrinsic rewards, or with only the "Influence" reward (lacking "Impressionability"), cooperation failed or was significantly impaired, especially in this asymmetric setup where Agent1 needed to be receptive to Agent0's signaling.

Conclusion:

The paper concludes that intrinsic motivations that specifically reward mutual understanding and coordination (such as imitation or mutual influence/impressionability) are effective at driving agents to prefer social interaction in the PCP, unlike artificial curiosity alone. These drives, reflecting a "wanting to be understood," depend on reciprocal interaction that is only possible with similarly motivated partners. Such an intrinsic drive can also bootstrap cooperation in tasks with asymmetric extrinsic rewards, allowing one agent to intrinsically incentivize another's behaviour for mutual (or one-sided) benefit. Future work includes generalizing policies, exploring more sophisticated MI calculations, testing with humans, and investigating secondary intersubjectivity.
