
Modeling the Formation of Social Conventions from Embodied Real-Time Interactions (1802.06108v3)

Published 16 Feb 2018 in cs.MA, cs.AI, cs.GT, q-bio.NC, and stat.ML

Abstract: What is the role of real-time control and learning in the formation of social conventions? To answer this question, we propose a computational model that matches human behavioral data in a social decision-making game that was analyzed both in discrete-time and continuous-time setups. Furthermore, unlike previous approaches, our model takes into account the role of sensorimotor control loops in embodied decision-making scenarios. For this purpose, we introduce the Control-based Reinforcement Learning (CRL) model. CRL is grounded in the Distributed Adaptive Control (DAC) theory of mind and brain, where low-level sensorimotor control is modulated through perceptual and behavioral learning in a layered structure. CRL follows these principles by implementing a feedback control loop handling the agent's reactive behaviors (pre-wired reflexes), along with an adaptive layer that uses reinforcement learning to maximize long-term reward. We test our model in a multi-agent game-theoretic task in which coordination must be achieved to find an optimal solution. We show that CRL is able to reach human-level performance on standard game-theoretic metrics such as efficiency in acquiring rewards and fairness in reward distribution.

Authors (5)
  1. Ismael T. Freire (6 papers)
  2. Xerxes D. Arsiwalla (30 papers)
  3. Paul Verschure (10 papers)
  4. Clement Moulin-Frier (3 papers)
  5. Marti Sanchez-Fibla (2 papers)
Citations (13)

Summary

Overview of Control-Based Reinforcement Learning in Social Convention Formation

The paper "Modeling the Formation of Social Conventions from Embodied Real-Time Interactions" presents an approach to understanding the formation of social conventions through a computational model that integrates real-time control and learning mechanisms. The paper introduces the Control-based Reinforcement Learning (CRL) model, which is grounded in the Distributed Adaptive Control (DAC) theory of mind and brain. This model emphasizes the importance of sensorimotor control loops in social decision-making, proposing that these loops are crucial for bridging low-level reactive behaviors and higher-level strategic learning to achieve coordination in a multi-agent setup.

Methodology and Key Components

The CRL model is tested within the framework of a multi-agent coordination game, notably the "Battle of the Exes," which simulates decision-making processes requiring both cooperation and competition. The method diverges from classical discrete-time approaches by employing continuous-time interactions, thereby allowing agents to react more dynamically to one another. The CRL model incorporates two distinct layers:

  1. Reactive Layer: This layer is engineered to handle inherent sensorimotor tasks using predefined behaviors like reward seeking and collision avoidance. This is analogous to instinctual responses observed in biological entities, providing a foundational mechanism that bootstraps higher-level learning processes. The reactive layer essentially manages immediate reactions to environmental stimuli, crucial for within-round conflict resolution.
  2. Adaptive Layer: Situated above the reactive layer, the adaptive component employs a model-free reinforcement learning algorithm. Specifically, an Actor-Critic Temporal Difference Learning approach is utilized, facilitating strategic learning across multiple interactions. This layer adjusts actions based on learned policies to maximize long-term rewards, essential for forming sustained social conventions and facilitating inter-agent cooperation.
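The interplay between the two layers can be sketched as a reactive reflex that can override a learned policy, with an actor-critic temporal-difference update driving the adaptive layer. This is a minimal illustrative sketch, not the paper's implementation: the state discretization, learning rates, and the `collision_imminent` reflex trigger are all assumptions introduced here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS = 4, 2          # hypothetical discretization of the game state
ALPHA_ACTOR, ALPHA_CRITIC, GAMMA = 0.1, 0.2, 0.95

theta = np.zeros((N_STATES, N_ACTIONS))  # actor: action preferences per state
V = np.zeros(N_STATES)                   # critic: estimated state values

def softmax(prefs):
    """Convert action preferences into a probability distribution."""
    e = np.exp(prefs - prefs.max())
    return e / e.sum()

def reactive_layer(observation):
    """Pre-wired reflex: override the learned policy when, e.g., a collision
    is imminent. Returns an action, or None to defer to the adaptive layer."""
    if observation.get("collision_imminent"):
        return observation["avoid_action"]
    return None

def select_action(state, observation):
    """Reactive override first; otherwise sample from the learned policy."""
    override = reactive_layer(observation)
    if override is not None:
        return override
    return rng.choice(N_ACTIONS, p=softmax(theta[state]))

def td_update(state, action, reward, next_state):
    """Actor-critic temporal-difference update on one transition."""
    delta = reward + GAMMA * V[next_state] - V[state]  # TD error
    V[state] += ALPHA_CRITIC * delta                   # critic moves toward target
    theta[state, action] += ALPHA_ACTOR * delta        # actor reinforces the action
    return delta
```

The key design point mirrored here is that the reflex path short-circuits learning entirely, so within-round conflicts are resolved immediately, while the slower actor-critic loop shapes behavior across rounds.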

Results and Human-Level Performance

Through simulations, the CRL model demonstrated human-like proficiency on metrics such as efficiency, fairness, and stability, particularly in the dynamic version of the game. In the continuous-time condition, the model matched human performance on efficiency and fairness, indicating that real-time interaction substantially improves coordination outcomes. The results underscore the advantage of continuous information flow and dynamic responsiveness in reaching stable social conventions, in contrast with the slower adaptation seen in discrete-time paradigms.
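The game-theoretic metrics above can be quantified in several ways; the sketch below shows one common formulation (efficiency as the fraction of the maximum joint reward obtained, fairness as the ratio between the worse-off and better-off agent's totals). These exact formulas are assumptions for illustration and may differ from the paper's definitions.

```python
def efficiency(rewards_a, rewards_b, max_joint_per_round):
    """Fraction of the maximum achievable joint reward actually earned
    over a sequence of rounds (1.0 = fully efficient coordination)."""
    earned = sum(rewards_a) + sum(rewards_b)
    possible = max_joint_per_round * len(rewards_a)
    return earned / possible

def fairness(rewards_a, rewards_b):
    """Ratio of the worse-off agent's cumulative reward to the better-off
    agent's (1.0 = perfectly balanced, 0.0 = one agent gets everything)."""
    totals = sorted([sum(rewards_a), sum(rewards_b)])
    return totals[0] / totals[1] if totals[1] else 1.0
```

For example, two agents who alternate winning a high-payoff round score near-maximal on both measures, which is exactly the kind of turn-taking convention the game rewards.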

Implications and Future Directions

The implications of this research extend both practically and theoretically within the field of artificial intelligence and social robotics. Practically, the ability to model and simulate complex social behavior in machines could improve interactive AI technologies, enhancing human-robot interaction environments. Theoretically, this paper enriches the understanding of how simple reactive mechanisms can coalesce with higher cognitive processes to form sophisticated social behaviors.

Future explorations could delve into expanding the model to incorporate social reasoning elements, such as intent prediction or deeper learning of social cues. Additionally, incorporating memory systems could better emulate human-like learning by facilitating the generalization of learned conventions across varied contexts. This could culminate in more versatile AI capable of seamless adaptation in complex social environments.

Conclusion

The paper provides a robust framework for understanding the emergence of social conventions through interconnected reactive and adaptive processes. The proposed CRL model is a significant step toward embodied AI systems capable of nuanced social interaction, paving the way for advances in how machines understand and emulate human-like social behavior.
