Inequity Aversion as a Mechanism for Cooperative Behavior in Intertemporal Social Dilemmas
The paper examines the role of inequity aversion in fostering cooperation in intertemporal social dilemmas, extending classical models of social preferences to multi-agent learning dynamics. It introduces a framework that incorporates inequity aversion into multi-agent reinforcement learning (MARL) by adapting Fehr and Schmidt's model to Markov games, a sequential and more realistic setting than the matrix games in which social dilemmas are traditionally analyzed. The approach is contrasted with accounts based on pure rational choice and with behavioral-economics models developed for simpler, one-shot interactions.
The research focuses on two behavioral responses to inequity: advantageous inequity aversion, akin to guilt at being better off than others, and disadvantageous inequity aversion, akin to envy at being worse off. These are studied in partially observable Markov games involving multiple learning agents. The authors identify two key obstacles to cooperation in this setting: the collective-action problem, since reaching socially optimal equilibria requires coordinated effort, and temporal credit assignment, since agents must connect short-term actions to long-term consequences.
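For reference, the classical Fehr-Schmidt utility for N players takes the following standard one-shot form (the paper's Markov-game adaptation replaces the static payoffs x_i with per-timestep quantities, such as temporally smoothed rewards):

U_i(x) = x_i - \frac{\alpha_i}{N-1} \sum_{j \neq i} \max(x_j - x_i, 0) - \frac{\beta_i}{N-1} \sum_{j \neq i} \max(x_i - x_j, 0)

Here the \alpha_i term penalizes disadvantageous inequity (envy, triggered when others earn more) and the \beta_i term penalizes advantageous inequity (guilt, triggered when the agent earns more than others).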
Two games serve as test beds, both structured as partially observable Markov games: Cleanup, a public goods dilemma, and Harvest, a commons dilemma. In Cleanup, agents must spend effort maintaining a shared resource (keeping an aquifer clean so that apples continue to regrow) even though only apple consumption yields reward; in Harvest, agents must moderate their consumption so that the apple stock regrows sustainably. Both games are designed to expose the tension between individual and collective incentives that characterizes intertemporal social dilemmas.
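To make the intertemporal tension concrete, the following toy simulation sketches a single-pool commons in the spirit of Harvest. It is a hypothetical illustration, not the paper's gridworld environment, and all parameters (stock size, regrowth rate, harvest probabilities) are assumptions chosen only to show how greedy harvesting collapses long-run returns that restrained harvesting would sustain.

```python
import random

def simulate(n_agents=5, harvest_prob=1.0, steps=200, seed=0):
    """Toy commons: each step, every agent tries to take one apple with
    probability harvest_prob; the stock regrows in proportion to what
    remains, so a depleted pool barely recovers."""
    rng = random.Random(seed)
    apples, capacity = 20.0, 20.0
    total_reward = 0
    for _ in range(steps):
        for _ in range(n_agents):
            if apples >= 1.0 and rng.random() < harvest_prob:
                apples -= 1.0
                total_reward += 1                        # +1 extrinsic reward per apple
        apples = min(capacity, apples + 0.1 * apples)    # stock-dependent regrowth
    return total_reward

print("greedy    :", simulate(harvest_prob=1.0))  # stock collapses within a few steps
print("restrained:", simulate(harvest_prob=0.2))  # harvesting remains sustainable
```

Each individual agent still earns more in the short run by harvesting immediately, which is exactly the incentive structure that makes the dilemma hard.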
The experiments show that the two forms of inequity aversion differ in effectiveness across game types. Advantageous inequity aversion promotes cooperation especially in public goods settings like Cleanup, where agents must actively contribute to a shared resource. This social preference improves temporal credit assignment by supplying an intrinsic reward signal aligned with long-term group benefit. However, such guilt-prone agents must make up a large fraction of the population for cooperative dynamics to emerge.
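The intrinsic-reward mechanism can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the function name, array layout, use of temporally smoothed rewards as the comparison quantity, and the default alpha and beta values are all assumptions.

```python
import numpy as np

def subjective_rewards(extrinsic, smoothed, alpha=5.0, beta=0.05):
    """Apply Fehr-Schmidt-style envy/guilt penalties to each agent's reward.

    extrinsic: shape (N,) extrinsic rewards at the current timestep
    smoothed:  shape (N,) temporally smoothed rewards used to compare agents
    alpha:     weight on disadvantageous inequity (envy)
    beta:      weight on advantageous inequity (guilt)
    """
    n = len(extrinsic)
    diff = smoothed[None, :] - smoothed[:, None]   # diff[i, j] = e_j - e_i
    envy = np.maximum(diff, 0.0).sum(axis=1)       # how much others out-earn agent i
    guilt = np.maximum(-diff, 0.0).sum(axis=1)     # how much agent i out-earns others
    return extrinsic - (alpha * envy + beta * guilt) / (n - 1)

# A defector (agent 0) grabs extra reward: guilt slightly reduces its own
# subjective reward, while envy heavily penalizes the disadvantaged agents;
# lowering the defector's future returns (e.g. via punishment) would shrink
# that envy penalty.
r = np.array([3.0, 0.0, 0.0])
e = np.array([2.5, 0.2, 0.2])
print(subjective_rewards(r, e))
```

Shaping the reward in this way lets short-sighted individual learning pick up on outcomes that matter only over longer horizons, which is how it helps with temporal credit assignment.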
Disadvantageous inequity aversion, by contrast, is most effective in the Harvest game, where it motivates the punishment of defectors. Timely punishment by aggrieved agents effectively reshapes the game's payoff structure and discourages free-riding. Notably, these agents need not constitute a majority of the population for cooperation to emerge, suggesting that the mechanism is well suited to environments where prosocial norms can be enforced.
Despite these promising results, the paper notes limitations, including the potential exploitation of inequity-averse agents and the practical difficulty of choosing an appropriate population mix in heterogeneous settings. Evolving intrinsic rewards and hybrid strategies that yield more robust cooperation in stochastic or large-population games are suggested as directions for future research.
The research provides empirical evidence and theoretical justification for inequity aversion as a means of enhancing cooperation. The findings suggest that integrating social preferences into reinforcement learning algorithms can make AI more effective in multi-agent interactions by harmonizing individual strategies with group-level success. The work bridges human behavioral studies and simulated multi-agent environments, deepening our understanding of how complex cooperative behavior can arise in artificial systems and bringing their behavior closer to human conduct in social dilemmas. It thus both extends the foundational understanding of cooperative mechanisms and proposes a potentially scalable framework for real-world applications in distributed artificial intelligence and beyond.