Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
94 tokens/sec
Gemini 2.5 Pro Premium
55 tokens/sec
GPT-5 Medium
18 tokens/sec
GPT-5 High Premium
24 tokens/sec
GPT-4o
103 tokens/sec
DeepSeek R1 via Azure Premium
93 tokens/sec
GPT OSS 120B via Groq Premium
462 tokens/sec
Kimi K2 via Groq Premium
254 tokens/sec
2000 character limit reached

Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas (2305.00768v1)

Published 1 May 2023 in cs.MA and stat.ML

Abstract: In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others. In reinforcement learning, SVO has been instantiated as an intrinsic motivation that remaps an agent's rewards based on particular target distributions of group reward. Prior studies show that groups of agents endowed with heterogeneous SVO learn diverse policies in settings that resemble the incentive structure of Prisoner's dilemma. Our work extends this body of results and demonstrates that (1) heterogeneous SVO leads to meaningfully diverse policies across a range of incentive structures in sequential social dilemmas, as measured by task-specific diversity metrics; and (2) learning a best response to such policy diversity leads to better zero-shot generalization in some situations. We show that these best-response agents learn policies that are conditioned on their co-players, which we posit is the reason for improved zero-shot generalization results.

Citations (2)

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Follow-up Questions

We haven't generated follow-up questions for this paper yet.