Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
80 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Prosocial learning agents solve generalized Stag Hunts better than selfish ones (1709.02865v2)

Published 8 Sep 2017 in cs.AI and cs.GT

Abstract: Deep reinforcement learning has become an important paradigm for constructing agents that can enter complex multi-agent situations and improve their policies through experience. One commonly used technique is reactive training - applying standard RL methods while treating other agents as a part of the learner's environment. It is known that in general-sum games reactive training can lead groups of agents to converge to inefficient outcomes. We focus on one such class of environments: Stag Hunt games. Here agents either choose a risky cooperative policy (which leads to high payoffs if both choose it but low payoffs to an agent who attempts it alone) or a safe one (which leads to a safe payoff no matter what). We ask how we can change the learning rule of a single agent to improve its outcomes in Stag Hunts that include other reactive learners. We extend existing work on reward-shaping in multi-agent reinforcement learning and show that that making a single agent prosocial, that is, making them care about the rewards of their partners can increase the probability that groups converge to good outcomes. Thus, even if we control a single agent in a group making that agent prosocial can increase our agent's long-run payoff. We show experimentally that this result carries over to a variety of more complex environments with Stag Hunt-like dynamics including ones where agents must learn from raw input pixels.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (2)
  1. Alexander Peysakhovich (22 papers)
  2. Adam Lerer (30 papers)
Citations (104)