Balancing Cooperativeness and Adaptiveness in the (Noisy) Iterated Prisoner's Dilemma (2303.03519v1)
Abstract: Ever since Axelrod's seminal work, tournaments served as the main benchmark for evaluating strategies in the Iterated Prisoner's Dilemma (IPD). In this work, we first introduce a strategy for the IPD which outperforms previous tournament champions when evaluated against the 239 strategies in the Axelrod library, at noise levels in the IPD ranging from 0% to 10%. The basic idea behind our strategy is to start playing a version of tit-for-tat which forgives unprovoked defections if their rate is not significantly above the noise level, while building a (memory-1) model of the opponent; then switch to a strategy which is optimally adapted to the model of the opponent. We then argue that the above strategy (like other prominent strategies) lacks a couple of desirable properties which are not well tested for by tournaments, but which will be relevant in other contexts: we want our strategy to be self-cooperating, i.e., cooperate with a clone with high probability, even at high noise levels; and we want it to be cooperation-inducing, i.e., optimal play against it should entail cooperating with high probability. We show that we can guarantee these properties, at a modest cost in tournament performance, by reverting from the strategy adapted to the opponent to the forgiving tit-for-tat strategy under suitable conditions
- Adrian Hutter (18 papers)