Adaptive dynamics of alternating Prisoner's Dilemma with memory N
Published 26 Jan 2026 in math.DS | (2601.18671v1)
Abstract: The Prisoner's Dilemma is used as a model in processes involving reciprocity; however, its classical setup can be insufficient in settings where the symmetry of the simultaneous decision making is broken -- for example, in donor and recipient processes. In the alternating Prisoner's Dilemma model the two players take turns choosing their strategy. Assuming a finite memory setup, we establish the mathematical aspects of the adaptive dynamics of the alternating Prisoner's Dilemma, paying particular attention to the case of memory 1.
The paper introduces a recursive formulation of the Markov transition matrix and payoff vector for the alternating Prisoner’s Dilemma with finite memory.
It rigorously analyzes the memory-1 case, reducing a four-dimensional system to two-dimensional toroidal flows with explicit invariant quantities.
It establishes necessary conditions for equilibria, showing that dynamics are confined to quasi-periodic or degenerate attractors in sequential interactions.
Adaptive Dynamics of the Alternating Prisoner's Dilemma with Memory N
Introduction and Context
The alternating Prisoner's Dilemma (APD) generalizes the classical repeated Prisoner's Dilemma by introducing asymmetry: players alternate in making their choices, which captures a broad set of real-world reciprocity scenarios not accurately modeled by simultaneous action. Examples arise in biological donations, such as food sharing in vampire bats or alternating cooperative behaviors in various animal species. This framework incorporates finite memory, where each player's strategy is predicated on their recollection of previous interactions, parameterized by memory length N.
This work systematically develops the mathematical structure underlying the adaptive dynamics of the APD for finite memory, with detailed analysis for the nontrivial memory-1 case. It presents recursive constructions for both the Markov transition matrix and the corresponding payoff vector, clarifies system symmetries, and offers a geometric reduction of the associated dynamical system.
Mathematical Formulation
The population model assumes asexual reproduction with resident strategies parameterized by x, subject to invasion by mutant strategies y. The invasion fitness A(y,x) determines whether mutants can establish, with adaptive dynamics given by:
$$\dot{y} = \left.\frac{\partial A(y,x)}{\partial y}\right|_{y=x}$$
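As a rough numerical illustration of the selection gradient in the equation above, the sketch below approximates $\partial A(y,x)/\partial y$ at $y=x$ by central differences. The names `selection_gradient` and `toy_fitness`, the step size `h`, and the toy fitness function itself are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def selection_gradient(A, x, h=1e-6):
    """Central-difference estimate of dA(y, x)/dy evaluated at y = x."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        # Perturb only the mutant argument y; the resident x stays fixed.
        grad[i] = (A(x + step, x) - A(x - step, x)) / (2.0 * h)
    return grad

def toy_fitness(y, x):
    """Hypothetical invasion fitness with A(x, x) = 0 (illustration only)."""
    y, x = np.asarray(y, dtype=float), np.asarray(x, dtype=float)
    return float(np.sum((y - x) * (1.0 - x)))

# For this toy fitness the exact gradient at y = x is 1 - x.
g = selection_gradient(toy_fitness, [0.2, 0.7])
```

In practice `A` would be replaced by the stationary-payoff-based invasion fitness of the game under study; finite differences are used here only because they need no closed form.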
In this APD setting, strategies are encoded as vectors of conditional cooperation probabilities, $p, q \in [0,1]^{2N}$, responding to the distinct (asymmetric) memory structure induced by alternating moves. The state of the game is characterized by the joint history of actions over the last $N$ rounds, leading to a Markov chain with $2^{2N}$ states and transition matrix $M_N(p,q)$. The payoff function is the expected reward under the stationary distribution of this chain.
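To make the Markov-chain description concrete, here is a minimal sketch for memory 1, where the chain has four states. It rests on several assumptions that the text above does not fix: states are ordered CC, CD, DC, DD as (leader's last move, follower's last move); the follower's probabilities `q` are indexed from the follower's own viewpoint, after seeing the leader's fresh move; and payoffs follow a donation game with hypothetical benefit `b` and cost `c`. The paper's own matrix may use a different convention.

```python
import numpy as np

# States index the last pair of moves (leader, follower): CC, CD, DC, DD.
STATES = [(1, 1), (1, 0), (0, 1), (0, 0)]  # 1 = cooperate, 0 = defect

def transition_matrix(p, q):
    """4x4 chain for a memory-1 alternating game (assumed convention).

    p[i]: probability the leader cooperates in state i.
    q[j]: probability the follower cooperates, indexed from the follower's
          viewpoint (own last move, leader's NEW move).
    """
    M = np.zeros((4, 4))
    for i, (a, b) in enumerate(STATES):
        for a_new in (1, 0):
            pa = p[i] if a_new == 1 else 1.0 - p[i]
            # The follower sees its own last move b and the leader's fresh move.
            j = STATES.index((b, a_new))
            for b_new in (1, 0):
                pb = q[j] if b_new == 1 else 1.0 - q[j]
                M[i, STATES.index((a_new, b_new))] += pa * pb
    return M

def stationary(M):
    """Solve v M = v with sum(v) = 1 (assumes a unique stationary vector)."""
    n = M.shape[0]
    A = np.vstack([M.T - np.eye(n), np.ones(n)])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    v, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return v

def leader_payoff(p, q, b=1.0, c=0.3):
    """Expected per-round payoff to the leader in a donation game."""
    f = np.array([b - c, -c, b, 0.0])  # leader's payoff in CC, CD, DC, DD
    return float(stationary(transition_matrix(p, q)) @ f)
```

For example, `leader_payoff([1, 1, 1, 1], [1, 1, 1, 1])` describes two unconditional cooperators, for whom the chain settles in CC and the leader earns `b - c` per round.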
The core mathematical advances include:
Explicit recursive construction of $M_N(p,q)$ and $f_N$: The transition structure exploits the staggered information sets for leader (first mover) and follower (second mover), leading to a nontrivially block-structured matrix, distinct from the simultaneous case.
Determinantal expression for invasion fitness: $A(p,q)$ is written explicitly in terms of determinants involving modified versions of $M_N$ and the payoff vector $f_N$.
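The determinant formula itself is not reproduced here. One well-known identity of this kind, introduced by Press and Dyson for the simultaneous game, writes a stationary payoff as a ratio of two determinants: take $M - I$ and replace its last column by the payoff vector (numerator) or by a vector of ones (denominator). Whether the paper's expression has exactly this form is an assumption; the sketch below only checks the identity numerically on a random row-stochastic matrix, with hypothetical function names.

```python
import numpy as np

def payoff_via_determinants(M, f):
    """Stationary payoff as det(M - I, last col -> f) / det(M - I, last col -> 1)."""
    n = M.shape[0]
    base = M - np.eye(n)
    num = base.copy()
    num[:, -1] = f
    den = base.copy()
    den[:, -1] = 1.0
    return float(np.linalg.det(num) / np.linalg.det(den))

def payoff_via_stationary(M, f):
    """Reference value: left eigenvector of M for eigenvalue 1, dotted with f."""
    w, V = np.linalg.eig(M.T)
    v = np.real(V[:, np.argmin(np.abs(w - 1.0))])
    v = v / v.sum()
    return float(v @ f)

# Random strictly positive row-stochastic matrix (unique stationary vector).
rng = np.random.default_rng(0)
M = rng.random((4, 4)) + 0.1
M /= M.sum(axis=1, keepdims=True)
f = np.array([0.7, -0.3, 1.0, 0.0])  # arbitrary illustrative payoff vector

gap = abs(payoff_via_determinants(M, f) - payoff_via_stationary(M, f))
```

The identity relies on the chain having a unique stationary distribution; otherwise the denominator can vanish.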
Symmetry Structure
While the APD lacks the full suite of discrete $\mathbb{Z}_2 \times \mathbb{Z}_2$ symmetries present in the simultaneous game due to its directional asymmetry, a residual $\mathbb{Z}_2$ subgroup persists, represented by permutation matrices corresponding to flipping the meaning of cooperation/defection for either player. These symmetries manifest in the invariance properties of both the transition matrices and the adaptive dynamic vector field.
Memory-1 Case: Dynamical Reduction
A rigorous analytical and numerical investigation is conducted for the minimal memory $N=1$ case, revealing several surprising structural features.
Invariant Quantities: The flow admits two nontrivial invariants, $F_1=(p_1-1)^2+p_3^2$ and $F_2=(p_2-1)^2+p_4^2$, independent of the payoff parametrization.
Phase Space Foliation: The dynamics reduce from four dimensions to two, taking place on tori $\mathbb{T}^2$, the joint level sets of $F_1$ and $F_2$. Thus, the system is globally bounded, and the relevant dynamics can be analyzed via reduced angular variables on the torus.
Figure 1: Vector field on a torus for various values of $c$, $C_1$, and $C_2$. Red curves are zeros of the denominator of the reduced vector field equations; the embedded rectangle marks the image of the $[0,1]^4$ cube.
Equilibria: Analysis identifies a two-dimensional manifold of equilibria within $[0,1]^4$, with explicit algebraic and trigonometric characterizations. Detailed computation of the Jacobian reveals that these are generically degenerate saddles or sources (never sinks).
Angular Parameterization: Using the invariants, the system is reduced to angular variables $(\phi,\psi)$ on each torus, and vector field equations are derived for $(\phi,\psi)$ describing the dynamics, with careful accounting for the intersection with the valid probability region.
Equilibrium Existence: Necessary and sufficient conditions are derived for equilibria to intersect the physical region, yielding a maximum of four internal equilibria for generic parameter choices.
Degenerate Tori: In the limit $C_1 \to 0$ (or $C_2 \to 0$), the invariant tori collapse, producing slow-fast dynamics localized near circular trajectories; in this limit, motion becomes generically aperiodic.
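The angular reduction described in this section can be sketched as follows. Since $F_1$ and $F_2$ are each a squared distance from a fixed point, every level set is a product of two circles, and one natural choice of angles is $p_1 = 1 + r_1\cos\phi$, $p_3 = r_1\sin\phi$, $p_2 = 1 + r_2\cos\psi$, $p_4 = r_2\sin\psi$. This specific convention, the radii `r1`, `r2` (with $F_i = r_i^2$), and their relation to the constants $C_1$, $C_2$ are assumptions made for illustration; the paper's exact parameterization is not reproduced here.

```python
import numpy as np

def torus_point(phi, psi, r1, r2):
    """Map angles (phi, psi) to (p1, p2, p3, p4) on the torus F1 = r1^2, F2 = r2^2."""
    p1 = 1.0 + r1 * np.cos(phi)
    p3 = r1 * np.sin(phi)
    p2 = 1.0 + r2 * np.cos(psi)
    p4 = r2 * np.sin(psi)
    return np.array([p1, p2, p3, p4])

def invariants(p):
    """Evaluate F1 = (p1 - 1)^2 + p3^2 and F2 = (p2 - 1)^2 + p4^2."""
    p1, p2, p3, p4 = p
    return (p1 - 1.0) ** 2 + p3 ** 2, (p2 - 1.0) ** 2 + p4 ** 2

def in_unit_cube(p):
    """True when all four entries are valid probabilities."""
    return bool(np.all((p >= 0.0) & (p <= 1.0)))

# Both invariants depend only on the radii, not on the angles.
point = torus_point(2.0, 2.5, 0.5, 0.8)
F1, F2 = invariants(point)
```

Under this convention, and for radii at most 1, the points that are valid probability vectors are exactly those with both angles in $[\pi/2, \pi]$, which is one way to see why the image of the $[0,1]^4$ cube appears as a rectangle in the angular coordinates.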
Practical and Theoretical Implications
This paper illuminates qualitative and quantitative divergences between simultaneous and alternating repeated games with limited memory. The existence of explicit global invariants in the APD with memory 1, and the resulting global foliation into invariant tori, is an unexpected feature not shared with the simultaneous case. This geometric structure imposes severe constraints on the possible evolutionary dynamics: it rules out chaos and confines behavior in the generic memory-1 APD to quasi-periodic motion or degenerate (saddle/source) equilibria.
Theoretically, the recursive tools provided here extend to arbitrary finite memory, paving the way for more comprehensive analyses of direct reciprocity mechanisms in biological and engineered populations where interactions are sequential.
Practically, understanding the constrained dynamics of populations using finite-memory, alternating strategies may inform the design of robust decentralized systems and the interpretation of empirical cooperation data in biological and socioeconomic networks.
Conclusion
The adaptive dynamics of the alternating Prisoner's Dilemma with memory $N$ are governed by a rich mathematical framework that is structurally distinct from the simultaneous-action case. The explicit recursive constructions of the transition matrix and payoff vector, the analysis of admissible symmetries, and, crucially, the global reduction of the memory-1 dynamics to toroidal flows with algebraically tractable equilibria collectively provide a foundation for analyzing sequential direct reciprocity under realistic cognitive constraints. The approach and results are extensible to more complicated evolutionary scenarios, with future work likely to focus on the high-memory limit and corresponding emergent behaviors.
Reference:
"Adaptive dynamics of alternating Prisoner's Dilemma with memory N" (2601.18671)