Generational Adversarial MAP-Elites (GAME)
- The paper introduces a coevolutionary framework that integrates MAP-Elites with adversarial quality-diversity, fostering arms race dynamics between opposing agents.
- It employs tournament-informed task selection and advanced metrics (e.g., win rate, ELO, AQD-Score) to overcome limitations of standard QD methods.
- Empirical results in domains like Pong and multi-agent games confirm significant improvements in both behavioral diversity and high-quality strategy evolution.
Generational Adversarial MAP-Elites (GAME) designates a family of quality-diversity (QD) algorithms in which MAP-Elites or its variants are embedded within an alternating, multi-generational adversarial coevolutionary framework. GAME extends classical QD illumination to adversarial domains—settings with two opposing sides where both the fitness and behavioral descriptor are defined by the specific interaction between paired agents—by iteratively coevolving both sides through generations of QD illumination, thereby supporting arms-race dynamics and the emergence of diverse, high-quality strategies. The most recent advances in GAME emphasize tournament-informed task selection and robust adversarial QD evaluation metrics, enabling fair comparison and improvement in both quality and diversity relative to behavioral or random clustering methods (Anne et al., 27 Jan 2026, Anne et al., 10 May 2025).
1. Adversarial QD Problem Formulation
GAME is designed for domains where two search spaces, X and Y (one per side), interact adversarially, coupled by the following mappings:
- Joint fitness function: f : X × Y → R, defined on interacting pairs (x, y) and constrained to be antagonistic (e.g., f(x, y) = −f(y, x) in symmetric zero-sum settings).
- Behavior descriptor: b : X × Y → R^n, derived from the interaction (e.g., via video-trace embedding), constitutes an n-dimensional vector characterizing duel-level behaviors.
This coupling precludes direct application of conventional QD approaches—fitness and behavior are both contingent on the specific pairing, thus QD archives for one side cannot be meaningfully constructed without fixing the adversarial counterpart (Anne et al., 27 Jan 2026).
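This pair-dependent coupling can be made concrete with a minimal interface sketch (a toy stand-in, not the paper's API: the bilinear-free antagonistic payoff and the descriptor construction are illustrative assumptions in place of a full game rollout and video embedding):

```python
import numpy as np

def joint_fitness(x: np.ndarray, y: np.ndarray) -> float:
    """Toy antagonistic fitness standing in for a full game rollout.

    Satisfies f(x, y) = -f(y, x), the zero-sum coupling assumed here:
    fitness is only defined for a *pair* of opposing solutions.
    """
    return float(x.sum() - y.sum())

def behavior_descriptor(x: np.ndarray, y: np.ndarray, n: int = 4) -> np.ndarray:
    """Toy n-dimensional descriptor of the interaction (pair-dependent).

    In GAME this role is played by an embedding of the duel's video
    trace; here we just mix both solutions deterministically.
    """
    pair = np.concatenate([x, y])
    return np.resize(np.cumsum(pair), n)
```

Because both functions take the pair (x, y), an archive for one side is only meaningful once the opposing side's solutions are fixed, which is exactly what the generational loop below enforces.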
2. GAME Algorithm: Generational, Alternating, and Quality-Diversity
GAME adopts a generational two-level coevolutionary loop:
- At each generation g, one side (e.g., Red if g is odd, Blue otherwise) is evolved while the other side's solutions from the prior generation act as fixed "tasks".
- Inner loop (illumination): for the evolving side, a multi-task, multi-behavior MAP-Elites (MTMB-ME) instance is instantiated, with one archive per opponent task, each archive filled up to a fixed cell budget using a growing centroid-based discretization.
- Task selection: after each generation, a new set of elites (tasks) is selected for the next generation, a step that is critical for the evolutionary dynamics.
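The growing centroid-based discretization used in the inner loop can be sketched as a nearest-centroid archive with a cell cap (a simplified sketch; the distance threshold, parameter names, and growth rule are assumptions for illustration, not the paper's exact mechanism):

```python
import numpy as np

class GrowingCentroidArchive:
    """MAP-Elites-style archive whose cells are centroids in behavior space.

    A new centroid is added when a descriptor lies far from all existing
    ones, up to `max_cells`; each cell keeps only its best (elite) solution.
    """

    def __init__(self, max_cells: int = 25, radius: float = 1.0):
        self.max_cells = max_cells
        self.radius = radius
        self.centroids: list[np.ndarray] = []
        self.elites: list[tuple[object, float]] = []  # (solution, fitness) per cell

    def add(self, solution, descriptor: np.ndarray, fitness: float) -> None:
        if self.centroids:
            dists = [np.linalg.norm(descriptor - c) for c in self.centroids]
            i = int(np.argmin(dists))
            if dists[i] <= self.radius or len(self.centroids) >= self.max_cells:
                # Compete for the nearest existing cell.
                if fitness > self.elites[i][1]:
                    self.elites[i] = (solution, fitness)
                return
        # Grow: descriptor is far from all centroids and capacity remains.
        self.centroids.append(descriptor.copy())
        self.elites.append((solution, fitness))
```

One such archive is maintained per opponent task, so the same solution can be an elite against one task while being discarded against another.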
Pseudocode Schema (simplified)
For g = 1...G:
    If g is odd:
        Evolve S_red against current task set T (from S_blue)
    Else:
        Evolve S_blue against current task set T (from S_red)
    Select new task set T for next gen via Tasks_Selection
Return final task archive(s)
This high-level strategy fosters ongoing arms races, open-ended exploration, and diverse adversarial behaviors (Anne et al., 10 May 2025).
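A minimal runnable version of the alternating loop is sketched below. The trivial "evolution" step, scalar solutions, and the greedy task-selection placeholder are illustrative assumptions; in GAME each `evolve` call is a full MTMB-MAP-Elites illumination run and `tasks_selection` is one of the strategies discussed in the next section:

```python
import random

def evolve(population, tasks):
    """Placeholder inner loop: one mutation per (solution, task) pairing."""
    return [s + random.gauss(0, 0.1) for s in population for _ in tasks][: len(population)]

def tasks_selection(population, k):
    """Placeholder: keep the k highest-valued solutions as next tasks."""
    return sorted(population, reverse=True)[:k]

def game_loop(n_generations=6, pop_size=8, n_tasks=3, seed=0):
    random.seed(seed)
    red = [random.random() for _ in range(pop_size)]
    blue = [random.random() for _ in range(pop_size)]
    tasks = tasks_selection(blue, n_tasks)   # Blue elites seed the first task set
    for g in range(1, n_generations + 1):
        if g % 2 == 1:                       # odd generation: evolve Red vs Blue tasks
            red = evolve(red, tasks)
            tasks = tasks_selection(red, n_tasks)
        else:                                # even generation: evolve Blue vs Red tasks
            blue = evolve(blue, tasks)
            tasks = tasks_selection(blue, n_tasks)
    return red, blue, tasks
```

Note how the task set T always comes from the side that was evolved most recently, which is what couples the two sides into an arms race.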
3. Behavioral vs. Tournament-Informed Task Selection
The originally proposed GAME algorithm selects new tasks per generation by clustering all elites' behavior vectors (across all tasks, ignoring their task of origin) into a fixed number of clusters and choosing the top-fitness elite from each. However, this approach exhibits critical limitations:
- Task dependency: Behaviors are inherently task-dependent in adversarial settings; cross-task aggregation is semantically inconsistent.
- Selection bias: The process is prone to overselecting solutions from "easy" tasks (i.e., those yielding high fitness without offering significant challenge or diversity).
- Omission of adversarial outcomes: Task selection ignores cross-side adversarial performance, undermining the algorithm's ability to sustain arms race dynamics.
Tournament-informed selection remedies these deficiencies by leveraging full adversarial evaluation:
- Ranking-based: each candidate elite is evaluated against the previous-generation tasks; the resulting fitness vectors are transformed into ranking vectors and clustered (K-means), and the elite with the maximal average fitness is selected from each cluster.
- Pareto-front-based: each candidate elite's fitness vector versus the prior-generation tasks is treated as a multi-objective vector, and non-dominated solutions are chosen (NSGA-III).
Tournament-based selection thus ensures that task sets for each generation are both challenging and diverse, directly incorporating multi-task adversarial performance into selection (Anne et al., 27 Jan 2026).
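The ranking-based variant can be sketched from a cross-evaluation fitness matrix as follows (numpy-only, with a deliberately tiny k-means; the exact ranking convention, clustering setup, and tie-breaking in the paper may differ):

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means returning one cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2), axis=1
        )
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

def ranking_based_selection(fitness_matrix, k, seed=0):
    """Select k task indices from a (candidates x prior-tasks) fitness matrix.

    Each candidate's fitness vector is turned into a per-task ranking
    vector (rank 0 = best on that task); candidates are clustered in
    ranking space, and the candidate with the highest mean fitness is
    chosen from each non-empty cluster.
    """
    F = np.asarray(fitness_matrix, dtype=float)
    ranks = np.argsort(np.argsort(-F, axis=0), axis=0).astype(float)
    labels = kmeans(ranks, k, seed=seed)
    chosen = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size:
            chosen.append(int(members[np.argmax(F[members].mean(axis=1))]))
    return chosen
```

Clustering in ranking space rather than raw fitness space is what removes the bias toward candidates that happened to face easy tasks.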
4. Adversarial Quality-Diversity Measures
Standard QD metrics are insufficient for adversarial domains due to side dependencies. GAME introduces six principled, tournament-based metrics suitable for comparing solution sets:
- Win Rate: mean success rate of a solution set's members against all solutions on the opposing side.
- ELO Score: maximum normalized ELO obtained from a round-robin tournament over all paired solutions.
- Robustness: for each solution, the minimal fitness across all opponents; the maximum of these worst-case values over the set is reported.
- Coverage: fraction of behavior-ranking clusters occupied by members of the solution set.
- Expertise: for each opponent, the maximal fitness achievable by the solution set; the minimum across opponents is reported (worst-case best response).
- AQD-Score: cardinality of the minimal counter-set drawn from the solution set that ensures every opponent is defeated (i.e., each opponent loses to at least one member of the counter-set).
This multi-faceted set of metrics facilitates fair, side-invariant comparison and captures diverse aspects of adversarial QD, such as strength, lack of exploitable weaknesses, and breadth of explored behaviors (Anne et al., 27 Jan 2026).
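Two of these metrics are easy to sketch directly from a cross-side fitness matrix. The sketch below assumes F[i, j] > 0 means row solution i defeats column opponent j, and approximates the minimal counter-set greedily (the paper's AQD-Score may use an exact set-cover computation rather than this greedy upper bound):

```python
import numpy as np

def win_rate(F):
    """Mean success rate of row-side solutions over all column-side opponents."""
    return float((np.asarray(F) > 0).mean())

def greedy_counter_set(F):
    """Greedy upper bound on the AQD-Score counter-set.

    Repeatedly pick the row solution defeating the most still-undefeated
    opponents until every column opponent loses to some chosen solution.
    Returns chosen row indices, or None if some opponent is unbeatable.
    """
    wins = np.asarray(F) > 0
    uncovered = np.ones(wins.shape[1], dtype=bool)
    chosen = []
    while uncovered.any():
        gains = (wins & uncovered).sum(axis=1)
        best = int(np.argmax(gains))
        if gains[best] == 0:
            return None  # some opponent defeats every solution in the set
        chosen.append(best)
        uncovered &= ~wins[best]
    return chosen
```

A smaller counter-set indicates a stronger, less exploitable solution set, since fewer members suffice to answer every opposing strategy.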
5. Implementation, Algorithmic Details, and Domains
All major GAME studies use high-dimensional neural or tree-structured controllers and adopt modern archive and evaluation machinery:
- Controller architectures: for continuous-control or visually driven domains, MLPs (e.g., two hidden layers of 32 and 16 units) or behavior trees (with structured discrete variation: deletion, insertion, crossover, mutation).
- Behavior embedding: CLIP-based vision encodings of rollout (video) traces serve as behavior descriptors; a fixed number of frames per duel is encoded and aggregated into a single n-dimensional behavior vector.
- Task/archive parameters: up to roughly 100 tasks per generation, up to 25 cells per archive, and up to 20 generations, with a fixed per-generation evaluation budget.
- Experimental domains: Pong (symmetric Atari-like), Cat-and-mouse (“Homicidal Chauffeur”), Pursuers-and-evaders, and Parabellum multi-agent games (symmetric armies with behavior-tree controllers).
Tournament-informed selection incurs significant extra evaluation cost per generation, since every candidate elite must be evaluated against the full prior-generation task set. Random and behavior-only selection require no additional evaluations but yield suboptimal adversarial engagement (Anne et al., 27 Jan 2026, Anne et al., 10 May 2025).
6. Empirical Results and Evolutionary Dynamics
Across all tested domains, tournament-informed task selection (Ranking-based and Pareto-front-based) outperforms behavior-only and random selection on all relevant adversarial QD metrics (Win Rate, ELO, Expertise, AQD-Score), with statistically significant improvements in almost every scenario (Holm–Bonferroni-corrected tests). While Coverage is sometimes marginally higher for Random/Behavior-only selection, this reflects "spread" in behavior space occupied by easy or low-fitness solutions, not meaningful adversarial coverage. Robustness varies by domain, being near zero in fully symmetric environments such as Pong and slightly higher in more asymmetric games.
Observed evolutionary dynamics include:
- Emergent arms races: Evolving strategies and counter-strategies, evidenced by shifting elite behaviors and response to adversarial pressure.
- Open-endedness: Starting generations from scratch (“no bootstrap”) increases novelty but reduces long-term quality and coverage.
- Role of neutral mutations: Explicitly preserving or pruning neutral mutations affects access to stepping-stone behaviors and ultimate archive quality.
- High-dimensional descriptors: Use of VEM/CLIP embeddings for behavior descriptors enhances archive coverage and reduces variance, obviating the need for handcrafted features.
A plausible implication is that tournament-informed, generational adversarial QD coevolution provides a robust paradigm for open-ended discovery in adversarial environments, capturing both high quality and diversity under adversarial constraints (Anne et al., 27 Jan 2026, Anne et al., 10 May 2025).
7. Connections, Limitations, and Extensions
GAME generalizes several lines of research: it extends classical MAP-Elites (as found in generative and design applications (Fontaine et al., 2020)) to adversarial and multi-agent settings, and complements ideas from coevolutionary algorithms by integrating explicit QD objectives and multi-archive structure. Notably, real-world applications, especially in asymmetric or high-dimensional adversarial domains, may require adaptive or asymmetric task selection methods. Future research could explore larger policy classes, more complex adversarial objectives, and further diagnostics on archive structure and extinction dynamics.
References:
- "Tournament Informed Adversarial Quality Diversity" (Anne et al., 27 Jan 2026)
- "Adversarial Coevolutionary Illumination with Generational Adversarial MAP-Elites" (Anne et al., 10 May 2025)
- "Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network" (Fontaine et al., 2020)