Evolving Many Worlds: Towards Open-Ended Discovery in Petri Dish NCA via Population-Based Training

Published 13 Apr 2026 in cs.NE, cs.AI, and cs.MA | (2604.11248v1)

Abstract: The generation of sustained, open-ended complexity from local interactions remains a fundamental challenge in artificial life. Differentiable multi-agent systems, such as Petri Dish Neural Cellular Automata (PD-NCA), exhibit rich self-organization driven purely by spatial competition; however, they are highly sensitive to hyperparameters and frequently collapse into uninteresting patterns and dynamics, such as frozen equilibria or structureless noise. In this paper, we introduce PBT-NCA, a meta-evolutionary algorithm that evolves a population of PD-NCAs subject to a composite objective that rewards both historical behavioral novelty and contemporary visual diversity. Driven by this continuous evolutionary pressure, PBT-NCA spontaneously generates a plethora of emergent lifelike phenomena over extended horizons, a hallmark of true open-endedness. Strikingly, the substrate autonomously discovers diverse morphological survival and self-organization strategies. We observe highly regular, coordinated periodic waves; spore-like scattering where homogeneous groups eject cell-like clusters to colonize distant territories; and fluid, shape-shifting macro-structures that migrate across the substrate, maintaining stable outer boundaries that enclose highly active interiors. By actively penalizing monocultures and dead states, PBT-NCA sustains a state of effective complexity that is neither globally ordered nor globally random, operating persistently at the "edge of chaos".

Summary

  • The paper presents a novel population-based training framework for Petri Dish NCA that leverages composite novelty scores and exploit–explore cycles.
  • It demonstrates sustained emergence of lifelike behaviors, such as agentic competition and self-replication, through meta-evolutionary methods.
  • The framework outperforms fixed-parameter and random search baselines, suggesting significant advances in artificial life and computational evolution.

Toward Open-Ended Emergence: Population-Based Training for Petri Dish Neural Cellular Automata

Introduction

The paper "Evolving Many Worlds: Towards Open-Ended Discovery in Petri Dish NCA via Population-Based Training" (2604.11248) addresses the long-standing challenge of engineering open-ended complexity within artificial life systems. It introduces a Population-Based Training framework for Petri Dish Neural Cellular Automata (PBT-NCA), extending differentiable multi-agent cellular automata with a meta-evolutionary algorithm explicitly designed to sustain perpetual emergence of non-trivial behaviors. The work situates itself at the intersection of research in evolutionary biology, artificial life, and computational models of open-endedness, providing a framework that operates at the "edge of chaos"—the critical region where systems manifest high adaptability and continual structural innovation.

Methodology

Population-Based Training for PD-NCA

PBT-NCA orchestrates a population of differentiable multi-agent Petri Dish NCAs (PD-NCAs), each representing a world comprising several competitive neural cellular agents. Each agent (i.e., a neural network controlling a cellular "species") interacts on a global grid via local Moore neighborhood processing and an explicit mechanism for resource competition based on channel-wise attack and defense vectors. The system also introduces a static, random environment channel that serves as an evolutionary pressure against stagnation, preventing collapse into trivial fixed points.

Selection uses a composite novelty score with two components: an archive-based behavioral novelty term (computed from handcrafted ecological descriptors) and a population-level visual diversity term based on DINOv2 embeddings. The score has no fixed objective; it favors worlds that deviate most from both historical and current population behaviors. Exploit-explore cycles, inspired by Population-Based Training protocols, periodically introduce new individuals via elite selection, Lamarckian inheritance (stateful copying), hyperparameter crossover and mutation, and direct weight perturbation (Figure 1).

Figure 1: Meta-iterative PBT-NCA pipeline, illustrating elitist exploit-explore exchanges via archive-informed selection, hyperparameter crossover, and parametric mutation.

Composite Novelty Score

The behavioral descriptor encodes time-series statistics (mean occupancy, temporal variability, agent turnover, winner-map entropy, alive-mass change) reflecting key ecological properties such as multi-species coexistence, stability of territorial boundaries, and dynamic transitions. Archive novelty is operationalized via k-Nearest Neighbors Euclidean distances to descriptors in a FIFO archive.
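The archive novelty described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `knn_novelty`, the choice of averaging the k nearest distances, the value of `k`, the archive size, and the convention of returning 0.0 for an empty archive are all assumptions made for the example.

```python
from collections import deque

import numpy as np


def knn_novelty(descriptor, archive, k=5):
    """Mean Euclidean distance from `descriptor` to its k nearest archive entries.

    Averaging the k nearest distances is an assumed aggregation; the source
    only states that k-nearest-neighbor Euclidean distances are used.
    """
    if len(archive) == 0:
        return 0.0  # assumed convention for an empty archive
    points = np.asarray(list(archive), dtype=float)
    dists = np.linalg.norm(points - np.asarray(descriptor, dtype=float), axis=1)
    k = min(k, len(dists))
    return float(np.sort(dists)[:k].mean())


# A bounded deque gives FIFO behavior: the oldest descriptor is evicted first.
archive = deque(maxlen=3)  # hypothetical archive size
archive.extend([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
score = knn_novelty([0.0, 0.0], archive, k=2)  # distances are 0, 5, 10
```

With a bounded `deque`, appending a fourth descriptor silently drops the oldest one, so novelty is always measured against a sliding window of recent behaviors.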

In parallel, the system computes a visual diversity component by embedding timesteps of each world with DINOv2, yielding a median cosine distance to the population at each timestep. This component captures spatio-morphological divergence that is not reflected in low-dimensional ecological statistics.

Figure 2: Population-level evolution of composite score and the emergence of agentic competition dynamics captured over meta-iterations.
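The visual diversity component can be illustrated with a short sketch for a single timestep. It assumes the embeddings have already been computed (the DINOv2 forward pass is omitted) and that each world's score is the median cosine distance to every other world in the population; the function name `visual_diversity` is hypothetical.

```python
import numpy as np


def visual_diversity(embeddings):
    """Median cosine distance from each world to all other worlds.

    embeddings: array of shape (P, D), one embedding per world at one
    timestep, with P >= 2. Returns an array of shape (P,).
    """
    x = np.asarray(embeddings, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # unit-normalize rows
    cos_dist = 1.0 - x @ x.T                          # pairwise cosine distance
    p = len(x)
    # Drop the diagonal (distance of a world to itself) before the median.
    off_diag = cos_dist[~np.eye(p, dtype=bool)].reshape(p, p - 1)
    return np.median(off_diag, axis=1)


# Toy 2-D "embeddings": worlds 0 and 2 look identical, world 1 is orthogonal.
scores = visual_diversity([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

In this toy case the outlier world receives the highest score, which is the behavior a diversity term is meant to reward.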

Exploit–Explore Dynamics

Every meta-iteration concludes with evaluation, archiving, and replacement events: the lowest-fitness worlds are replaced by mutated copies of elite parents, with stochastic hyperparameter crossover and injected noise. This exploits temporal performance while maintaining a search pressure for continual discovery.
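One replacement step of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the authors' procedure: the world representation (a dict with float-valued `hparams` and a NumPy `weights` array), the uniform per-key crossover, the log-normal hyperparameter noise, and all default values are hypothetical.

```python
import copy

import numpy as np


def exploit_explore(population, scores, n_elite=2, n_replace=2, sigma=0.01, seed=0):
    """Replace the lowest-scoring worlds with mutated copies of elite parents.

    Assumes n_elite + n_replace <= len(population) so elites are never replaced.
    """
    rng = np.random.default_rng(seed)
    order = np.argsort(scores)      # ascending: worst worlds first
    losers = order[:n_replace]
    elites = order[-n_elite:]
    new_pop = list(population)
    for idx in losers:
        parent_a, parent_b = (population[i] for i in rng.choice(elites, size=2))
        child = copy.deepcopy(parent_a)  # stateful (Lamarckian) copy of a parent
        for key in child["hparams"]:
            if rng.random() < 0.5:       # uniform crossover per hyperparameter
                child["hparams"][key] = parent_b["hparams"][key]
            # Multiplicative noise keeps positive hyperparameters positive.
            child["hparams"][key] *= float(np.exp(rng.normal(0.0, 0.2)))
        # Direct weight perturbation with small Gaussian noise.
        child["weights"] = child["weights"] + rng.normal(0.0, sigma, child["weights"].shape)
        new_pop[idx] = child
    return new_pop


population = [{"hparams": {"lr": 1e-3}, "weights": np.zeros(4)} for _ in range(4)]
new_pop = exploit_explore(population, scores=[0.1, 0.9, 0.5, 0.8])
```

Deep-copying the parent before mutation leaves the elite worlds untouched, so a good configuration is never lost to its own offspring's noise.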

Emergent Dynamics and Empirical Results

Representative Open-Ended Phenomena

Empirical exploration with populations of 3–7 NCA agents reveals a persistent emergence of complex, lifelike worlds:

  • Agentic Competition: Novel agent interaction motifs, including moving macrostructures, projectile emission, trail-based locomotion, and self-replicating entities, are observed without explicit programmatic scaffolding.
  • Self-Replication: Distinct cell clusters ejected by macroscopic entities colonize new territories, mimicking spatially distributed replicators (Figure 3).

Figure 4: Morphological motifs—shooters, archipelago formation, and emergent ant-like locomotion—arising under competitive dynamics.

Figure 3: Decentralized self-replication and colonization by coordinated emission of cell clusters from macroscopic entities.

  • Complex Pattern Formation: On larger or extended hyperparameter spaces, the framework uncovers "digital gliders," directional information waves, and persistent regular domains, paralleling classic CA discoveries (Figures 10, 11).
  • Ecological Persistence: Multi-agent coexistence is sustained across the vast majority of rollout frames, with high species entropy and stable effective complexity metrics, confirming operation at the edge of chaos.

Figure 5: Sequential generative capacity for novel spatial entities (spirals, amoebas, archipelagos, "alien" motifs, gliders) over extended meta-iterations, exemplifying progressive open-endedness.

Figure 6: Discovery of geometric, circuit-like structures with computational potential, reminiscent of discrete CA gliders and spaceships.

Figure 7: Quantitative tracking of ecological persistence and effective complexity—systems remain in dynamically rich, nontrivial regimes.
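As a rough illustration of how species entropy could be tracked per frame, the sketch below computes the Shannon entropy of a winner map. It assumes the winner map is an integer grid holding the index of the dominant agent in each cell; the function name `species_entropy` and this representation are hypothetical and may differ from the paper's metric.

```python
import numpy as np


def species_entropy(winner_map, n_agents):
    """Shannon entropy (in nats) of agent occupancy in an integer winner map."""
    counts = np.bincount(np.ravel(winner_map), minlength=n_agents).astype(float)
    p = counts / counts.sum()
    p = p[p > 0]  # skip absent agents so log(0) never occurs
    return float(-(p * np.log(p)).sum())


# Four agents each holding one quarter of a 4x4 grid.
balanced = np.repeat(np.arange(4), 4).reshape(4, 4)
entropy = species_entropy(balanced, n_agents=4)
```

A monoculture scores zero and an even split among n agents scores log(n), so values near the upper bound indicate sustained coexistence.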

Quantitative Analysis

Long-term mean novelty and composite scores increase after initial transients, particularly as the population size and the number of agents grow (Figure 8, top). This suggests that greater system complexity supports sustained novelty accumulation. Notably, hyperparameter adaptation converges toward regimes with higher learning rates and smaller batch sizes, consistent with gradient noise acting as a mechanism that prevents collapse into equilibrium (Figure 8, bottom).

Comparison to Baselines

Baseline comparisons to fixed-parameter PD-NCA and random search configurations show markedly limited generative capacity. Fixed hyperparameters lead to high-entropy, structureless noise or monocultures, while random search occasionally finds cyclic local dynamics but rapidly converges to non-diverse equilibria. In contrast, PBT-NCA continuously discovers and maintains novel domains, confirming the necessity of population-based, novelty-driven meta-evolution for artificial open-endedness.

Implications and Future Directions

Mechanisms of Sustained Complexity

By leveraging competitive meta-evolution, nonstationary selection, and behavioral/visual novelty, PBT-NCA overcomes mode-collapse and triviality endemic to previous CA-based or gradient-based approaches. The method’s emergent regular domains, phase-separated macrostructures, and long-lived “digital organisms” suggest computational and biological phenomena—such as hypercycles, traveling localizations, and motility-induced phase separation—can be autonomously rediscovered and perpetually diversified.

The preservation of simultaneous order (spatial regularity, reproducibility) and structural variability directly parallels the criticality that supports computation in natural and artificial complex systems.

Theoretical and Practical Advances

  • Artificial Life Systems: PBT-NCA demonstrates sustained innovation without handcrafted objectives or reset interventions, supporting research into self-organizing complexity, evolutionary computation, and digital ecosystem modeling.
  • Computational Primitives: Discovery of mobile and interacting local structures opens direct avenues toward emergent symbolic manipulation and self-organized computational primitives.
  • Scalability and Transfer: The demonstrated scaling with number of agents and grid configurations suggests utility for large-scale, hardware-accelerated evolutionary discovery with co-evolving architectures, environments, and update rules.

Limitations

The current instantiation is limited to 2D differentiable substrates and may inherit anthropocentric bias from visual foundation models (DINOv2). Future work will target hardware-efficient frameworks (e.g., CAX), larger grid sizes, and evolutionary processes incorporating architecture/environment co-evolution and alternative novelty metrics.

Conclusion

PBT-NCA provides an extensible framework for open-ended discovery, positioning population-based meta-evolution as a practical engine for the continual emergence of lifelike, computational, and adaptive complexity in differentiable cellular substrates. By explicitly rewarding divergence from both history and contemporary population norms—and by leveraging scalable exploit-explore cycles—this method charts a concrete path toward constructing digital worlds that escape objective-driven stagnation and support indefinite innovation.
