SeaEvo: Advancing Algorithm Discovery with Strategy Space Evolution

Published 27 Apr 2026 in cs.CL, cs.AI, and cs.NE | (2604.24372v1)

Abstract: LLM-guided evolutionary search has emerged as a promising paradigm for automated algorithm discovery, yet most systems track search progress primarily through executable programs and scalar fitness. Even when natural-language reflection is used, it is often used locally in mutation prompts or stored without an explicit population-level organization of strategic directions. As a result, evolutionary search can struggle to distinguish syntactically different implementations of the same idea, preserve lower-fitness but strategically promising directions, or detect when an entire family of strategies has saturated. We introduce SeaEvo, a modular strategy-space layer that elevates natural-language strategy descriptions from transient prompt context to first-class population-level evolutionary state in LLM-driven program search. SeaEvo augments each candidate program with an explicit natural language strategy description and uses this representation in three ways: Strategy Articulation turns mutation into a diagnose-direct-implement process; Stratified Experience Retrieval organizes the archive into strategy clusters and selects inspirations by behavioral complementarity; and Strategic Landscape Navigation periodically summarizes effective, saturated, and underexplored strategy families to guide future mutations. Across mathematical algorithm discovery, systems optimization, and agent-scaffold benchmarks, SeaEvo improves the underlying evolutionary backbones in most settings, with particularly large gains (21% relative improvement) on open-ended system optimization tasks. These results suggest that persistent strategy representations provide a practical mechanism for improving the robustness and efficiency of LLM-guided evolutionary search, suggesting a path toward compound AI systems that accumulate algorithmic knowledge over time.


Authors (10)

Summary

The paper introduces SeaEvo, which uses persistent natural-language strategy descriptions to enhance evolutionary search.
The methodology integrates strategy articulation, stratified experience retrieval, and strategic landscape navigation to drive innovation.
Empirical results show SeaEvo outperforms baselines with up to 66% improvement in best scores and lower cumulative API costs.

SeaEvo: Toward Robust Algorithm Discovery via Persistent Strategy-Space Evolution

Motivation and Problem Statement

LLM-driven evolutionary search has rapidly matured as a paradigm for automated algorithm discovery, yielding state-of-the-art results in mathematical invention, engineering optimization, and agentic task design. Traditional frameworks predominantly monitor search progress through executable candidate programs and scalar fitness signals. They therefore conflate distinct algorithmic strategies with mere code variants and fail to preserve strategically promising but initially low-fitness directions. This results in three persistent failures: ambiguity between syntactic variants of the same idea, premature suppression of weak but strategically diverse candidates, and an inability to detect when an entire family of strategies has saturated.

SeaEvo addresses this representational gap by instantiating persistent, population-level strategy descriptions—making semantic strategy space a first-class search object in evolutionary loops guided by LLMs. By explicitly embedding and organizing natural-language descriptions of each candidate's algorithmic intent and integrating them into mutation, retrieval, and landscape navigation, SeaEvo seeks to transform evolutionary archives from flat collections of code into navigable maps of evolving algorithmic innovation.

SeaEvo Architecture and Methodology

SeaEvo is a modular layer designed for extensibility, augmenting any LLM-driven evolutionary search framework. The cornerstone of SeaEvo is its dual-space archive: each candidate is represented by executable code, scalar fitness, and a persistent natural-language strategy description. SeaEvo comprises three coordinated modules:

Strategy Articulation (SA): Transforms mutation into a structured, explicit diagnose–direct–implement sequence. Each new candidate receives an LLM-generated diagnosis of prior failure modes, an explicit strategy direction informed by both local program context and landscape-level guidance, and finally, code realization of that strategy. The strategy description, not transient reflection, is embedded and archived, enabling downstream organization and reuse.
Stratified Experience Retrieval (SER): Clusters archive entries in semantic strategy space using k-means on embedded descriptions, enabling selection of behaviorally complementary inspirations. For tasks with instance-level evaluation, candidates are prioritized based on the Hamming diversity of their behavioral vectors relative to the parent; for scalar-only tasks, cluster diversity and random cross-cluster sampling are used. Importantly, only strategy descriptions and metrics are passed as context, reducing API cost and prompting the LLM to operate at the level of high-level algorithmic concepts.
Strategic Landscape Navigation (SLN): Periodically (every $\Delta$ generations) summarizes the population-level distribution of strategy families—identifying which directions are effective, saturated, and underexplored. SLN's guidance is fed into SA as constraints, steering mutations toward less-explored but plausible directions, and away from families where progress has plateaued.
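
The retrieval step described under SER can be pictured with a small sketch. It is illustrative only and is not the authors' implementation: the `ArchiveEntry` record, the `select_inspirations` and `build_context` helpers, the tie-break on fitness, and the "one entry per foreign cluster" rule are all assumptions, and cluster ids are taken as given (for example, from k-means on embedded strategy descriptions) rather than computed here.

```python
from dataclasses import dataclass
import random


@dataclass
class ArchiveEntry:
    """Hypothetical archive record: code, scalar fitness, strategy text."""
    code: str
    fitness: float
    strategy: str          # persistent natural-language strategy description
    cluster: int           # strategy-cluster id, assumed to be precomputed
    behavior: tuple = ()   # per-instance outcome vector; empty if scalar-only


def hamming(a, b):
    """Count positions where two equal-length behavior vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def select_inspirations(parent, archive, k=2, rng=random):
    """Pick up to k inspirations that complement the parent."""
    candidates = [e for e in archive if e is not parent]
    if parent.behavior:
        # Instance-level evaluation: rank by Hamming distance to the parent's
        # behavior vector (fitness as an assumed tie-break), largest first.
        ranked = sorted(
            candidates,
            key=lambda e: (hamming(e.behavior, parent.behavior), e.fitness),
            reverse=True,
        )
        return ranked[:k]
    # Scalar-only task: random cross-cluster sampling, at most one entry
    # per cluster other than the parent's own.
    others = [e for e in candidates if e.cluster != parent.cluster]
    rng.shuffle(others)
    picked, seen = [], set()
    for e in others:
        if e.cluster not in seen:
            picked.append(e)
            seen.add(e.cluster)
    return picked[:k]


def build_context(inspirations):
    """Format only strategy text and metrics for the prompt, never code."""
    return "\n".join(
        f"- {e.strategy} (fitness={e.fitness:.3f})" for e in inspirations
    )
```

Passing only `strategy` and `fitness` into the prompt mirrors the point above that code is left out of the retrieved context, which keeps prompts short.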

All modules are governed by an $\varepsilon$-greedy exploration ratio, ensuring stochastic fallback to base mutation methods and preventing premature overconstraint by the strategy representation.
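
How the exploration ratio and the SLN interval might interact in a loop can be sketched as below. This is a minimal sketch under stated assumptions: it treats $\varepsilon$ as the probability of falling back to the backbone's base mutation, refreshes the landscape summary on every positive multiple of $\Delta$, and uses hypothetical function names (`choose_mutation_mode`, `should_refresh_landscape`, `run_schedule`) that do not come from the source.

```python
import random


def choose_mutation_mode(epsilon, rng=random):
    """Return 'base' with probability epsilon, otherwise 'strategy'.

    Assumption: epsilon is the chance of bypassing the strategy layer.
    """
    return "base" if rng.random() < epsilon else "strategy"


def should_refresh_landscape(generation, delta):
    """True on generations that are positive multiples of delta."""
    return generation > 0 and generation % delta == 0


def run_schedule(num_generations, epsilon, delta, seed=0):
    """Log (generation, mutation mode, landscape refreshed?) per generation."""
    rng = random.Random(seed)
    log = []
    for g in range(num_generations):
        refresh = should_refresh_landscape(g, delta)
        mode = choose_mutation_mode(epsilon, rng)
        log.append((g, mode, refresh))
    return log
```

With `epsilon=0.0` every mutation goes through the strategy layer, and with `epsilon=1.0` the loop reduces to the unmodified backbone, which is one way to read the fallback behaviour described above.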

Figure 1: SeaEvo architecture illustrating the separation and integration of SA, SER, and SLN in the evolutionary loop.

Empirical Evaluation and Ablations

SeaEvo was evaluated across a suite of mathematical optimization and system engineering tasks (Circle Packing, Heilbronn Triangles, MinMax Distance, Prism, TXN, EPLB, LLM-SQL) and agentic scaffold benchmarks (XSTest). Baseline comparisons included GEPA, OpenEvolve, ShinkaEvolve, and plug-in overlays on AdaEvolve backbones.

Summary of results:

SeaEvo outperformed its baselines on average and best fitness in nearly all task and backbone configurations.
Gains were most pronounced on open-ended system optimization tasks; e.g., on Prism, SeaEvo achieved a 66% improvement in Best score and a near-3× improvement in peak performance compared to its base backbone.
Convergence to SOTA solutions occurred earlier and at lower cumulative API cost.

Figure 2: Search trajectories on Circle Packing benchmarks showing faster solution discovery and reduced cumulative API cost for SeaEvo-augmented methods.

Ablation studies confirmed that while each module alone delivers moderate improvement, their synergy is crucial: SLN provides a corrective signal that keeps structured mutation and diversity retrieval from locking onto strategy families that have already saturated. Hyperparameter sweeps verified robustness to cluster count and SLN interval, with balanced exploration/exploitation ratios yielding the fastest convergence and the most stable fitness.
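
To make the corrective signal concrete, the sketch below labels strategy families as effective, saturated, or underexplored. The source describes SLN as producing a periodic summary and does not specify a numeric rule, so a simple heuristic stands in here: `label_families`, `min_count`, `patience`, and `tol` are all hypothetical, and higher fitness is assumed to be better.

```python
from collections import defaultdict


def label_families(history, min_count=3, patience=3, tol=1e-9):
    """Label each strategy family from its fitness history.

    history: list of (cluster_id, fitness) pairs in creation order.
    A family is 'underexplored' with fewer than min_count entries,
    'saturated' if its last `patience` entries did not beat its earlier
    best, and 'effective' otherwise.
    """
    by_family = defaultdict(list)
    for cluster_id, fitness in history:
        by_family[cluster_id].append(fitness)

    labels = {}
    for cluster_id, scores in by_family.items():
        if len(scores) < min_count:
            labels[cluster_id] = "underexplored"
            continue
        # Best score before the recent window; -inf if no earlier entries.
        earlier = scores[:-patience]
        best_before = max(earlier) if earlier else float("-inf")
        best_recent = max(scores[-patience:])
        if best_recent <= best_before + tol:
            labels[cluster_id] = "saturated"
        else:
            labels[cluster_id] = "effective"
    return labels
```

Labels like these could then be turned into the constraints that steer mutation away from plateaued families and toward sparsely sampled ones.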

Visualization and Strategic Space Dynamics

To visualize the effect of SeaEvo on the search trajectory, strategy embeddings were projected via t-SNE. SeaEvo candidates consistently occupied novel regions of strategy space not densely explored by base evolutionary search. On tasks such as EPLB and TXN, high-scoring programs clustered together, indicating algorithmic innovation at the level of coherent strategy families rather than isolated syntactic mutants.

Figure 3: Strategy embedding spaces for EPLB and TXN, showing SeaEvo candidates forming dense, high-performing clusters in previously unoccupied strategic regions.

Case studies further corroborate SeaEvo's ability to drive structural breakthroughs. In Prism, landscape guidance enabled a transition from local-search plateaus to large-neighborhood destroy-and-repair algorithms, resulting in a near-3× jump in performance. In EPLB, progressive vectorization and global load balancing emerged via SLN-driven mutation, with performance improvements realized at each architectural refinement.

Figure 4: Strategy evolution timeline for EPLB demonstrating strategic shifts and corresponding fitness gains during SeaEvo search.

Figure 5: Strategy evolution timeline for Prism, illustrating breakthrough resulting from landscape-guided transition to destroy-and-repair algorithms.

Plug-and-Play Generality

SeaEvo's architecture is backbone-agnostic and plug-and-play. Augmentation of GEPA and AdaEvolve with SeaEvo delivered consistent improvement without architectural changes, demonstrating that the strategy-space layer can generalize across diverse evolutionary search frameworks.

Practical and Theoretical Implications

SeaEvo introduces persistent, semantic-level representations to algorithmic search, enabling both robust exploitation of promising directions and exploration of diverse strategy families. This persistent organization endows AI-driven evolution with compound knowledge accumulation, facilitating search in combinatorial landscapes where local fitness is a poor proxy for global progress. Practically, SeaEvo reduces API cost, accelerates convergence, and enhances robustness by preventing premature convergence and stagnant search dynamics.

Theoretically, SeaEvo calls attention to the role of explicit semantic representations in combinatorial algorithm discovery—a model that could inform future research on memory architectures, search-state abstractions, and long-horizon meta-learning for compound AI systems. As LLM-guided search becomes increasingly agentic and open-ended, the structuring of strategy space may be critical for the emergence of truly innovative, persistent algorithmic knowledge.

Conclusion

SeaEvo establishes a persistent, strategy-space-centric evolutionary search framework that outperforms fitness-driven baselines and augments multiple backbones. Its modular design and plug-in generality, combined with substantive empirical evidence, suggest that semantic and behavioral structuring of search archives is essential for robust, efficient algorithm discovery. Future directions include scaling to more complex domains, enriching the semantic clustering mechanism, and integrating long-horizon strategic memory into compound AI agent architectures.
