CodeEvolve: An open source evolutionary coding agent for algorithm discovery and optimization (2510.14150v1)

Published 15 Oct 2025 in cs.AI, cs.LG, and cs.NE

Abstract: In this work, we introduce CodeEvolve, an open-source evolutionary coding agent that unites LLMs with genetic algorithms to solve complex computational problems. Our framework adapts powerful evolutionary concepts to the LLM domain, building upon recent methods for generalized scientific discovery. CodeEvolve employs an island-based genetic algorithm to maintain population diversity and increase throughput, introduces a novel inspiration-based crossover mechanism that leverages the LLM's context window to combine features from successful solutions, and implements meta-prompting strategies for dynamic exploration of the solution space. We conduct a rigorous evaluation of CodeEvolve on a subset of the mathematical benchmarks used to evaluate Google DeepMind's closed-source AlphaEvolve. Our findings show that our method surpasses AlphaEvolve's performance on several challenging problems. To foster collaboration and accelerate progress, we release our complete framework as an open-source repository.

Summary

The paper introduces CodeEvolve, an open-source evolutionary coding agent that utilizes island-based genetic algorithms and meta-prompting to drive algorithm discovery.
The methodology employs a weighted LLM ensemble and inspiration-based crossover, and surpasses AlphaEvolve on several geometric and optimization benchmarks.
Experimental results and ablation studies confirm that CodeEvolve’s integration of evolutionary operators and LLMs enables efficient, reproducible, and innovative code optimization.

CodeEvolve: An Open Source Evolutionary Coding Agent for Algorithm Discovery and Optimization

Introduction

LLMs are increasingly used beyond their traditional role in natural language processing, serving as tools for scientific and algorithmic discovery. CodeEvolve is an open-source framework that combines LLMs with genetic algorithms for the iterative optimization of code solutions. Drawing inspiration from Google DeepMind's AlphaEvolve, it integrates an island-based genetic algorithm with meta-prompting strategies, positioning it as an open-source alternative for automated algorithm discovery.

Figure 1: Overview of CodeEvolve.

Methodology

Core Components

CodeEvolve uses an island-based genetic algorithm in which multiple populations evolve independently and occasionally exchange their top-performing individuals. This setup helps maintain population diversity and allows solutions to be evaluated with high throughput.
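The island scheme can be illustrated with a minimal sketch. This is a toy under stated assumptions, not CodeEvolve's implementation: the function name `evolve_islands`, the ring migration topology, the tournament-of-two parent choice, and all parameter values are hypothetical, and individuals are plain numbers changed by a random `mutate` callback rather than programs rewritten by an LLM.

```python
import random

def evolve_islands(fitness, mutate, seeds, n_islands=4, epochs=20,
                   migrate_every=5, rng=None):
    """Toy island model: islands evolve independently and periodically
    pass a copy of their best individual to the next island in a ring."""
    rng = rng if rng is not None else random.Random(0)
    size = len(seeds)
    islands = [sorted(seeds, key=fitness, reverse=True) for _ in range(n_islands)]
    for epoch in range(1, epochs + 1):
        for pop in islands:
            # Independent step: mutate the better of two random members.
            parent = max(rng.sample(pop, k=min(2, size)), key=fitness)
            pop.append(mutate(parent, rng))
            pop.sort(key=fitness, reverse=True)
            del pop[size:]  # drop the worst to keep the island size fixed
        if epoch % migrate_every == 0:
            # Migration: snapshot each island's best, then overwrite the
            # worst member of the neighbouring island with that copy.
            best = [pop[0] for pop in islands]
            for i, pop in enumerate(islands):
                pop[-1] = best[(i - 1) % n_islands]
                pop.sort(key=fitness, reverse=True)
    return max((ind for pop in islands for ind in pop), key=fitness)

# Toy usage: maximise -(x - 3)^2 starting from a few seed values.
fitness = lambda x: -(x - 3.0) ** 2
mutate = lambda x, rng: x + rng.gauss(0.0, 0.5)
best = evolve_islands(fitness, mutate, seeds=[0.0, 1.0, -1.0, 0.5])
```

Because each island keeps its best member at every step, the returned individual is never worse than the best seed. In an LLM-driven setting the `mutate` callback would be replaced by a model call that proposes a modified program.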

Evolutionary Operators

The system combines two strategies: depth exploitation and meta-prompting exploration. Depth exploitation refines existing high-performing solutions: a rank-based selection mechanism chooses the solution to improve, and the LLM ensemble receives depth-wise context for incremental improvement. Exploration relies on meta-prompting, in which richly contextual prompts are generated to encourage the discovery of novel solutions.
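Rank-based selection can be sketched as follows. The exact weighting used by CodeEvolve is not stated here, so the 1/rank weights and the name `rank_select` are assumptions chosen only to show the idea: selection probability depends on a solution's position in the fitness ordering, not on the raw fitness value.

```python
import random

def rank_select(population, fitness, rng=None):
    """Pick one individual with probability proportional to 1/rank
    (rank 1 = best). The 1/rank weighting is an illustrative assumption."""
    rng = rng if rng is not None else random.Random()
    ranked = sorted(population, key=fitness, reverse=True)
    weights = [1.0 / rank for rank in range(1, len(ranked) + 1)]
    return rng.choices(ranked, weights=weights, k=1)[0]

# Toy usage: scores stand in for evaluated programs.
scores = [0.2, 0.9, 0.5, 0.7]
chosen = rank_select(scores, fitness=lambda s: s, rng=random.Random(0))
```

Using ranks instead of raw scores keeps selection pressure stable when fitness values differ by orders of magnitude between problems.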

Cross-Component Integration

A novel inspiration-based crossover mechanism enables semantic exchange between high-quality solutions. Instead of splicing code directly, as in conventional crossover, the combination is handled by the LLM itself, which uses its context window to merge features from successful solutions.
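One way to picture inspiration-based crossover is as prompt construction: the parent program and a few high-scoring "inspiration" programs are placed in the same context, and the model is asked to combine ideas. The sketch below is hypothetical throughout; the `Solution` class, the `build_crossover_prompt` function, and the prompt wording are not taken from CodeEvolve.

```python
from dataclasses import dataclass

@dataclass
class Solution:
    code: str
    score: float

def build_crossover_prompt(parent, inspirations, max_inspirations=2):
    """Assemble a prompt holding the parent and the top-scoring inspirations.
    No code is spliced here; merging is left to the LLM reading the prompt."""
    top = sorted(inspirations, key=lambda s: s.score, reverse=True)[:max_inspirations]
    parts = [
        "Improve the PARENT program. You may borrow ideas from the INSPIRATION programs.",
        "",
        f"# PARENT (score={parent.score})",
        parent.code,
    ]
    for i, s in enumerate(top, start=1):
        parts += ["", f"# INSPIRATION {i} (score={s.score})", s.code]
    return "\n".join(parts)

# Toy usage with placeholder program text.
parent = Solution("def solve(): return greedy()", 0.61)
pool = [Solution("def solve(): return anneal()", 0.74),
        Solution("def solve(): return brute()", 0.30),
        Solution("def solve(): return gradient()", 0.69)]
prompt = build_crossover_prompt(parent, pool)
```

The resulting string would be sent to the model in place of a syntactic crossover operator, so the recombination is semantic and does not depend on the programs sharing structure.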

LLM Ensemble

The LLM ensemble is the core component for generating and modifying solutions. It is a weighted ensemble of Gemini 2.5 models, which lets the framework balance throughput against the capacity for breakthrough improvements.
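A weighted ensemble can be sketched as weighted random choice over models for each generation request. The model labels and weights below are placeholders, not the configuration used in the paper; the assumption is simply that a cheaper, faster model handles most requests while a stronger model is sampled less often.

```python
import random

# Hypothetical labels and weights, for illustration only.
ENSEMBLE = {"fast-model": 0.8, "strong-model": 0.2}

def pick_model(ensemble, rng=None):
    """Sample one model name with probability proportional to its weight."""
    rng = rng if rng is not None else random.Random()
    names = list(ensemble)
    weights = [ensemble[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Toy usage: choose a model for the next generation request.
model_name = pick_model(ENSEMBLE, rng=random.Random(0))
```

Shifting weight toward the faster model raises the number of candidates evaluated per unit of time, while the remaining weight leaves room for the occasional larger improvement from the stronger model.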

Experimental Evaluation

CodeEvolve was benchmarked against Google DeepMind's AlphaEvolve on complex mathematical and geometric problems. The results indicate that CodeEvolve surpasses AlphaEvolve's reported results on several tasks, including the second autocorrelation inequality (P1), optimal point placements for distance minimization (P2), and circle packing (P3).

Table 1 reports the performance metrics across the benchmark problems. The accompanying figures show the best solutions discovered for the geometric benchmarks.

Figure 2: Comparison of best placement of 16 2-dimensional points for problem P2.A.

Figure 3: Comparison of best placement of 14 3-dimensional points for problem P2.A.

Ablation Study

Ablation studies isolate the contributions of CodeEvolve's components. The combination of meta-prompting and inspiration-based crossover proved crucial across a majority of problem configurations, underscoring the importance of these two mechanisms for achieving state-of-the-art results.

Figure 4: Ablations for both instances of the circle packing problem P3.

Figure 5: Ablations for problems P1 and P2.A.

Reproducibility and Open Source

CodeEvolve is fully open-source, and each experimental setup can be verified through shared datasets and configurations. Despite constraints imposed by LLM APIs, the framework serves as a transparent platform for further community-driven work and helps democratize this line of scientific research.

Conclusion

CodeEvolve is an open-source framework for LLM-driven algorithmic evolution that advances methods for automated program synthesis. By integrating LLMs with genetic algorithms and reporting empirical results beyond those of a closed-source counterpart, it supports collaborative, AI-driven scientific discovery. Future work aims to make fuller use of LLM capabilities and to broaden the range of application domains.