GigaEvo: Hybrid LLM & Evolution Framework
- GigaEvo is an open-source framework that integrates large language models with evolutionary algorithms to advance hybrid optimization methodologies.
- It features a modular architecture with Redis-based data storage, an asynchronous DAG execution engine, and MAP-Elites quality-diversity search for robust solution exploration.
- The system enables reproducible experiments in mathematical discovery and program synthesis, achieving state-of-the-art results in benchmark tasks.
GigaEvo is an extensible open-source research framework designed to support and advance hybrid optimization methodologies that integrate LLMs with evolutionary algorithms, specifically drawing on insights from LLM-guided evolution exemplified by AlphaEvolve (Khrulkov et al., 17 Nov 2025). It provides robust modular implementations of quality-diversity search (MAP-Elites), asynchronous DAG-based evaluation pipelines, LLM-driven mutation operators with structured insight generation, bidirectional lineage tracking, and multi-island evolutionary strategies. The system is engineered to facilitate reproducibility, modularity, and rapid experimentation in mathematical discovery and program synthesis tasks.
1. System Architecture and Modular Design
GigaEvo organizes its functionality into four principal modules:
- Redis Database: Each evolutionary Program (individual) is serialized as a record with a unique UUID, source code, lifecycle state (FRESH, RUNNING, COMPLETE, DISCARDED), evaluation metrics (fitness, validity flag, complexity, etc.), lineage data (parent and child IDs), and outputs from each execution stage (including traces, exception stacks, insights, and lineage analyses).
- DAG Execution Engine: Implements an asyncio-based pipeline—referred to as the DAGAutomata scheduler—that processes each Program through a parametrizable DAG of stages such as execution, validation, complexity analysis, insight generation, and prompt construction. It features both inter-Program and intra-Program parallelism, execution-order and data-flow edge types, optimistic concurrency control (via Redis atomic counters), stage caching, and failure isolation.
- Evolution Engine: An asynchronous event loop supporting both single-island and multi-island evolutionary processes. Each island maintains its own MAP-Elites archive over distinct (potentially heterogeneous) behavioral spaces. The engine handles elite selection, mutation, archive updating, and periodic migration of top performers across islands.
- Mutation Operator: A LangGraph-based agent constructs structured mutation prompts, invokes one or more LLMs to perform diff-based or rewrite-based mutations, and parses resultant offspring code back into the system. The operator leverages detailed mutation contexts comprising metrics, insights, and lineage analyses.
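The Program record described above can be sketched as a small Python dataclass with JSON round-tripping for Redis storage. This is a minimal illustration of the schema, not GigaEvo's actual implementation; field and method names are assumptions.

```python
from dataclasses import dataclass, field, asdict
from enum import Enum
import json
import uuid


class LifecycleState(str, Enum):
    FRESH = "FRESH"
    RUNNING = "RUNNING"
    COMPLETE = "COMPLETE"
    DISCARDED = "DISCARDED"


@dataclass
class Program:
    """One evolutionary individual, serialized as a Redis record (illustrative schema)."""
    source_code: str
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    state: LifecycleState = LifecycleState.FRESH
    metrics: dict = field(default_factory=dict)        # fitness, validity flag, complexity, ...
    parent_ids: list = field(default_factory=list)     # bidirectional lineage: ancestors
    child_ids: list = field(default_factory=list)      # bidirectional lineage: descendants
    stage_outputs: dict = field(default_factory=dict)  # traces, exception stacks, insights, analyses

    def to_record(self) -> str:
        # str-backed Enum serializes directly to its value.
        return json.dumps(asdict(self))

    @classmethod
    def from_record(cls, raw: str) -> "Program":
        d = json.loads(raw)
        d["state"] = LifecycleState(d["state"])
        return cls(**d)
```

A record produced by `to_record` would be stored under the Program's UUID key; `from_record` reverses the mapping when a stage loads the individual.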
2. Algorithmic Components
2.1 MAP-Elites Quality-Diversity Search
Let $\mathcal{P}$ denote the set of all candidate Programs; each $p \in \mathcal{P}$ carries:
- Fitness $f(p)$: the scalar objective function (e.g., minimum triangle area, sum of circle radii).
- Behavior descriptor $b(p) = (b_1(p), b_2(p))$: for example, $b_1(p)$ is the (bounded) fitness, and $b_2(p)$ is either a validity indicator ($b_2(p) \in \{0, 1\}$) or a complexity metric.
GigaEvo discretizes both behavioral axes into $k$ bins. Each Program maps to a cell
$$c(p) = \big(\lfloor k \, b_1(p) \rfloor,\ \lfloor k \, b_2(p) \rfloor\big).$$
The MAP-Elites archive $\mathcal{A}$ stores the elite for each cell $c$. On evaluation, a Program $p$ replaces the occupant of its cell if it is superior:
$$\mathcal{A}[c(p)] \leftarrow p \quad \text{if } c(p) \notin \mathcal{A} \ \text{or} \ f(p) > f\big(\mathcal{A}[c(p)]\big).$$
Parent selection for mutation is fitness-proportional over the archived elites:
$$P(\text{select } p) = \frac{f(p)}{\sum_{q \in \mathcal{A}} f(q)}.$$
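The binning, replacement rule, and fitness-proportional selection can be sketched as a compact archive class. This is a minimal illustration assuming positive fitness values and behavior descriptors normalized to known axis ranges; class and method names are not GigaEvo's.

```python
import random
from typing import Dict, Tuple


class MapElitesArchive:
    """Minimal MAP-Elites archive over a 2-D discretized behavior space."""

    def __init__(self, bins: int, lo: Tuple[float, float], hi: Tuple[float, float]):
        self.bins = bins
        self.lo, self.hi = lo, hi
        # cell index -> (fitness, program)
        self.cells: Dict[Tuple[int, int], Tuple[float, object]] = {}

    def _cell(self, behavior: Tuple[float, float]) -> Tuple[int, int]:
        # Map each behavior axis into one of `bins` equal-width bins.
        idx = []
        for x, lo, hi in zip(behavior, self.lo, self.hi):
            t = (x - lo) / (hi - lo)
            idx.append(min(self.bins - 1, max(0, int(t * self.bins))))
        return tuple(idx)

    def add(self, program, fitness: float, behavior: Tuple[float, float]) -> bool:
        """Insert program; it replaces the cell elite only if strictly fitter."""
        c = self._cell(behavior)
        if c not in self.cells or fitness > self.cells[c][0]:
            self.cells[c] = (fitness, program)
            return True
        return False

    def select_parent(self, rng: random.Random):
        """Fitness-proportional selection over the current elites (assumes f > 0)."""
        elites = list(self.cells.values())
        total = sum(f for f, _ in elites)
        return rng.choices([p for _, p in elites],
                           weights=[f / total for f, _ in elites], k=1)[0]
```

Each island would hold one such archive over its own behavioral axes, with migration copying top elites between islands.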
2.2 LLM-Driven Mutation and Insight Generation
Mutation operators are driven by LLMs that consume a mutation context containing the natural language task description, parent code(s), evaluation metrics, structured insights, and lineage analysis. Prompt construction supports two primary mutation modes: diff-based (LLM produces a code patch) and rewrite-based (LLM rewrites functions). Insight generation invites LLMs to annotate Programs with feedback, categorized by type (algorithmic, structural), effect (beneficial, harmful), and severity (low, medium, high). Bidirectional lineage tracking enables mutation prompts to incorporate not only code changes but also metric trajectories across generations. Multiple LLMs may be routed for specialized roles (e.g., Qwen for geometric tasks, Gemini for discrete tasks).
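The mutation context described above can be assembled into a single structured prompt string. The section layout and function signature below are illustrative assumptions, not GigaEvo's actual prompt format.

```python
def build_mutation_prompt(task: str, parent_code: str, metrics: dict,
                          insights: list, lineage: str, mode: str = "diff") -> str:
    """Assemble a structured mutation prompt (hypothetical format).

    `insights` is a list of dicts with `type`, `effect`, `severity`, `text`,
    mirroring the insight categorization described in the text.
    """
    insight_lines = "\n".join(
        f"- [{i['type']}/{i['effect']}/{i['severity']}] {i['text']}" for i in insights
    )
    instruction = ("Propose a unified diff patch against the parent code."
                   if mode == "diff"
                   else "Rewrite the target function(s) in full.")
    return (
        f"## Task\n{task}\n\n"
        f"## Parent code\n```python\n{parent_code}\n```\n\n"
        f"## Metrics\n{metrics}\n\n"
        f"## Insights\n{insight_lines}\n\n"
        f"## Lineage analysis\n{lineage}\n\n"
        f"## Instruction\n{instruction}\n"
    )
```

The resulting string would be sent to whichever LLM is routed for the task; the offspring code in the response is then parsed back into a fresh Program.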
3. Asynchronous DAG-Based Evaluation Pipeline
The DAGAutomata engine orchestrates an evaluation pipeline where each Program traverses a directed acyclic graph of stages. Two levels of concurrency are realized:
- Inter-Program: Multiple Programs execute through the DAG concurrently.
- Intra-Program: Independent stages within a single Program's pipeline can be parallelized when their data dependencies are met.
Stages are connected via data-flow (transferring outputs) or execution-order (enforcing sequential constraints) edges. The pipeline implements optimistic concurrency control, stage-level output caching (enabling skip-if-unmodified execution), and failure isolation. A cascading validation approach first applies inexpensive syntactic checks, promoting only promising candidates to more resource-intensive geometric or integer validation.
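The intra-Program parallelism described above can be sketched with asyncio primitives: each stage waits on its predecessors' completion events, so independent stages run concurrently. This is a minimal sketch of the scheduling idea, not the DAGAutomata engine itself (which adds optimistic concurrency control, caching, and failure isolation).

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

Stage = Callable[[dict], Awaitable[object]]


async def run_dag(stages: Dict[str, Stage], deps: Dict[str, List[str]]) -> dict:
    """Run one Program through a DAG of async stages.

    A stage starts as soon as all of its dependencies have finished,
    so independent stages execute concurrently (intra-Program parallelism).
    """
    done: Dict[str, asyncio.Event] = {name: asyncio.Event() for name in stages}
    outputs: dict = {}

    async def run_stage(name: str) -> None:
        # Wait on data-flow / execution-order predecessors.
        await asyncio.gather(*(done[d].wait() for d in deps.get(name, [])))
        outputs[name] = await stages[name](outputs)  # read predecessor outputs
        done[name].set()

    await asyncio.gather(*(run_stage(n) for n in stages))
    return outputs
```

Inter-Program parallelism then follows naturally: the scheduler can launch `run_dag` for many Programs at once under the same event loop.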
4. Experimental Benchmarks and Empirical Outcomes
GigaEvo has been validated against established optimization problems originating from AlphaEvolve:
- Heilbronn Triangle (n=11): Maximize the minimum triangle area among interior points inside a unit-area triangle.
- Circle Packing (n=26,32): Maximize the cumulative radius sum of non-overlapping circles in a unit square.
- Kissing Number (n=12 target): Maximize the number of integer vectors on a sphere of fixed radius with a prescribed minimum pairwise separation.
Results, as summarized below, demonstrate reproducibility and, in some cases, advancement over previous state-of-the-art (SOTA):
| Problem | Target (AlphaEvolve) | GigaEvo Best | Notes |
|---|---|---|---|
| Heilbronn (n=11) | 0.0365 | 0.0364 | Visually identical configuration |
| Circle Packing (n=26) | 2.635 | 2.63598 | Slight improvement |
| Circle Packing (n=32) | 2.937 | 2.939 | Outperforms prior SOTA |
| Kissing (n=12) | n/a | n/a | Reached known bound, no new record |
For single-island searches, 75–90% of discretized MAP-Elites cells were occupied on the Heilbronn and Circle tasks, indicating effective coverage. Convergence was characterized by rapid initial improvement within the first five generations and subsequent incremental refinements. Additional experiments recovered FunSearch results and improved on non-uniform bin-packing; prompt and agent AUCs increased substantially through evolution rounds (Khrulkov et al., 17 Nov 2025).
5. Software Engineering and Extensibility
GigaEvo is implemented in Python 3.10+, utilizing asyncio and aioredis for concurrency, Hydra for configuration, LangGraph for prompt orchestration, and Redis for data storage with optimistic concurrency. DAG stages, evolutionary engines, and mutation operators are modularized with abstract base classes, enabling straightforward extension. Declarative configuration supports YAML-based overrides and command-line parameterization, facilitating rapid prototyping and reproducibility. Each run records comprehensive logs—including configuration snapshots, archive dumps, random and LLM seeds—to ensure bit-identical reproducibility on replay.
User-defined problems follow a prescribed directory structure, with artifacts such as task_description.txt, metrics.yaml, validate.py, and initialization scripts. Monitoring and control are supported via CLI dashboards and Redis GUI plugins. The system includes plugin interfaces for novel DAG stages, problem definitions, and LLM-backed mutation strategies.
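A problem directory might look as follows; the artifact names come from the text, while the directory name and the exact initialization-script filename are illustrative assumptions.

```
problems/my_problem/
├── task_description.txt   # natural-language statement of the objective
├── metrics.yaml           # fitness and behavior-descriptor definitions
├── validate.py            # problem-specific validity checks
└── init.py                # initialization script seeding the first Programs
```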
6. Usage, Reproducibility, and Future Directions
Installation is via standard Git and pip workflows, while experiment definition relies on problem directory creation and YAML registration. The experiment workflow encourages explicit logging of random seeds and LLM temperature to guarantee reproducibility.
Potential extensions include integrating continuous local search methods (such as CMA-ES) as downstream DAG stages, supporting multifile or multilingual program synthesis, experimenting with alternative or higher-dimensional behavior spaces, plugging in new LLMs (such as PaLM or LLaMA 3), and meta-evolution for automated behavior space design (Khrulkov et al., 17 Nov 2025).
With its modular and concurrent architecture, asynchronous DAG pipelines, and declarative experimental configuration, GigaEvo supports reproducible, extensible research into LLM-driven evolutionary processes—from mathematical construction tasks to prompt and agent evolution for NLP settings.