
GigaEvo: Hybrid LLM & Evolution Framework

Updated 7 February 2026
  • GigaEvo is an open-source framework that integrates large language models with evolutionary algorithms to advance hybrid optimization methodologies.
  • It features a modular architecture with Redis-based data storage, an asynchronous DAG execution engine, and MAP-Elites quality-diversity search for robust solution exploration.
  • The system enables reproducible experiments in mathematical discovery and program synthesis, achieving state-of-the-art results in benchmark tasks.

GigaEvo is an extensible open-source research framework designed to support and advance hybrid optimization methodologies that integrate LLMs with evolutionary algorithms, specifically drawing on insights from LLM-guided evolution exemplified by AlphaEvolve (Khrulkov et al., 17 Nov 2025). It provides robust modular implementations of quality-diversity search (MAP-Elites), asynchronous DAG-based evaluation pipelines, LLM-driven mutation operators with structured insight generation, bidirectional lineage tracking, and multi-island evolutionary strategies. The system is engineered to facilitate reproducibility, modularity, and rapid experimentation in mathematical discovery and program synthesis tasks.

1. System Architecture and Modular Design

GigaEvo organizes its functionality into four principal modules:

  • Redis Database: Each evolutionary Program (individual) is serialized as a record with a unique UUID, source code, lifecycle state (FRESH, RUNNING, COMPLETE, DISCARDED), evaluation metrics (fitness, validity flag, complexity, etc.), lineage data (parent and child IDs), and outputs from each execution stage (including traces, exception stacks, insights, and lineage analyses).
  • DAG Execution Engine: Implements an asyncio-based pipeline—referred to as the DAGAutomata scheduler—that processes each Program through a parametrizable DAG of stages such as execution, validation, complexity analysis, insight generation, and prompt construction. It features both inter-Program and intra-Program parallelism, execution-order and data-flow edge types, optimistic concurrency control (via Redis atomic counters), stage caching, and failure isolation.
  • Evolution Engine: An asynchronous event loop supporting both single-island and multi-island evolutionary processes. Each island maintains its own MAP-Elites archive over distinct (potentially heterogeneous) behavioral spaces. The engine handles elite selection, mutation, archive updating, and periodic migration of top performers across islands.
  • Mutation Operator: A LangGraph-based agent constructs structured mutation prompts, invokes one or more LLMs to perform diff-based or rewrite-based mutations, and parses resultant offspring code back into the system. The operator leverages detailed mutation contexts comprising metrics, insights, and lineage analyses.
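
To make the Redis storage model concrete, here is a minimal sketch of how a Program record might be assembled before serialization. The field names are illustrative assumptions, not GigaEvo's actual schema:

```python
import json
import uuid

def make_program_record(source_code, parent_ids=()):
    # Hypothetical Program record; real GigaEvo field names may differ.
    return {
        "id": str(uuid.uuid4()),            # unique UUID key
        "code": source_code,                # candidate program source
        "state": "FRESH",                   # FRESH | RUNNING | COMPLETE | DISCARDED
        "metrics": {"fitness": None, "valid": None, "complexity": None},
        "lineage": {"parents": list(parent_ids), "children": []},
        "stage_outputs": {},                # traces, exception stacks, insights, analyses
    }

record = make_program_record("def solve(): ...")
# A record like this would be serialized (e.g., as JSON) under its UUID key in Redis.
serialized = json.dumps(record)
```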

2. Algorithmic Components

2.1 MAP-Elites Quality-Diversity Search

Let $X$ denote the set of all candidate Programs, each with:

  • Fitness $f(x)$: the scalar objective function (e.g., minimum triangle area or sum of circle radii).
  • Behavior descriptor $\mathbf{b}(x) \in \mathbb{R}^2$: for example, $b_1(x)$ is the (bounded) fitness, and $b_2(x)$ is either a validity indicator ($\in \{0, 1\}$) or a complexity metric.

GigaEvo discretizes both behavioral axes into bins. Each Program $x$ maps to a cell

$$c(x) = \left(\left\lfloor \frac{b_1(x) - f_{\min}}{\Delta_1} \right\rfloor, \left\lfloor \frac{b_2(x) - v_{\min}}{\Delta_2} \right\rfloor\right)$$

The MAP-Elites archive $A[c]$ stores the elite for each cell $c$. On evaluation, a Program $x$ replaces the cell occupant if it is superior:

$$A[c(x)] \leftarrow \begin{cases} x, & \text{if } A[c(x)] \text{ is empty or } f(x) > f(A[c(x)]) \\ A[c(x)], & \text{otherwise} \end{cases}$$
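
A minimal sketch of the cell mapping and archive-update rule above (bin widths and lower bounds are illustrative placeholders):

```python
import math

def cell(b1, b2, f_min=0.0, v_min=0.0, d1=0.1, d2=0.5):
    # Discretize the 2-D behavior descriptor into a grid cell index.
    return (math.floor((b1 - f_min) / d1), math.floor((b2 - v_min) / d2))

def try_insert(archive, program, b1, b2, fitness):
    # Replace the cell occupant only if the cell is empty or the
    # new program has strictly higher fitness (maximization).
    c = cell(b1, b2)
    incumbent = archive.get(c)
    if incumbent is None or fitness > incumbent[1]:
        archive[c] = (program, fitness)
        return True
    return False

archive = {}
try_insert(archive, "prog_a", 0.25, 1.0, fitness=0.25)  # fills an empty cell
try_insert(archive, "prog_b", 0.26, 1.0, fitness=0.26)  # displaces prog_a if same cell
```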

Parent selection for mutation is fitness-proportional over the current set $E$ of archive elites:

$$\Pr[x \text{ selected}] = \frac{f(x)^\alpha}{\sum_{y \in E} f(y)^\alpha}, \quad \alpha = 1$$
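
The selection rule above can be sketched directly with the standard library (assuming non-negative fitness values):

```python
import random

def select_parent(elites, alpha=1.0):
    # elites: list of (program, fitness) pairs with non-negative fitness.
    # Samples a parent with probability proportional to fitness**alpha
    # (alpha = 1 recovers plain fitness-proportional selection).
    programs = [p for p, _ in elites]
    weights = [f ** alpha for _, f in elites]
    return random.choices(programs, weights=weights, k=1)[0]
```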

2.2 LLM-Driven Mutation and Insight Generation

Mutation operators are driven by LLMs that consume a mutation context containing the natural language task description, parent code(s), evaluation metrics, structured insights, and lineage analysis. Prompt construction supports two primary mutation modes: diff-based (LLM produces a code patch) and rewrite-based (LLM rewrites functions). Insight generation invites LLMs to annotate Programs with feedback, categorized by type (algorithmic, structural), effect (beneficial, harmful), and severity (low, medium, high). Bidirectional lineage tracking enables mutation prompts to incorporate not only code changes but also metric trajectories across generations. Multiple LLMs may be routed for specialized roles (e.g., Qwen for geometric tasks, Gemini for discrete tasks).
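
To make the mutation-context idea concrete, here is a hypothetical sketch of prompt assembly; the field layout and wording are illustrative assumptions, not GigaEvo's actual templates:

```python
def build_mutation_prompt(task_description, parent_code, metrics,
                          insights, lineage_notes, mode="diff"):
    # mode: "diff" asks the LLM for a patch; "rewrite" asks for full functions.
    instruction = (
        "Propose a unified diff that improves the program."
        if mode == "diff"
        else "Rewrite the relevant functions to improve the program."
    )
    # Each insight carries type, effect, and severity tags, per the scheme above.
    insight_lines = "\n".join(
        f"- [{i['type']}/{i['effect']}/{i['severity']}] {i['text']}" for i in insights
    )
    return (
        f"Task:\n{task_description}\n\n"
        f"Parent program:\n{parent_code}\n\n"
        f"Metrics: {metrics}\n\n"
        f"Insights:\n{insight_lines}\n\n"
        f"Lineage:\n{lineage_notes}\n\n"
        f"{instruction}"
    )
```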

3. Asynchronous DAG-Based Evaluation Pipeline

The DAGAutomata engine orchestrates an evaluation pipeline where each Program traverses a directed acyclic graph of stages. Two levels of concurrency are realized:

  • Inter-Program: Multiple Programs execute through the DAG concurrently.
  • Intra-Program: Independent stages within a single Program's pipeline can be parallelized when their data dependencies are met.

Stages are connected via data-flow (transferring outputs) or execution-order (enforcing sequential constraints) edges. The pipeline implements optimistic concurrency control, stage-level output caching (enabling skip-if-unmodified execution), and failure isolation. A cascading validation approach first applies inexpensive syntactic checks, promoting only promising candidates to more resource-intensive geometric or integer validation.
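
A toy illustration of intra-Program parallelism over a stage DAG; the stage names and scheduling loop are hypothetical simplifications of the DAGAutomata engine:

```python
import asyncio

# Toy DAG: each stage lists the stages whose outputs it needs (data-flow edges).
DAG = {
    "execute": [],
    "validate": ["execute"],
    "complexity": ["execute"],          # independent of "validate" -> runs in parallel
    "insights": ["validate", "complexity"],
}

async def run_stage(name, results):
    await asyncio.sleep(0)              # stand-in for real async stage work
    results[name] = f"{name}-done"

async def run_pipeline(dag):
    results = {}
    done = set()
    pending = dict(dag)
    while pending:
        # All stages whose dependencies are satisfied run concurrently.
        ready = [s for s, deps in pending.items() if all(d in done for d in deps)]
        await asyncio.gather(*(run_stage(s, results) for s in ready))
        for s in ready:
            done.add(s)
            del pending[s]
    return results

results = asyncio.run(run_pipeline(DAG))
```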

4. Experimental Benchmarks and Empirical Outcomes

GigaEvo has been validated against established optimization problems originating from AlphaEvolve:

  • Heilbronn Triangle (n = 11): maximize the minimum triangle area among $n$ interior points inside a unit-area triangle.
  • Circle Packing (n = 26, 32): maximize the cumulative radius sum of $n$ non-overlapping circles in a unit square.
  • Kissing Number (n = 12 target): maximize the number $N$ of integer vectors on a sphere of fixed radius with a prescribed minimum separation.
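
As an illustration of the kind of geometric validation such a benchmark needs, a simple checker for the circle-packing task might look like the following (this is a sketch, not GigaEvo's actual validate.py):

```python
import math

def valid_packing(circles, eps=1e-9):
    # circles: list of (x, y, r). Checks containment in the unit square
    # and pairwise non-overlap; the objective would be the sum of radii.
    for x, y, r in circles:
        if r <= 0 or x - r < -eps or x + r > 1 + eps or y - r < -eps or y + r > 1 + eps:
            return False
    for i in range(len(circles)):
        for j in range(i + 1, len(circles)):
            xi, yi, ri = circles[i]
            xj, yj, rj = circles[j]
            if math.hypot(xi - xj, yi - yj) < ri + rj - eps:
                return False
    return True

# Objective for a valid packing: cumulative radius sum.
score = sum(r for _, _, r in [(0.25, 0.25, 0.25), (0.75, 0.75, 0.25)])
```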

Results, as summarized below, demonstrate reproducibility and, in some cases, advancement over previous state-of-the-art (SOTA):

Problem            Target (AlphaEvolve)   GigaEvo Best   Notes
Heilbronn (n=11)   0.0365                 0.0364         Visually identical configuration
Circle (n=26)      2.635                  2.63598        Slight improvement
Circle (n=32)      2.937                  2.939          Outperforms prior SOTA
Kissing (n=12)     N ≥ 840                N = 840        Matched known bound; no new record

For single-island searches, 75–90% of discretized MAP-Elites cells were occupied on the Heilbronn and Circle tasks, indicating effective coverage. Convergence was characterized by rapid initial improvement within the first five generations and subsequent incremental refinements. Additional experiments recovered FunSearch results and improved on non-uniform bin-packing; prompt and agent AUCs increased substantially through evolution rounds (Khrulkov et al., 17 Nov 2025).

5. Software Engineering and Extensibility

GigaEvo is implemented in Python 3.10+, utilizing asyncio and aioredis for concurrency, Hydra for configuration, LangGraph for prompt orchestration, and Redis for data storage with optimistic concurrency. DAG stages, evolutionary engines, and mutation operators are modularized with abstract base classes, enabling straightforward extension. Declarative configuration supports YAML-based overrides and command-line parameterization, facilitating rapid prototyping and reproducibility. Each run records comprehensive logs—including configuration snapshots, archive dumps, random and LLM seeds—to ensure bit-identical reproducibility on replay.

User-defined problems follow a prescribed directory structure, with artifacts such as task_description.txt, metrics.yaml, validate.py, and initialization scripts. Monitoring and control are supported via CLI dashboards and Redis GUI plugins. The system includes plugin interfaces for novel DAG stages, problem definitions, and LLM-backed mutation strategies.
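
A minimal sketch of what a plugin-style DAG stage interface could look like; the class and attribute names are illustrative, and GigaEvo's actual abstract base classes may differ:

```python
from abc import ABC, abstractmethod

class Stage(ABC):
    """Hypothetical base class for a pluggable DAG evaluation stage."""

    # Names of upstream stages whose outputs this stage consumes.
    requires: tuple = ()

    @abstractmethod
    async def run(self, program: dict, upstream: dict) -> dict:
        """Process one Program; return this stage's outputs."""

class SyntaxCheck(Stage):
    """Cheap first-pass validation, run before expensive geometric checks."""

    requires = ()

    async def run(self, program, upstream):
        try:
            compile(program["code"], "<candidate>", "exec")
            return {"syntax_ok": True}
        except SyntaxError as exc:
            return {"syntax_ok": False, "error": str(exc)}
```

Registering such a subclass as a new pipeline stage would then be a matter of declarative configuration rather than engine changes.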

6. Usage, Reproducibility, and Future Directions

Installation is via standard Git and pip workflows, while experiment definition relies on problem directory creation and YAML registration. The experiment workflow encourages explicit logging of random seeds and LLM temperature to guarantee reproducibility.

Potential extensions include integrating continuous local search methods (such as CMA-ES) as downstream DAG stages, supporting multifile or multilingual program synthesis, experimenting with alternative or higher-dimensional behavior spaces, plugging in new LLMs (such as PaLM or LLaMA 3), and meta-evolution for automated behavior space design (Khrulkov et al., 17 Nov 2025).

With its modular and concurrent architecture, asynchronous DAG pipelines, and declarative experimental configuration, GigaEvo supports reproducible, extensible research into LLM-driven evolutionary processes—from mathematical construction tasks to prompt and agent evolution for NLP settings.
