EvoGraph: Hybrid Directed Graph Evolution toward Software 3.0 (2508.05199v1)

Published 7 Aug 2025 in cs.SE and cs.AI

Abstract: We introduce EvoGraph, a framework that enables software systems to evolve their own source code, build pipelines, documentation, and tickets. EvoGraph represents every artefact in a typed directed graph, applies learned mutation operators driven by specialized small language models (SLMs), and selects survivors with a multi-objective fitness. On three benchmarks, EvoGraph fixes 83% of known security vulnerabilities, translates COBOL to Java with 93% functional equivalence (test verified), and maintains documentation freshness within two minutes. Experiments show a 40% latency reduction and a sevenfold drop in feature lead time compared with strong baselines. We extend our approach to evoGraph, leveraging language-specific SLMs for modernizing .NET, Lisp, CGI, ColdFusion, legacy Python, and C codebases, achieving 82-96% semantic equivalence across languages while reducing computational costs by 90% compared to LLMs. EvoGraph's design responds to empirical failure modes in legacy modernization, such as implicit contracts, performance preservation, and integration evolution. Our results suggest a practical path toward Software 3.0, where systems adapt continuously yet remain under measurable control.

Summary

The paper introduces EvoGraph, a framework that uses hybrid directed graph evolution and SLMs to autonomously modernize legacy software.
It reports 93% functional equivalence for COBOL-to-Java translation, along with lower latency and shorter feature lead times than the Copilot, APR, and DGM baselines.
Leveraging cost-effective SLMs, EvoGraph supports multi-language modernization while enforcing safety-aware, multi-objective optimization.

EvoGraph: Hybrid Directed Graph Evolution toward Software 3.0

The paper "EvoGraph: Hybrid Directed Graph Evolution toward Software 3.0" introduces EvoGraph, a framework for autonomously evolving software systems, with a particular focus on legacy code modernization. The system uses hybrid directed graph evolution, in which small language models (SLMs) drive mutations and candidates are scored for fitness across multiple software artefacts. EvoGraph targets known failure modes in legacy modernization, such as implicit contracts, performance preservation, and integration evolution, and aims for high semantic equivalence at lower cost by relying on SLMs.

Framework Overview

EvoGraph represents software artefacts, including source code, build pipelines, documentation, tickets, and other operational elements, in a typed directed graph. Each node has an artefact type, such as code or documentation, and edges represent dependencies and other relationships between artefacts. This representation supports the modernization of legacy systems because it can also capture dynamic information such as runtime traces, which static analyzers often overlook.
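A typed directed graph of artefacts can be sketched in a few lines. The following is a minimal illustration only, not the paper's data model: the class names (`ArtefactType`, `ArtefactGraph`), the relation labels, and the file names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class ArtefactType(Enum):
    """Node types; the names are illustrative, not taken from the paper."""
    CODE = "code"
    DOCS = "docs"
    BUILD = "build"
    TICKET = "ticket"


@dataclass
class ArtefactGraph:
    """A typed directed graph: nodes carry a type, edges carry a relation label."""
    nodes: Dict[str, ArtefactType] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_node(self, node_id: str, kind: ArtefactType) -> None:
        self.nodes[node_id] = kind

    def add_edge(self, src: str, dst: str, relation: str) -> None:
        # Reject edges whose endpoints are not in the graph.
        for endpoint in (src, dst):
            if endpoint not in self.nodes:
                raise KeyError(f"unknown node: {endpoint}")
        self.edges[(src, dst)] = relation

    def dependents(self, node_id: str) -> List[str]:
        """Return the nodes that have an edge pointing at node_id, sorted by id."""
        return sorted(src for (src, dst) in self.edges if dst == node_id)


# Hypothetical artefacts for a small legacy module.
graph = ArtefactGraph()
graph.add_node("billing.cbl", ArtefactType.CODE)
graph.add_node("billing.md", ArtefactType.DOCS)
graph.add_node("Makefile", ArtefactType.BUILD)
graph.add_edge("billing.md", "billing.cbl", "documents")
graph.add_edge("Makefile", "billing.cbl", "builds")

# Artefacts to revisit if billing.cbl is mutated.
print(graph.dependents("billing.cbl"))
```

Keeping documentation and build files in the same graph as the code is what lets a change to one artefact flag the others that depend on it.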

Mutation operators such as Weight Merge and Code Patch evolve the graph, guided by SLMs specialized for each language. A safety-aware multi-objective selection step then keeps only those changes that improve system utility while satisfying safety and performance constraints (Figure 1).

Figure 1: EvoGraph system architecture showing the hybrid directed graph evolution process. Left: Artefact graph representation with typed nodes (code, docs, build, etc.) and mutation operators. Right: evoGraph specialization where language-specific SLMs handle targeted mutations for different legacy languages, coordinated by a central controller that maintains safety constraints and fitness optimization.
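One common way to combine a safety constraint with multi-objective selection is to discard unsafe candidates first and then keep the Pareto-optimal ones. The sketch below assumes that scheme; the paper's actual fitness function is not given here, and the `Candidate` fields, objective choices, and sample values are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    tests_pass: bool    # hypothetical hard safety constraint
    latency_ms: float   # objective: lower is better
    cves_fixed: int     # objective: higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.cves_fixed >= b.cves_fixed
    better = a.latency_ms < b.latency_ms or a.cves_fixed > b.cves_fixed
    return no_worse and better


def select_survivors(candidates: List[Candidate]) -> List[Candidate]:
    """Drop unsafe candidates, then keep the non-dominated (Pareto) set."""
    safe = [c for c in candidates if c.tests_pass]
    return [c for c in safe if not any(dominates(o, c) for o in safe)]


# Illustrative values only; these are not results from the paper.
population = [
    Candidate("A", True, 300.0, 10),
    Candidate("B", True, 250.0, 8),
    Candidate("C", True, 320.0, 9),    # dominated by A on both objectives
    Candidate("D", False, 200.0, 18),  # best scores, but fails the safety check
]
survivors = select_survivors(population)
print([c.name for c in survivors])
```

Treating safety as a filter rather than as one more objective means an unsafe mutation cannot survive by scoring well elsewhere, as candidate D shows.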

Experimental Validation

The framework was validated empirically on benchmarks representing real-world scenarios. EvoGraph achieved 93% functional equivalence in COBOL-to-Java translation and fixed 15 of the 18 known CVEs (83%) in the OSS Bank benchmark. It also reduced p95 latency and feature lead time, outperforming the Copilot, APR (automated program repair), and DGM baselines (Table 1).

Performance Metrics

Task	Metric	Copilot	APR	DGM	EvoGraph
OSS Bank CVEs fixed	count out of 18 (higher is better)	6	4	9	15
Tele Shop p95 latency	ms (lower is better)	420	395	330	250
Lead time	hours (lower is better)	168	120	24	3
COBOL to Java equivalence	% (higher is better)	--	--	71	93
Doc freshness	BLEU (higher is better)	0.33	0.31	0.47	0.82

Table 1: Main experimental results comparing EvoGraph against baselines.

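EvoGraph leads on every metric in Table 1: it fixes 15 of 18 CVEs against 9 for the strongest baseline (DGM), lowers p95 latency from 330 ms to 250 ms, cuts lead time from 24 hours to 3 hours, and raises the documentation freshness BLEU score from 0.47 to 0.82. These gains span security, performance, delivery speed, and documentation, which is the multi-objective setting that software modernization requires.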
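The relative improvements implied by Table 1 can be recomputed directly from its rows. The short script below does only that arithmetic; the variable names are arbitrary, and the derived percentages and ratios are calculations from the table rather than figures quoted from the paper.

```python
# Values copied from Table 1.
CVES_TOTAL = 18
CVES_FIXED_EVOGRAPH = 15
P95_LATENCY_MS = {"Copilot": 420, "APR": 395, "DGM": 330, "EvoGraph": 250}
LEAD_TIME_H = {"Copilot": 168, "APR": 120, "DGM": 24, "EvoGraph": 3}


def relative_reduction(baseline: float, value: float) -> float:
    """Fraction by which value is lower than baseline."""
    return (baseline - value) / baseline


cve_fix_rate = CVES_FIXED_EVOGRAPH / CVES_TOTAL

# Latency reduction of EvoGraph relative to each baseline.
latency_reduction = {
    name: relative_reduction(ms, P95_LATENCY_MS["EvoGraph"])
    for name, ms in P95_LATENCY_MS.items()
    if name != "EvoGraph"
}

# How many times longer each baseline's lead time is than EvoGraph's.
lead_time_ratio = {
    name: hours / LEAD_TIME_H["EvoGraph"]
    for name, hours in LEAD_TIME_H.items()
    if name != "EvoGraph"
}

print(f"CVE fix rate: {cve_fix_rate:.0%}")
for name in latency_reduction:
    print(
        f"vs {name}: latency -{latency_reduction[name]:.1%}, "
        f"lead time {lead_time_ratio[name]:.0f}x shorter"
    )
```

Under this calculation the latency reduction is about 40% relative to Copilot and about 24% relative to DGM, so the size of the headline improvement depends on which baseline is taken as the reference.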

Implications of SLMs

A standout feature of EvoGraph is its use of SLMs, which the paper reports cut computational costs by 90% compared with LLMs while still achieving 82-96% semantic equivalence across languages. The SLMs are tuned for individual legacy languages, so they capture language-specific patterns with fewer computational resources. Compared to large models like GPT-4, SLMs offer an economical alternative for agentic AI applications, which matters for enterprises that have limited budgets but still need high throughput.

EvoGraph harnesses SLMs not only to streamline operations but also to facilitate rapid iteration and deployment flexibility, supporting on-premises modernization and edge computing solutions.
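The idea of a central controller that routes each mutation request to a language-specific SLM can be illustrated with a plain registry. This is a sketch under strong assumptions: `Controller`, `register`, and `mutate` are hypothetical names, and the stub function stands in for a real model call, which is outside the scope of this summary.

```python
from typing import Callable, Dict

# A mutator takes source text and returns rewritten source text.
Mutator = Callable[[str], str]


def cobol_stub(source: str) -> str:
    """Placeholder for a COBOL-specialized SLM; it only tags the input."""
    return "// translated from COBOL (stub)\n" + source


class Controller:
    """Routes mutation requests to the mutator registered for a language."""

    def __init__(self) -> None:
        self._mutators: Dict[str, Mutator] = {}

    def register(self, language: str, mutator: Mutator) -> None:
        # Language names are matched case-insensitively.
        self._mutators[language.lower()] = mutator

    def mutate(self, language: str, source: str) -> str:
        try:
            mutator = self._mutators[language.lower()]
        except KeyError:
            raise ValueError(f"no SLM registered for {language!r}") from None
        return mutator(source)


controller = Controller()
controller.register("COBOL", cobol_stub)
print(controller.mutate("cobol", "MOVE A TO B."))
```

With this shape, supporting another legacy language amounts to registering one more mutator, and each model can be deployed and updated independently, for example on premises.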

Multi-Language Modernization

EvoGraph's support for multi-language modernization is demonstrated by the high semantic equivalence it achieves across several language pairs (Table 2). This flexibility suggests potential for wider adoption in heterogeneous codebases.

Language Pair	LOC	Sem. Equiv.	Perf. Gain	Mod. Time
COBOL → Java	12,000	93%	2.1×	3h
.NET → Java	25,000	87%	1.5×	5h
Lisp → Python	8,000	91%	1.3×	2h

Table 2: Multi-language modernization results showing semantic equivalence, performance gains, and modernization times across different language pairs.
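Because the codebases in Table 2 differ in size, modernization times are easier to compare after normalizing by lines of code. The snippet below derives a lines-per-hour figure from the table; this throughput measure is a calculation added here for illustration and is not reported in the paper.

```python
from typing import NamedTuple


class Row(NamedTuple):
    pair: str
    loc: int        # lines of code, from Table 2
    hours: float    # modernization time, from Table 2


TABLE_2 = [
    Row("COBOL -> Java", 12_000, 3),
    Row(".NET -> Java", 25_000, 5),
    Row("Lisp -> Python", 8_000, 2),
]

# Derived throughput: lines of code modernized per hour.
loc_per_hour = {row.pair: row.loc / row.hours for row in TABLE_2}

for pair, rate in loc_per_hour.items():
    print(f"{pair}: {rate:,.0f} LOC/hour")
```

On these three rows the derived rate falls between 4,000 and 5,000 lines per hour, so the longer time for the .NET codebase is in line with its larger size.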

EvoGraph's modular architecture, with a separate SLM for each language, is intended to make adding new languages straightforward. This adaptability makes EvoGraph a candidate for a broad range of modernization projects across industries.

Conclusion

EvoGraph presents an approach to software modernization that combines directed graph-based evolution with SLMs. Its results indicate practical improvements in key performance metrics as well as alignment with industry needs for cost efficiency and flexibility in legacy modernization. The reported combination of high semantic equivalence and substantially lower cost supports the use of SLMs in agentic systems and argues for their further development in this field.