SolAgent: Multi-Agent Smart Contract Generator

Updated 3 February 2026

SolAgent is a tool-augmented multi-agent framework for generating secure Solidity smart contracts by iteratively refining code using testing and static analysis.
It employs specialized agents in a dual-loop architecture, using Forge for functional correctness and Slither for detecting and mitigating security vulnerabilities.
Experiments on the SolEval+ benchmark show that SolAgent significantly improves Pass@1 scores and reduces vulnerabilities compared to baseline language models.

SolAgent is a specialized, tool-augmented multi-agent framework designed to generate secure and correct Solidity smart contracts, emulating the iterative code development workflow of human experts. By combining LLMs, program analysis tools, and file-system operations within a dual-loop architecture, SolAgent directly targets two persistent challenges in smart contract generation: functional correctness (passing all specified tests) and security (absence of vulnerabilities). Experiments on the SolEval+ benchmark demonstrate that SolAgent achieves superior performance in Pass@1 and vulnerability reduction compared to baseline LLMs, code assistants, and generic agent frameworks (Chen et al., 30 Jan 2026).

1. Multi-Agent Architecture and Core Components

SolAgent employs a division of labor across specialized agents:

Coding Agent: Receives natural language requirements $R$ and the project context, producing an initial Solidity source file $C_0$.
Refining Agent: Takes the latest code artifact $C_{t-1}$ and aggregated feedback $F_t$ (encompassing test and security results), then outputs a refined version $C_t$, correcting errors or mitigating detected vulnerabilities.

A dual-loop refinement mechanism orchestrates the agent interactions:

Inner Correctness Loop (via Forge): Utilizes the Foundry/Forge compiler and associated test harness to run comprehensive test suites, providing feedback such as test pass rates and specific assertion or stack trace failures ($F_\text{forge}$).
Outer Security Loop (via Slither): Integrates the Slither static analyzer, which reports potential vulnerabilities with per-alert severity (Low/Medium/High), guiding security-related refactoring (e.g., enforcing checks-effects-interactions or adding access controls).
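
As a rough illustration of how the outer loop could turn Slither output into feedback, the sketch below counts findings by severity and lists the ones that would block success. It assumes a report in the layout Slither writes with its `--json` option (`results.detectors[]`, each with `check` and `impact` fields). The function name `summarize_slither` and the `BLOCKING_IMPACTS` set are hypothetical and not taken from SolAgent.

```python
import json
from collections import Counter

# Assumption: High and Medium findings are the ones that block success.
BLOCKING_IMPACTS = {"High", "Medium"}

def summarize_slither(report_json: str) -> dict:
    """Count Slither findings by impact and list the blocking detector names."""
    report = json.loads(report_json)
    # Assumed layout: {"results": {"detectors": [{"check": ..., "impact": ...}]}}
    detectors = (report.get("results") or {}).get("detectors", [])
    by_impact = Counter(d.get("impact", "Unknown") for d in detectors)
    blocking = [
        d.get("check", "unknown")
        for d in detectors
        if d.get("impact") in BLOCKING_IMPACTS
    ]
    return {"by_impact": dict(by_impact), "blocking_checks": blocking}

# Hand-written sample report, used here instead of a real Slither run.
sample = json.dumps({
    "success": True,
    "results": {"detectors": [
        {"check": "reentrancy-eth", "impact": "High"},
        {"check": "unused-return", "impact": "Medium"},
        {"check": "naming-convention", "impact": "Informational"},
    ]},
})
summary = summarize_slither(sample)
print(summary)
```

Keeping the detector names alongside the counts gives the Refining Agent something concrete to act on, for example a reentrancy finding that calls for a checks-effects-interactions rewrite.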

File-system tools (list_dir, read_file) empower the framework to reason across repository structures, load dependencies, and manage context, enabling multi-file smart contract synthesis beyond isolated code snippets.
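
A minimal sketch of what the two file-system tools might look like is given below. Only the names `list_dir` and `read_file` come from the text. The signatures, the `max_chars` limit, and the `_resolve` helper that confines access to the project root are illustrative assumptions.

```python
import tempfile
from pathlib import Path

def _resolve(root: str, rel: str) -> Path:
    """Resolve rel against root and refuse paths that escape the project."""
    root_path = Path(root).resolve()
    target = (root_path / rel).resolve()
    if target != root_path and root_path not in target.parents:
        raise ValueError(f"path escapes project root: {rel}")
    return target

def list_dir(root: str, rel: str = ".") -> list[str]:
    """Return sorted entry names under root/rel, marking directories with '/'."""
    base = _resolve(root, rel)
    return sorted(p.name + ("/" if p.is_dir() else "") for p in base.iterdir())

def read_file(root: str, rel: str, max_chars: int = 20_000) -> str:
    """Return at most max_chars characters of a text file inside root."""
    return _resolve(root, rel).read_text(encoding="utf-8")[:max_chars]

# Small demonstration on a throwaway project layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "src").mkdir()
    (Path(tmp) / "src" / "Token.sol").write_text("contract Token {}", encoding="utf-8")
    listing = list_dir(tmp)
    source = read_file(tmp, "src/Token.sol")
    print(listing, source)
```

Capping the returned text is one simple way to keep a large dependency from crowding out the rest of the agent's context.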

2. Algorithmic Workflow and Stopping Strategies

The iterative workflow dynamically alternates between correctness and security refinement, employing both success and early-stopping mechanisms:

Dynamic Stopping Conditions:

Success: All tests pass ($\text{pass rate} = 1.0$), and no unresolved high or medium severity vulnerabilities remain.
Stagnation: No progress in either pass rate or vulnerability count for $N$ consecutive rounds (default $N=2$).
Oscillation: Feedback similarity between consecutive rounds exceeds threshold $\tau$ (e.g., $\tau=0.9$ using sequence matching ratio), indicating the agent is trapped in a repetitive cycle.

Feedback Aggregation: At each round, outputs from Forge (functionality correctness) and Slither (security inspection) are serialized and aggregated for downstream decision-making.
Best Solution Selection: Track the best-performing code artifact $C_{\text{best}}$ and associated score (weighted by pass rate and negative vulnerability count) across all refinement rounds.
Outline Pseudocode:

# Pseudocode summary
Input: R (requirements), T (tests), MaxRounds, N (stagnation), τ (oscillation)
Output: C_best
C_0 = CodingAgent.generate(R)
C_best = C_0; Score_best = -inf
for t in 1..MaxRounds:
    pass_rate, failures = RunForge(C_{t-1}, T)
    vulnerabilities = RunSlither(C_{t-1})
    # Score the artifact that was just evaluated, so C_best always matches its own measurements
    score = weight(pass_rate, -|vulnerabilities|)
    if score > Score_best: C_best = C_{t-1}; Score_best = score
    F_t = aggregate(pass_rate, failures, vulnerabilities)
    if (pass_rate == 1.0 and no high/med vulns) or stagnation(N) or oscillation(τ):
        break
    C_t = RefiningAgent.refine(C_{t-1}, F_t)
return C_best
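
The stopping checks and the scoring rule can be sketched in Python as follows. This is one possible reading of the conditions, not SolAgent's implementation: the "sequence matching ratio" is assumed to be `difflib.SequenceMatcher.ratio()`, stagnation is interpreted as "none of the last $N$ rounds improved on the round before them", and the `vuln_weight` in `score` is an arbitrary illustrative value.

```python
from difflib import SequenceMatcher

def is_success(pass_rate: float, blocking_vulns: int) -> bool:
    """All tests pass and no high/medium severity findings remain."""
    return pass_rate == 1.0 and blocking_vulns == 0

def is_stagnant(history: list[tuple[float, int]], n: int = 2) -> bool:
    """True if none of the last n rounds improved pass rate or vulnerability count.

    history holds one (pass_rate, vuln_count) pair per evaluated round.
    """
    if len(history) <= n:
        return False
    ref_rate, ref_vulns = history[-n - 1]
    return all(rate <= ref_rate and vulns >= ref_vulns for rate, vulns in history[-n:])

def is_oscillating(prev_feedback: str, feedback: str, tau: float = 0.9) -> bool:
    """True if two consecutive feedback texts are more similar than tau."""
    return SequenceMatcher(None, prev_feedback, feedback).ratio() > tau

def score(pass_rate: float, vuln_count: int, vuln_weight: float = 0.01) -> float:
    """Higher is better: reward passing tests, penalize remaining alerts."""
    return pass_rate - vuln_weight * vuln_count

# Example: pass rate and alert count stop moving after the second round.
history = [(0.50, 4), (0.75, 3), (0.75, 3), (0.75, 3)]
print(is_stagnant(history))
print(is_oscillating("test_transfer failed: revert", "test_transfer failed: revert"))
```

Comparing feedback text rather than code is a cheap way to notice that the agent keeps receiving the same complaints even when its edits differ from round to round.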

3. Metrics and Experimental Results

SolAgent is evaluated using the SolEval+ benchmark, comprising 81 file-level Solidity tasks and 1,188 hand-written Forge tests assessing key correctness and security properties.

Evaluation Metrics:
- Pass@k:

$\text{Pass@}k = \mathbb{E} \left[1 - \frac{C(n-c,\,k)}{C(n,\,k)} \right]$

where $n$ is the number of samples, $c$ the passing samples, and $C(\cdot,\cdot)$ denotes the binomial coefficient.

- Compile Rate:

$\text{Rate}_\text{compile} = \frac{1}{N} \sum_{i=1}^N \mathbb{1}[\text{compile}(s_i)]$

- Vulnerability Reduction:

$\Delta V\% = \frac{V_\text{base} - V_\text{SolAgent}}{V_\text{base}} \times 100$
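
The three metrics translate directly into a few lines of Python. The sketch below is illustrative: the function names are hypothetical, the expectation in Pass@k is taken as a plain mean over tasks, and the alert counts in the example are the ones quoted in the results of this section.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Pass@k for one task: 1 - C(n-c, k) / C(n, k)."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail, giving Pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)

def mean_pass_at_k(per_task: list[tuple[int, int]], k: int) -> float:
    """Average Pass@k over tasks, each given as (n samples, c passing)."""
    return sum(pass_at_k(n, c, k) for n, c in per_task) / len(per_task)

def compile_rate(compiled: list[bool]) -> float:
    """Fraction of generated samples that compile."""
    return sum(compiled) / len(compiled)

def vulnerability_reduction(v_base: int, v_solagent: int) -> float:
    """Percentage drop in alerts relative to the baseline."""
    return (v_base - v_solagent) / v_base * 100

print(pass_at_k(n=5, c=2, k=1))
print(compile_rate([True, True, False, True]))
print(round(vulnerability_reduction(293, 247), 2))
print(round(vulnerability_reduction(259, 156), 2))
```

With $k=1$ the estimator reduces to $c/n$, the fraction of samples that pass.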

Key experimental outcomes (Pass@1, Compile Rate):

Source / Model	CompileRate	Pass@1
Human Baseline Repo	100.00%	100.00%
Vanilla LLM (Claude)	39.51%	25.59%
GitHub Copilot	32.10%	10.02%
DeepCode	37.04%	13.55%
MetaGPT	35.80%	11.78%
Qwen-Agent	45.68%	28.37%
SolAgent (Claude)	95.06%	64.39%

SolAgent achieves a 127.1% relative Pass@1 improvement over the strongest baseline (Qwen-Agent: 28.37%). On the 77 files shared with the human baseline, SolAgent also produces 15.7% fewer static-analysis alerts (247 vs. 293). With a GPT-5-Mini base model, the largest reduction observed is 39.77% (259 to 156 alerts).

Variability in Pass@1 per successfully compiled file is also reported (e.g., SolAgent (Claude): $0.7795 \pm 0.2941$).

4. Knowledge Distillation Procedure

SolAgent's high-quality trajectories are used as distillation data to train smaller open-source models employing instruction-following and demonstration learning paradigms:

Trajectory Collection:
- Full-Context: Original requirements plus detailed comments, agent dialogue, and final artifact.
- Compressed-Context: Summarized requirements and new agent-generated dialogues.
Supervised Training Objective:

$L(\theta) = -\sum_i \log P_\theta(C^*_i~|~R,~C^*_{<i})$

Model Variants:
- Base: Qwen3-8B; Enhanced: Qwen3-32B
- Tracker variants (v1: forward truncation, v2: backward truncation, 4K tokens).
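
To make the supervised objective above concrete, the sketch below evaluates $L(\theta)$ for a single target sequence, given the probability the model assigns to each reference token $C^*_i$. The function name and the example probabilities are invented for illustration. In practice these probabilities would come from the student model conditioned on $R$ and the preceding tokens.

```python
import math

def sequence_nll(token_probs: list[float]) -> float:
    """Negative log-likelihood: -sum_i log P(C*_i | R, C*_<i)."""
    if any(not 0.0 < p <= 1.0 for p in token_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return -sum(math.log(p) for p in token_probs)

# Hypothetical per-token probabilities for a three-token target.
probs = [0.9, 0.5, 0.8]
loss = sequence_nll(probs)
print(loss)
```

The loss is zero only when every reference token receives probability 1, and a single poorly predicted token can dominate the sum.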

Distillation results on the held-out test set:

Model	CompileRate	Pass@1
Qwen3-8B (base)	5.88%	0.33%
Qwen3-32B	35.29%	1.31%
tracker-v2	17.65%	1.31%

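Tracker-v2 roughly quadruples Pass@1 over the base model (0.33% to 1.31%) and matches Qwen3-32B on Pass@1, although its compile rate (17.65%) remains below that of the 32B model (35.29%). This indicates that agent-generated distillation data is effective.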

5. Ablation Studies and Key Insights

Ablation analyses highlight the critical contributions of SolAgent’s core components:

Removal of Inner Loop (Forge): Pass@1 drops from 64.39% to approximately 26% (Claude).
Removal of Outer Loop (Slither): Increases vulnerabilities by 25–35% (worst-case min-vuln round).
Exclusion of File-System Capabilities: Pass@1 drops by approximately 7–22%, underscoring the importance of project context and cross-file reasoning.

These results suggest that both functional and security feedback, as well as file-level context, are essential to high-quality smart contract generation in agent-based frameworks.

6. Limitations and Future Extensions

SolAgent currently targets single-contract files. Ongoing and future directions include extension to cross-contract systems, deeper integration of formal verification via SMT solvers or proof assistants (e.g., Coq, LiquidHaskell), and transposition of the tool-augmented multi-agent paradigm to other safety-critical domains, such as automotive and aerospace software.

7. System Schematic and Workflow Overview

The SolAgent pipeline can be represented as:

R → Coding Agent → C₀ → Refining Loop
                          [Forge → test feedback]
                          [Slither → security feedback]
                          [FileSystem → context]
                          dynamic stopping & best-code tracking
                                ↓
                              C_best

This workflow underpins SolAgent’s approach to robust, secure, and scalable smart contract generation (Chen et al., 30 Jan 2026). The open-source release is available at https://github.com/openpaperz/SolAgent.

References

SolAgent: A Specialized Multi-Agent Framework for Solidity Code Generation (2026)
