
LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation (2507.12084v1)

Published 16 Jul 2025 in cs.SE and cs.CR

Abstract: Smart contracts play a pivotal role in blockchain ecosystems, and fuzzing remains an important approach to securing smart contracts. Even though mutation scheduling is a key factor influencing fuzzing effectiveness, existing fuzzers have primarily explored seed scheduling and generation, while mutation scheduling has been rarely addressed by prior work. In this work, we propose a LLMs-based Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates LLMs, evolutionary mutation strategies, and hybrid testing techniques. Key components of the proposed LLAMA include: (i) a hierarchical prompting strategy that guides LLMs to generate semantically valid initial seeds, coupled with a lightweight pre-fuzzing phase to select high-potential inputs; (ii) a multi-feedback optimization mechanism that simultaneously improves seed generation, seed selection, and mutation scheduling by leveraging runtime coverage and dependency feedback; and (iii) an evolutionary fuzzing engine that dynamically adjusts mutation operator probabilities based on effectiveness, while incorporating symbolic execution to escape stagnation and uncover deeper vulnerabilities. Our experiments demonstrate that LLAMA outperforms state-of-the-art fuzzers in both coverage and vulnerability detection. Specifically, it achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities across diverse categories. These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.


Summary

  • The paper presents an innovative fuzzing framework using LLMs for high-quality seed generation and enhanced smart contract vulnerability detection.
  • It employs a multi-feedback optimization strategy combining evolutionary algorithms with symbolic execution to maximize coverage.
  • Experimental results show 91% instruction coverage and detection of 132 vulnerabilities, demonstrating LLAMA's effectiveness over traditional fuzzers.

LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation

Introduction

LLAMA is a smart contract fuzzing framework that combines LLM-guided seed generation, evolutionary mutation strategies, and hybrid testing techniques to improve smart contract security in blockchain systems. Its core contribution is enhancing fuzz testing through high-quality initial seeds, adaptive feedback integration across the fuzzing loop, and efficient hybrid execution.

Framework Architecture

Figure 1: Architecture of the proposed LLAMA.

1. LLM-Based Initial Seed Generation

The LLM-Based Initial Seed Generation module employs a five-layer hierarchical prompting strategy to harness the semantic capabilities of LLMs for generating structurally and semantically valid initial seeds. The process begins with function abstraction, followed by transaction sequence inference, format verification, semantic optimization, and behavior-guided prompt injection. This multistage approach yields high-quality inputs capable of exercising deeper contract logic.

Figure 2: LLM-based seed generation in LLAMA.

2. Multi-Feedback Optimization Strategy

LLAMA's multi-feedback optimization strategy dynamically leverages multiple runtime feedback signals to improve seed generation, seed selection, and mutation scheduling. The feedback incorporates both control-flow and semantic insights, which drive exploration during smart contract fuzzing and maximize coverage and vulnerability detection.

Figure 3: Illustration of the evolutionary scheduling process for mutation operator selection in LLAMA.
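As a concrete illustration of combining feedback signals for seed selection, the sketch below ranks seeds by a weighted sum of a coverage signal and a dependency signal. The weights, field names, and helper structure are hypothetical assumptions for illustration, not details taken from the paper.

```python
def select_seeds(seeds, k, w_cov=0.6, w_dep=0.4):
    """Rank seeds by a weighted combination of two feedback signals.

    Hypothetical sketch: `cov` is a normalized new-coverage gain and
    `dep` is a normalized data-dependency score, both in [0, 1].
    """
    scored = [(w_cov * s["cov"] + w_dep * s["dep"], s) for s in seeds]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored[:k]]

seeds = [
    {"id": "a", "cov": 0.9, "dep": 0.1},
    {"id": "b", "cov": 0.2, "dep": 0.8},
    {"id": "c", "cov": 0.5, "dep": 0.5},
]
top = select_seeds(seeds, k=2)  # → seeds "a" and "c"
```

Weighting the two signals, rather than using coverage alone, lets seeds that unlock new state dependencies survive even when their raw coverage gain is modest.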

3. Hybrid Fuzzing Engine

The Hybrid Fuzzing Engine addresses path stagnation by selectively integrating symbolic execution with evolutionary search. The engine relies on genetic algorithms for efficient exploration, but triggers symbolic execution when coverage growth shows diminishing returns, allowing it to reach beyond shallow paths.
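A minimal sketch of such a stagnation trigger is shown below. The window size, threshold, and function name are illustrative assumptions; the paper does not specify these parameters.

```python
def hybrid_fuzz_step(coverage_history, window=5, min_gain=0.01):
    """Decide whether to keep fuzzing or fall back to symbolic execution.

    Hypothetical sketch: `coverage_history` holds cumulative coverage
    (fractions in [0, 1]) after each fuzzing round. If the total gain
    over the last `window` rounds drops below `min_gain`, coverage
    growth has stagnated and the engine switches to symbolic execution
    to solve the hard branch conditions blocking progress.
    """
    if len(coverage_history) < window:
        return "fuzz"  # not enough history yet; keep fuzzing
    recent_gain = coverage_history[-1] - coverage_history[-window]
    return "symbolic" if recent_gain < min_gain else "fuzz"
```

Keeping symbolic execution behind a threshold like this preserves the throughput of cheap evolutionary mutation and spends expensive constraint solving only where random mutation has stopped paying off.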

Implementation Details

Hierarchical Prompting Strategy

For the hierarchical prompting within LLM-based seed generation, the process involves breaking down tasks into actionable prompts:

def hierarchical_prompting(contract_code):
    # Layer 1: abstract each function's behavior from the contract source
    function_abstract = extract_function_behavior(contract_code)
    # Layer 2: infer plausible transaction sequences from the abstractions
    transaction_sequence = infer_transaction_sequences(function_abstract)
    # Layer 3: verify that the sequences are structurally well-formed
    verified_sequence = verify_format(transaction_sequence)
    # Layer 4: refine the sequences for semantic validity
    optimized_sequence = optimize_semantics(verified_sequence)
    # Layer 5: inject behavior-guided hints into the final prompt
    final_prompt = inject_behavior_guidance(optimized_sequence)
    return generate_seeds(final_prompt)

Feedback-Guided Mutation Strategy

The mutation strategy employs fitness functions using runtime feedback such as instruction coverage and branch coverage to iteratively refine mutation operator probabilities, embedded with randomness to encourage exploration of novel paths:

def feedback_guided_mutation(strategy, population):
    for seed in population:
        # Execute the seed and collect branch/instruction coverage feedback
        branches, instructions = execute_and_collect(seed)
        # Score the seed by the coverage it contributes
        fitness_score = compute_fitness(branches, instructions)
        # Reweight mutation operators toward those producing fitter seeds
        update_operator_probabilities(strategy, fitness_score)
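One plausible way to turn per-operator fitness statistics into selection probabilities, while keeping the randomness the text mentions, is an epsilon-floored normalization. The statistics layout and the `epsilon` floor below are illustrative assumptions, not the paper's exact scheme.

```python
def update_probabilities(op_stats, epsilon=0.05):
    """Convert per-operator fitness statistics into selection probabilities.

    Hypothetical sketch: each operator's average fitness per use is
    normalized into a probability; an `epsilon` floor guarantees every
    operator keeps a nonzero chance, preserving exploration.
    """
    avg = {op: s["fitness"] / max(s["uses"], 1) for op, s in op_stats.items()}
    total = sum(avg.values())
    n = len(avg)
    probs = {}
    for op, a in avg.items():
        base = a / total if total > 0 else 1.0 / n
        # Mix the fitness-proportional weight with a uniform floor
        probs[op] = (1 - epsilon * n) * base + epsilon
    return probs

p = update_probabilities({
    "bitflip": {"fitness": 8.0, "uses": 4},
    "splice":  {"fitness": 2.0, "uses": 4},
})  # → {"bitflip": 0.77, "splice": 0.23}
```

The floor term is what lets occasionally useful operators re-emerge later in the campaign instead of being starved once a rival operator dominates early.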

Experimental Results

Experiments confirm LLAMA's ability to outperform existing fuzzing tools, achieving 91% instruction coverage and 90% branch coverage while detecting 132 of 148 known vulnerabilities across diverse smart contract benchmarks.

Figure 4: Branch and instruction coverage comparison on small and large contracts.

LLAMA shows strong adaptability and efficacy, invoking symbolic execution selectively to explore deep logical paths while remaining more resource-efficient than traditional fuzzing and hybrid fuzzing strategies.

Figure 5: Overall coverage comparison.

Figure 6: Resource consumption comparison.

Conclusion

LLAMA establishes a new benchmark for smart contract fuzzing, offering a balanced, highly effective approach capable of addressing complex execution paths and revealing vulnerabilities with high accuracy. Its reliance on LLMs for semantically aware seed generation, coupled with a robust, feedback-driven process, marks a significant advancement in the application of AI-driven strategies for blockchain security testing. Future exploration will continue to refine LLAMA, focusing on enhancing mutation strategies and exploring deeper integrations of LLM capabilities for more sophisticated contract analysis.
