
OSS-Fuzz-Gen: AI-Powered Fuzz Driver Generation

Updated 3 October 2025
  • OSS-Fuzz-Gen is a multi-agent, AI-augmented system that synthesizes fuzz drivers using constraint-based and context-aware strategies.
  • It employs an LLM-driven Function Analyzer to embed precise usage constraints, raising the share of generated drivers that satisfy all constraints from 38.9% to 63.1%.
  • The system’s Crash Validation Agent flags 57–65% of the crashes it reviews as spurious, reducing false positives while code coverage stays nearly unchanged.

OSS-Fuzz-Gen is a multi-agent, AI-augmented system for automated fuzz driver generation at scale. It targets the key challenge faced by industry-scale fuzzing systems: the need to automatically synthesize high-fidelity fuzz drivers for thousands of functions, while minimizing the number of false positive crashes reported to developers. The system integrates proactive and reactive AI strategies, leveraging advanced LLM “agents” to both enforce constraints during driver synthesis and validate crash reports in their broader code context. Both strategies address the major impediment to trust and scalability in automated fuzzing—namely, the high volume of false positive bug reports caused by infeasible or invalid driver invocations—by using AI for program analysis, constraint extraction, and contextual triage (Amusuo et al., 2 Oct 2025).

1. Constraint-Based Fuzz Driver Generation

The constraint-based generation strategy injects symbolic and semantic usage information into the driver synthesis workflow to reduce spurious bug reports from invalid function usage. The process begins with an LLM-based Function Analyzer Agent that, provided with the candidate function's signature, implementation, and usage context, synthesizes a list of preconditions and invariants:

  • Data type requirements (e.g., “first argument must be a pointer to a valid CrxImage structure”)
  • Validity or range boundaries on parameters (e.g., “planeNumber < nPlanes”)
  • Stateful initialization/teardown (e.g., “call crxSetupImageData first”)
  • Structural relationships among parameters (e.g., “length field must match buffer size”)
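
One way to picture these four constraint categories is as small typed records that can be serialized into prompt text. The sketch below is illustrative only: the `Constraint` and `ConstraintKind` types, the `render_constraints` helper, and the placeholder name `target_function` are assumptions made for this example and are not taken from OSS-Fuzz-Gen. The constraint strings are the examples quoted above.

```python
from dataclasses import dataclass
from enum import Enum


class ConstraintKind(Enum):
    """Hypothetical labels for the four constraint categories listed above."""
    DATA_TYPE = "data_type"
    RANGE = "range"
    STATE = "state"
    STRUCTURAL = "structural"


@dataclass(frozen=True)
class Constraint:
    kind: ConstraintKind
    text: str


def render_constraints(function_name: str, constraints: list[Constraint]) -> str:
    """Render constraints as a bullet list that could be pasted into a prompt."""
    lines = [f"Usage constraints for {function_name}:"]
    for c in constraints:
        lines.append(f"- [{c.kind.value}] {c.text}")
    return "\n".join(lines)


# "target_function" is a placeholder; the source does not name the function.
constraints = [
    Constraint(ConstraintKind.DATA_TYPE,
               "first argument must be a pointer to a valid CrxImage structure"),
    Constraint(ConstraintKind.RANGE, "planeNumber < nPlanes"),
    Constraint(ConstraintKind.STATE, "call crxSetupImageData first"),
    Constraint(ConstraintKind.STRUCTURAL, "length field must match buffer size"),
]
prompt_section = render_constraints("target_function", constraints)
print(prompt_section)
```

Keeping the category alongside the text makes it easy to group or filter constraints before they are placed into a prompt.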

These constraints are formatted (preferably as structured YAML or bullet lists) and injected directly into the LLM prompt for the Fuzz Driver Generator Agent. As a result, the generated driver adheres more closely to real-world calling conventions and context, making it less likely to trigger spurious crashes that do not reflect real bugs. Formally, the benefit is captured by the Constraint Satisfaction Ratio (CSR):

$$\text{CSR} = \frac{\text{\# constraints satisfied by the generated driver}}{\text{\# constraints imposed}}$$

Empirically, the addition of constraint-based prompts raised the rate of drivers fully satisfying all constraints from 38.9% (unconstrained) to 63.1% (constrained).
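
The difference between a per-driver CSR and the share of drivers that satisfy every constraint can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the function names are hypothetical, the per-constraint pass/fail flags are invented toy data, and how satisfaction is judged for a real driver is not specified here.

```python
def constraint_satisfaction_ratio(satisfied: list[bool]) -> float:
    """CSR for one driver: satisfied constraints divided by imposed constraints."""
    if not satisfied:
        raise ValueError("at least one constraint must be imposed")
    return sum(satisfied) / len(satisfied)


def fully_satisfying_rate(drivers: list[list[bool]]) -> float:
    """Share of drivers whose CSR is exactly 1, i.e. every constraint holds."""
    if not drivers:
        raise ValueError("at least one driver is required")
    return sum(all(flags) for flags in drivers) / len(drivers)


# Toy data (assumed): one list of pass/fail flags per generated driver.
drivers = [
    [True, True, True, True],   # all four constraints satisfied
    [True, False, True, True],  # one violation
    [True, True],               # both satisfied
    [False, False, True],       # two violations
]

per_driver_csr = [constraint_satisfaction_ratio(flags) for flags in drivers]
full_rate = fully_satisfying_rate(drivers)

print(per_driver_csr)
print(full_rate)
```

The 38.9% and 63.1% figures correspond to the second quantity: a driver counts only when all of its constraints are satisfied.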

2. Context-Based Crash Validation

While constraint-based prompts prevent many invalid invocations, some crashes originate from code paths that are infeasible in documented real-world usage. To address this, OSS-Fuzz-Gen applies context-based crash validation as a post-fuzzing filter.

The Crash Validation Agent, again driven by an LLM, analyzes each crash as follows:

  1. Receives crash stacktrace, crash log, and root cause output (from standard crash analyzers).
  2. Identifies the crashing function and input pattern.
  3. Queries the code base to reconstruct candidate call chains and routine input-validation logic for the function.
  4. Compares the triggering conditions and actual function context against publicly accessible entry points and standard usage protocols.
  5. Flags the crash as “spurious” if—from all reconstructed callers—it is not possible to reach the crashing state due to mandatory validation, missing initialization, or constraint violations.
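
The decision rule in step 5 can be sketched as a check over the reconstructed caller paths. The code below is an assumption-laden illustration, not the agent's implementation: `CallerPath`, `Verdict`, `classify_crash`, and the entry point names are all hypothetical, and in the real system the per-path judgment comes from LLM analysis rather than a precomputed field. Treating a crash with no reconstructed callers as feasible is a conservative choice made for this sketch.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Verdict(Enum):
    FEASIBLE = "feasible"
    SPURIOUS = "spurious"


@dataclass(frozen=True)
class CallerPath:
    entry_point: str
    # Reason the crashing state cannot be reached on this path,
    # or None when no guard was found.
    blocked_by: Optional[str]


def classify_crash(paths: list[CallerPath]) -> tuple[Verdict, list[str]]:
    """Flag a crash as spurious only if every reconstructed caller blocks it."""
    if not paths:
        # Assumption: with no evidence, keep the crash for human review.
        return Verdict.FEASIBLE, ["no caller paths reconstructed"]
    unguarded = [p for p in paths if p.blocked_by is None]
    if unguarded:
        return Verdict.FEASIBLE, [
            f"{p.entry_point}: no guard prevents the crashing state"
            for p in unguarded
        ]
    return Verdict.SPURIOUS, [f"{p.entry_point}: {p.blocked_by}" for p in paths]


# Hypothetical entry points and guards, for illustration only.
paths = [
    CallerPath("public_decode_api", "length is checked against buffer size"),
    CallerPath("public_open_api", "returns early when initialization fails"),
]
verdict, rationale = classify_crash(paths)
print(verdict.value, rationale)
```

Returning the rationale alongside the verdict mirrors the requirement that each classification be justified with concrete evidence.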

The agent justifies its classification with code-level citations and logical rationale. In the studied benchmarks, between 57% and 65% of crashes initially flagged as bugs were later shown, by human-audited LLM analyses, to be spurious—i.e., unreachable from documented calling contexts.

3. Architecture and LLM Agents

OSS-Fuzz-Gen’s “multi-agent” architecture is designed to operate at industry scale, integrating intelligent program analysis and reasoning at multiple stages:

  • Function Analyzer Agent: Receives signatures, code, and context; outputs constraints for driver generation.
  • Fuzz Driver Generator Agent: Synthesizes drivers, guided by explicit constraint prompts.
  • Crash Validation Agent: Assesses feasibility of reported crashes by analyzing call context, validation paths, and usage constraints.

These agents work alongside deterministic code search and static analysis components to improve reliability and reduce the risk of hallucination. Code context is gathered through shell-based code querying, and the LLM performs the semantic analysis of what is retrieved.
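
The flow between the three agents can be sketched as a simple composition of callables. Everything here is a hypothetical stand-in: the type aliases, `run_pipeline`, and the stub functions are invented for illustration, and the fuzzing step is reduced to a function that returns crash identifiers. Real agents would wrap LLM calls and code search.

```python
from typing import Callable

# Each stage is modeled as a plain callable (an assumption of this sketch).
FunctionAnalyzer = Callable[[str], list[str]]      # source -> constraints
DriverGenerator = Callable[[str, list[str]], str]  # source, constraints -> driver
Fuzzer = Callable[[str], list[str]]                # driver -> crash reports
CrashValidator = Callable[[str], bool]             # crash -> True if feasible


def run_pipeline(
    source: str,
    analyze: FunctionAnalyzer,
    generate: DriverGenerator,
    fuzz: Fuzzer,
    validate: CrashValidator,
) -> list[str]:
    """Proactive stage (constraints) followed by reactive stage (validation)."""
    constraints = analyze(source)
    driver = generate(source, constraints)
    crashes = fuzz(driver)
    return [crash for crash in crashes if validate(crash)]


# Stub stages standing in for the LLM-backed agents and the fuzzer.
def stub_analyze(source: str) -> list[str]:
    return ["planeNumber < nPlanes"]


def stub_generate(source: str, constraints: list[str]) -> str:
    return "// driver honoring: " + "; ".join(constraints)


def stub_fuzz(driver: str) -> list[str]:
    return ["crash-A", "crash-B"]


def stub_validate(crash: str) -> bool:
    return crash == "crash-A"  # pretend only crash-A is reachable


reported = run_pipeline(
    "int f(int planeNumber);", stub_analyze, stub_generate, stub_fuzz, stub_validate
)
print(reported)
```

Passing the stages in as arguments keeps the orchestration deterministic and lets each agent be replaced or tested in isolation.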

4. Evaluation Results and Metrics

The evaluation leverages a dataset of 1,555 functions from OSS-Fuzz projects:

  • Crash Reduction: Applying constraint-based generation reduced the number of reported crashes from 5,080 to 4,835, with coverage remaining nearly unchanged (62.3% vs. 61.4%). The reduction is reported as ≈15%, although the two counts as stated differ by 245 crashes, or about 4.8%.
  • Constraint Satisfaction: Full satisfaction of all constraints in generated drivers increased from 38.9% (unconstrained) to 63.1% (constrained).
  • Crash Validation Impact: In the subset of “program error” crashes, the validation agent filtered out over half (57–65%) as spurious by analyzing call feasibility.
  • Cost and Practicality: Overheads for integrating these agents were modest (~9.3% per driver), which is acceptable in continuous fuzzing at scale.
  • Human Trust: The multi-stage filtering removes a large share of false positives (more than half of the reviewed program-error crashes), directly addressing industry concerns about trust and maintainability.
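
The headline figures can be sanity-checked by recomputing relative and percentage-point changes from the numbers stated above. The helper names below are hypothetical. Recomputing from the two crash counts as given yields roughly 4.8%, which does not match the ≈15% quoted above.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fractional drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before


def point_difference(a_pct: float, b_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return abs(a_pct - b_pct)


crash_reduction = relative_reduction(5080, 4835)
full_satisfaction_gain = point_difference(38.9, 63.1)
coverage_gap = point_difference(62.3, 61.4)

print(f"crash reduction: {crash_reduction:.1%}")
print(f"full-satisfaction gain: {full_satisfaction_gain:.1f} points")
print(f"coverage gap: {coverage_gap:.1f} points")
```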

5. Significance and Broader Implications

The composite AI-driven approach of OSS-Fuzz-Gen demonstrates that the integration of LLM-based agents into both proactive and reactive stages of fuzzing can:

  • Substantially reduce manual triage workloads in continuous fuzzing campaigns
  • Increase the proportion of actionable, high-confidence bug reports delivered to maintainers
  • Maintain overall code coverage while significantly reducing the noise of non-actionable findings

A further implication is the viability of LLMs as agentic program analysis experts. Unlike purely statistical or static tools, LLMs—when harnessed with proper prompting and context—are capable of extracting subtle invariants, modeling call context, and reasoning about feasibility in large C/C++ codebases.

6. Methodological Details

  • Algorithmic Summaries:
    • Constraint-Based Driver Generation:
      1. Analyze function signature/source/context.
      2. Derive constraint set $\mathcal{C} = \{c_1, \ldots, c_n\}$.
      3. Prompt driver generator with code + $\mathcal{C}$.
      4. Output: driver highly likely to satisfy real usage patterns.
    • Context-Based Crash Validation:
      1. Input: (stacktrace, log, root-cause) for each crash.
      2. Query call context and protocol constraints.
      3. Classify crash as infeasible (“false positive”) if input conditions cannot be met from legal program entrypoints.
  • Key Metrics:
    • $\text{CSR}$ (Constraint Satisfaction Ratio)
    • Number of crashes before/after constraint-based and context-based filtering
    • Code coverage before and after intervention

7. Limitations and Future Directions

The system’s efficacy relies on the quality of source-level annotations, function naming, and the LLM’s capacity to interpret C/C++ idioms robustly. The approach is not tightly coupled to OSS-Fuzz-Gen and can be adapted to other automated fuzzing and test generation contexts. Additional opportunities exist to refine the agents through domain-adapted prompting, corpus feedback, and richer integration with static and dynamic analysis primitives.


In summary, OSS-Fuzz-Gen’s AI-augmented approach to automated fuzz driver generation delivers measurable reductions in false positive bug reports without sacrificing code coverage, validating the role of agentic LLMs in large-scale, continuous software security testing. Both constraint-driven driver synthesis and context-aware crash validation are critical for high-trust deployment of automated fuzzing systems in open source and industry settings (Amusuo et al., 2 Oct 2025).
