
Syzkaller Specifications

Updated 28 May 2026
  • Syzkaller specifications are formally structured syzlang descriptions that define syscall syntax, semantics, arguments, and inter-call dependencies.
  • They enable automated generation and mutation of syscall sequences; KernelGPT infers new specifications through iterative LLM queries, improving both code coverage and bug discovery.
  • Empirical evaluations show that KernelGPT-generated specifications yield 21% higher line coverage than SyzDescribe on existing drivers and 28% more crashes in a 24-hour fuzzing run.

Syzkaller specifications are formally structured descriptions written in a custom domain-specific language, syzlang, which define the syntax, semantics, arguments, return types, and inter-call dependencies of system calls (syscalls) for the Syzkaller kernel fuzzer. These specifications enable automated generation and mutation of valid syscall sequences, allowing Syzkaller to test kernel code efficiently for correctness and security vulnerabilities. Recent research by Yang et al. introduced KernelGPT, a method that uses large language models (LLMs) for the automatic inference, validation, and integration of new syscall specifications, significantly improving both line coverage and bug-finding rates (Yang et al., 2023).

1. Structure and Semantics of syzlang

Syzkaller specifications use syzlang, a concise domain-specific language, to capture the formal interface of each syscall. The language supports the declaration of:

  • Syscall definitions (with optional instances), argument lists, return types, and parameter annotations.
  • Types: primitive integer/floating/enum types, pointers, arrays (with fixed or variable lengths), resources, structs, and type aliases.
  • Field and parameter annotations for argument directionality: in, out, and inout.
  • Explicit inter-call dependencies, especially for resource parameters (which must be produced by preceding syscalls).

The core BNF grammar is:

<Spec>        ::= <SyscallDef>*
<SyscallDef>  ::= “sys” <Name> [“$” <Inst>] “(” [<ParamList>] “)” [“->” <Type>] “{” <FieldSpecs> “}”
<ParamList>   ::= <Param> (“,” <Param>)*
<Param>       ::= <Type> <Identifier> [“=” <ConstExpr>] [“<” <Annot> “>”]
<Annot>       ::= “in” | “out” | “inout”
<FieldSpecs>  ::= (<Identifier> “.” <Identifier> “:” <Annot> [“[” <ConstExpr> “]”])*
<Type>        ::= “int” | “uint” | “long” | “flags” “[” <EnumName> “]”
                  | “ptr” “[” <Type> “]”
                  | “array” “[” <Type> [“,” <ConstExpr>] “]”
                  | “resource” “[” <ResName> “]”
                  | <StructName> “_struct”
                  | <Identifier>
<ConstExpr>   ::= INTEGER_LITERAL | <Identifier> | <Identifier> “+” <Integer> | …

Key semantic constraints enforce that:

  • resource arguments are data dependencies on earlier syscalls.
  • out/inout-annotated fields must correspond to kernel structs.
  • Arrays have fixed or explicitly declared variable lengths; the latter must reference a separate length parameter (e.g., len[devices]).
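
The first constraint above (a resource argument depends on an earlier call that produces it) can be illustrated with a small checker. This is a minimal sketch under stated assumptions, not Syzkaller's implementation: the `Call` model, the `unmet_dependencies` helper, the resource name `fd_dm`, and the call name `ioctl$DM_EXAMPLE` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    """One syscall in a candidate program (a simplified, hypothetical model)."""
    name: str
    consumes: list[str] = field(default_factory=list)  # resource kinds needed as input
    produces: list[str] = field(default_factory=list)  # resource kinds this call creates


def unmet_dependencies(program: list[Call]) -> list[tuple[int, str, str]]:
    """Return (index, call name, resource) for each resource used before it is produced."""
    available: set[str] = set()
    problems: list[tuple[int, str, str]] = []
    for index, call in enumerate(program):
        for resource in call.consumes:
            if resource not in available:
                problems.append((index, call.name, resource))
        # Resources become available only after the producing call has run.
        available.update(call.produces)
    return problems


# Illustrative names only: "fd_dm" and "ioctl$DM_EXAMPLE" are made up for this sketch.
open_call = Call("openat$dm_control", produces=["fd_dm"])
ioctl_call = Call("ioctl$DM_EXAMPLE", consumes=["fd_dm"])

valid_program = [open_call, ioctl_call]
invalid_program = [ioctl_call, open_call]

print(unmet_dependencies(valid_program))    # no unmet dependencies
print(unmet_dependencies(invalid_program))  # the ioctl uses fd_dm before it exists
```

A generator that respects the constraint would only emit programs for which this check returns an empty list.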

2. Automated Inference via KernelGPT

KernelGPT introduces an iterative LLM-driven workflow for automatic specification synthesis, consisting of three phases:

  1. Driver Detection: Locates device operation handlers (e.g., through LLVM-based pattern search for fops structs), extracts C struct definitions and usage sites, and prompts the LLM (with code context and concrete examples) to propose correct syzlang syscall instances (e.g., openat$dm_control("/dev/mapper/control")).
  2. Specification Generation: For each candidate driver and ioctl handler:
       • Command values, argument types, and needed struct/type definitions are inferred via recursive LLM queries on progressively broader code slices.
       • Algorithm 1 (presented below) formalizes this recursive process:

     Algorithm 1: Analyze(S, U, k)
       Input: related source code S, usage info U, iteration k
       Output: inferred spec fragment R, or ⊥ if failed
       if k > MaxIter then return ⊥
       Prompt ← BuildPrompt(S, U)
       (R, UNKNOWN) ← QueryLLM(Prompt)
       if UNKNOWN = ∅ then return R
       for each entry (f, t, u) ∈ UNKNOWN do
           S′ ← ExtractCode(f, t)
           R′ ← Analyze(S′, u, k + 1)
           R ← Update(R, R′)
       return R

     This process is separately instantiated for each of command value inference, argument type inference, and type definition synthesis.
  3. Specification Validation and Repair: Completed syzlang fragments are parsed by Syzkaller’s syz-extract tool to detect syntactic or reference errors. For each error, KernelGPT prompts the LLM with the erroneous code, error message, and relevant context to produce a repair.

Validated fragments are stored as .syz files in the syscall_descriptions directory.

3. Evaluation, Coverage, and Impact Metrics

KernelGPT’s quality is evaluated by its improvement in code coverage during fuzzing runs. The principal metric is the line coverage gain for a driver $d$:

$$\mathrm{Cov}_{\mathrm{LLM}}(d) = \#\,\text{lines covered by Syzkaller with LLM-generated specs}$$

$$\mathrm{Cov}_{\mathrm{base}}(d) = \#\,\text{lines covered by baseline specs}$$

$$\Delta_{\mathrm{cov}}(d) = \frac{\mathrm{Cov}_{\mathrm{LLM}}(d) - \mathrm{Cov}_{\mathrm{base}}(d)}{\mathrm{Cov}_{\mathrm{base}}(d)} \times 100\%$$

For $n$ drivers, overall coverage improvement is:

$$\Delta_{\mathrm{cov}}^{\mathrm{all}} = \frac{\sum_d \mathrm{Cov}_{\mathrm{LLM}}(d) - \sum_d \mathrm{Cov}_{\mathrm{base}}(d)}{\sum_d \mathrm{Cov}_{\mathrm{base}}(d)} \times 100\%$$
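
The per-driver and overall gain formulas above translate directly into a few lines of code. The sketch below is illustrative only: the function names and the two-driver data set are invented for the example and do not come from the evaluation.

```python
def coverage_gain(cov_llm: int, cov_base: int) -> float:
    """Relative line coverage gain in percent: (Cov_LLM - Cov_base) / Cov_base * 100."""
    if cov_base <= 0:
        raise ValueError("baseline coverage must be positive")
    return (cov_llm - cov_base) / cov_base * 100.0


def overall_gain(per_driver: dict[str, tuple[int, int]]) -> float:
    """Overall gain: sum the covered lines over all drivers first, then take the ratio."""
    total_llm = sum(llm for llm, _ in per_driver.values())
    total_base = sum(base for _, base in per_driver.values())
    return coverage_gain(total_llm, total_base)


# Made-up numbers: driver -> (lines covered with LLM specs, lines covered by baseline).
runs = {"driver_a": (1200, 1000), "driver_b": (330, 300)}

for name, (llm, base) in runs.items():
    print(f"{name}: {coverage_gain(llm, base):.1f}%")
print(f"overall: {overall_gain(runs):.1f}%")
```

Because the overall figure is a ratio of sums, drivers with larger baselines weigh more; it generally differs from the plain average of the per-driver gains.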

Aggregate results show that newly added specifications cover 6,668 unique lines (about 5% of Syzkaller’s baseline of 143,838 lines). Integrated fuzzing using 129 new call descriptions plus 3,912 existing ones finds 28% more crashes during a 24-hour run. On existing drivers, KernelGPT yields 21% higher line coverage than the prior art, SyzDescribe.

4. Specification Quality and Correction: Empirical Examples

The KernelGPT pipeline produces valid and executable syscall specifications at a substantially higher rate than earlier methods. Its validation and repair loop resolves common errors in initial LLM outputs, which often arise from misuse of syzlang-specific rules regarding struct fields or array lengths.

Correctness before and after KernelGPT repair:

Example | Before (incorrect/invalid) | After (KernelGPT repaired)
Device-mapper dm_ctl_ioctl struct | Incorrect nodename; incomplete/misannotated fields; fixed-length array misused | Accurate nodename, correct use of device fd, correct array and output annotation conventions
vfio_pci_hot_reset_info struct | Variable array lengths not permitted; invalid field references | Constant for fixed arrays, var-length arrays use separate len parameter compliant with syzlang

KernelGPT's iterative repair and validation achieve high post-repair validity and executability rates, as summarized in the experimental table:

#Drivers | #Generated | #Valid | #Valid after Repair | #Executable
50       | 39/50      | 24/39  | 32/39               | 17/32
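
The validation and repair loop described in this section can be sketched as follows. This is a toy under stated assumptions rather than KernelGPT's code: `toy_validate` stands in for the syz-extract check, `toy_repair` stands in for the LLM repair prompt, and the `max_rounds` bound is an assumption of this sketch that the source does not state.

```python
from typing import Callable, Optional

Validator = Callable[[str], list[str]]  # returns error messages; empty list means valid
Repairer = Callable[[str, str], str]    # (spec text, error message) -> revised spec text


def validate_and_repair(spec: str, validate: Validator, repair: Repairer,
                        max_rounds: int = 3) -> Optional[str]:
    """Validate a spec; feed each error back for repair, for a bounded number of rounds."""
    for _ in range(max_rounds):
        errors = validate(spec)
        if not errors:
            return spec
        for message in errors:
            # In KernelGPT this step prompts the LLM with the code, error, and context.
            spec = repair(spec, message)
    return spec if not validate(spec) else None  # None: still invalid after all rounds


def toy_validate(spec: str) -> list[str]:
    """Hypothetical stand-in for a real checker: flags one made-up error class."""
    return ["unknown type UNKNOWN_TYPE"] if "UNKNOWN_TYPE" in spec else []


def toy_repair(spec: str, message: str) -> str:
    """Hypothetical stand-in for the LLM: applies a fixed rewrite."""
    return spec.replace("UNKNOWN_TYPE", "int32")


print(validate_and_repair("field UNKNOWN_TYPE", toy_validate, toy_repair))
print(validate_and_repair("field UNKNOWN_TYPE", toy_validate, lambda s, m: s))
```

Bounding the number of rounds keeps a spec that the repairer cannot fix from looping forever; such specs are reported as failures, which matches the gap between generated and valid-after-repair counts in the table above.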

5. Integration with Syzkaller Fuzzing Infrastructure

Validated .syz specification files generated or repaired by KernelGPT are incorporated into the Syzkaller repository (sys/linux/ or architecture-specific directories). The integration machinery includes:

  • make extract triggers Syzkaller's syz-extract for grammar and reference checking.
  • syz-manager and syz-fuzzer ingest new syscall descriptions for routine corpus expansion, sampling, and mutation.
  • During fuzzing, all calls, whether built-in or generated, are treated homogeneously by the scheduler and mutator.
  • KernelGPT-generated specifications have been merged upstream into the official Syzkaller repository at the developers' request.

This integration pipeline plausibly provides a scalable, maintainable process for extending kernel model coverage as new drivers and syscalls appear.

6. Empirical Findings: Coverage, Bugs, and Comparative Effectiveness

Empirical evaluation demonstrates substantial gains in coverage and unique bug discovery:

  • The 129 calls generated for executable new drivers covered 90,365 lines in 8-hour fuzzing runs, with 6,668 unique coverage lines.
  • Comparison across ten “existing” drivers yields 21% higher coverage than SyzDescribe and 12% higher than Syzkaller's own legacy corpus.
  • KernelGPT-augmented Syzkaller found 24 new unique kernel bugs, of which 12 have been fixed and 11 have been assigned CVEs.

Handler             | #Calls | Lines covered | Unique lines covered
btrfs_control_ioctl | 4      | 2719          | 20
cec_ioctl           | 12     | 3643          | 402
...                 | ...    | ...           | ...
Total               | 129    | 90365         | 6668

Kernel bugs detected and attributed to the new specifications include memory allocation bugs and use-after-free errors, corroborating the significance of specification coverage.

7. Limitations and Prospects for Further Development

Identified limitations include:

  • LLM context window capacity: Complex handlers exceeding the GPT-4 window limit sometimes result in inference failures.
  • Elaborate, multi-stage driver initialization, in particular for network and USB subsystems, remains to be addressed in future iterations.
  • KernelGPT presently does not perform explicit candidate ranking, though coverage- or signature-guided pruning could increase throughput.
  • Unexplored dimensions include LLM-driven seed selection, on-the-fly syscall synthesis, cross-driver dependency inference, and handling of closed-source modules.

These limitations motivate ongoing research in integrating LLMs more deeply with program analysis, symbolic execution, and fuzzing seed/mutation strategies for improved automation and generality (Yang et al., 2023).
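
For reference, the recursive inference procedure from Section 2 (Algorithm 1), whose failures are among the limitations listed above, can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the `MAX_ITER` value, the toy code base, the stub `toy_llm` that replaces a real model, and the choice to skip sub-results that fail (the algorithm as stated does not say how Update treats a failed sub-result).

```python
from typing import Callable, Optional

MAX_ITER = 5  # assumed bound; the source does not give a value

Unknown = tuple[str, str, str]  # (function, type, usage info), mirroring (f, t, u)
LLMResult = tuple[dict[str, str], list[Unknown]]


def build_prompt(source: str, usage: str) -> str:
    return f"Usage: {usage}\nSource:\n{source}"


def analyze(source: str, usage: str, k: int,
            query_llm: Callable[[str], LLMResult],
            extract_code: Callable[[str, str], str]) -> Optional[dict[str, str]]:
    """Infer a spec fragment, recursing on whatever the model reports as unknown."""
    if k > MAX_ITER:
        return None  # corresponds to the failure value in Algorithm 1
    result, unknown = query_llm(build_prompt(source, usage))
    result = dict(result)
    for function, type_name, sub_usage in unknown:
        sub_source = extract_code(function, type_name)
        sub_result = analyze(sub_source, sub_usage, k + 1, query_llm, extract_code)
        if sub_result is not None:  # assumption: failed sub-queries are skipped
            result.update(sub_result)
    return result


# Toy stand-ins: a one-entry "code base" and a deterministic fake model.
TOY_CODE = {("handler", "struct foo"): "struct foo { int x; };"}


def toy_extract(function: str, type_name: str) -> str:
    return TOY_CODE.get((function, type_name), "")


def toy_llm(prompt: str) -> LLMResult:
    if "struct foo {" in prompt:
        return {"foo": "struct with one int field"}, []
    return {"ioctl$TOY": "argument is ptr to foo"}, [("handler", "struct foo", "ioctl argument")]


spec = analyze("long handler(unsigned cmd, unsigned long arg);", "ioctl handler",
               0, toy_llm, toy_extract)
print(spec)
```

Each recursive call sees only the code slice extracted for one unknown, which keeps individual prompts small at the cost of more queries.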
