
KernelGPT: Enhanced Kernel Fuzzing via Large Language Models (2401.00563v3)

Published 31 Dec 2023 in cs.CR, cs.AI, and cs.SE

Abstract: Bugs in operating system kernels can affect billions of devices and users all over the world. As a result, a large body of research has been focused on kernel fuzzing, i.e., automatically generating syscall (system call) sequences to detect potential kernel bugs or vulnerabilities. Kernel fuzzing aims to generate valid syscall sequences guided by syscall specifications that define both the syntax and semantics of syscalls. While there has been existing work trying to automate syscall specification generation, this remains largely manual work, and a large number of important syscalls are still uncovered. In this paper, we propose KernelGPT, the first approach to automatically synthesizing syscall specifications via LLMs for enhanced kernel fuzzing. Our key insight is that LLMs have seen massive kernel code, documentation, and use cases during pre-training, and thus can automatically distill the necessary information for making valid syscalls. More specifically, KernelGPT leverages an iterative approach to automatically infer the specifications, and further debug and repair them based on the validation feedback. Our results demonstrate that KernelGPT can generate more new and valid specifications and achieve higher coverage than state-of-the-art techniques. So far, by using newly generated specifications, KernelGPT has already detected 24 new unique bugs in Linux kernel, with 12 fixed and 11 assigned with CVE numbers. Moreover, a number of specifications generated by KernelGPT have already been merged into the kernel fuzzer Syzkaller, following the request from its development team.


Summary

  • The paper introduces KernelGPT, which uses LLMs to automate syscall specification generation for kernel fuzzing.
  • It employs an iterative pipeline of driver detection, specification generation, and validation-driven repair, producing 129 new syscall descriptions and 21.3% higher line coverage.
  • KernelGPT outperforms manual and prior automated methods, uncovering previously unreported bugs and strengthening Linux kernel security testing.

KernelGPT: Advancing Kernel Fuzzing with LLMs

The paper "KernelGPT: Enhanced Kernel Fuzzing via LLMs" presents a pioneering approach to improving kernel fuzzing by leveraging the capabilities of LLMs. Specifically, it introduces KernelGPT, a novel methodology for automatically inferring Syzkaller specifications using LLMs. This addresses a key limitation of current kernel fuzzing: writing syscall specifications remains largely a manual process, one that is both labor-intensive and error-prone given the constantly evolving nature of kernel codebases.

Overview of Kernel Fuzzing and Syzkaller

Kernel fuzzing is an essential technique for uncovering potential bugs in operating system kernels, which are foundational to system stability and security. Syzkaller has emerged as one of the most effective kernel fuzzers, using a domain-specific language called syzlang to describe the syntax and semantics of syscalls. Despite its efficacy, the generation of Syzkaller specifications is largely manual and leaves many syscalls uncovered, notably in complex areas like device drivers.
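To make the specification format concrete, the following is a sketch of what a syzlang description might look like for a hypothetical character device (the device path, command name, and struct are invented for illustration; the syntax follows Syzkaller's syscall description format):

```
# Hypothetical syzlang description for an imaginary /dev/foo driver.
resource fd_foo[fd]

openat$foo(fd const[AT_FDCWD], file ptr[in, string["/dev/foo"]], flags flags[open_flags], mode const[0]) fd_foo
ioctl$FOO_SET_CFG(fd fd_foo, cmd const[FOO_SET_CFG], arg ptr[in, foo_config])

foo_config {
	size	len[data, int32]
	data	array[int8]
}
```

Descriptions like this tell the fuzzer which device file to open, which ioctl commands exist, and how their argument structures are laid out, so that generated syscall sequences pass the driver's input validation and reach deeper code paths.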

Contributions of KernelGPT

KernelGPT distinguishes itself as the first approach to harness LLMs for automating the generation of syscall specifications. Its key insight is that LLMs have absorbed massive amounts of kernel code, documentation, and usage examples during pre-training, and can therefore distill the information needed to construct accurate specifications. The pipeline is broken down into three stages: driver detection, specification generation, and specification validation and repair.

  1. Driver Detection: Using LLMs to infer device names and derive initialization descriptions from device operation handlers, guided by code references.
  2. Specification Generation: An iterative process where LLMs analyze related source code to deduce command values and argument types for ioctl handlers. The process is segmented into stages to allow LLMs to focus on discrete subtasks, thus improving type analysis and synthesis capabilities.
  3. Specification Validation and Repair: This phase involves using validation feedback to identify and rectify errors, ensuring the accuracy of the generated specifications.
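The iterative generate-validate-repair loop at the core of the pipeline can be sketched as follows. This is a minimal illustration with stubbed functions, not the paper's implementation: the real system prompts an LLM (e.g., GPT-4) and validates specifications with Syzkaller's tooling, while here both are replaced by placeholders.

```python
def query_llm(prompt):
    """Stub standing in for a real chat-completion API call."""
    # A repair prompt (containing an error message) yields a "fixed" spec;
    # an initial prompt yields a first draft.
    return "spec-v2" if "error" in prompt else "spec-v1"

def validate(spec):
    """Stub standing in for compiling the spec with Syzkaller's toolchain.

    Returns an error message on failure, or None if the spec is valid.
    """
    return None if spec == "spec-v2" else "unknown type in spec"

def generate_spec(handler_source, max_rounds=3):
    """Draft a spec from driver source, then repair it using validation feedback."""
    spec = query_llm(f"Derive a syzlang spec for:\n{handler_source}")
    for _ in range(max_rounds):
        error = validate(spec)
        if error is None:
            return spec  # specification compiles cleanly
        # Feed the validation error back to the model for repair.
        spec = query_llm(f"Fix this error: {error}\nSpec:\n{spec}")
    return None  # give up after max_rounds repair attempts
```

In this toy run, the first draft fails validation, the error is fed back to the model, and the repaired second draft passes. The design choice mirrored here is that validation feedback, rather than a single-shot prompt, drives the model toward specifications that actually compile.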

Empirical Findings

The authors conducted a comprehensive evaluation of KernelGPT on the Linux kernel, version 6.7. The tool was able to generate valid and executable specifications for various undescribed drivers, yielding an additional 129 syscall descriptions and achieving 21.3% more line coverage in fuzzing tests compared to baseline methods.

KernelGPT was also able to reveal eight crashes in previously undescribed drivers, seven of which corresponded to previously unreported bugs. These findings indicate that KernelGPT's automatically generated specifications exercise code paths and expose bugs that manually written specifications miss.

Comparative Analysis and Implications

When compared with contemporary methods like SyzDescribe and existing Syzkaller specifications, KernelGPT exhibited superior performance in coverage metrics and type analysis for the selected drivers. The specifications produced by KernelGPT contributed to both higher coverage numbers and effective bug identification, underscoring the importance and potential impact of automating specification generation through LLMs.

Future Directions

KernelGPT's approach opens promising avenues for further research in the integration of LLMs with kernel fuzzing techniques. Future work could explore more intricate and diverse driver settings, as well as the adoption of KernelGPT in generating specifications directly from binary codebases. Additionally, there is potential for expanding the application's scope to incorporate LLM-generated seeds and mutations, enhancing the fuzzing process's depth and breadth.

In conclusion, this paper substantiates the feasibility and effectiveness of integrating LLMs into the kernel fuzzing domain, transforming a traditionally manual task into a more automated and efficient process. This advancement holds substantial promise for improving system security and reliability through heightened bug detection capabilities.
