Scheduzz: Constraint-based Fuzz Driver Generation with Dual Scheduling (2507.18289v1)
Abstract: Fuzzing a library requires experts to understand the library usage well and craft high-quality fuzz drivers, which is tricky and tedious. Therefore, many techniques have been proposed to automatically generate fuzz drivers. However, they fail to generate rational fuzz drivers due to the lack of adherence to proper library usage conventions, such as ensuring a resource is closed after being opened. To make things worse, existing library fuzzing techniques unconditionally execute each driver, resulting in numerous irrational drivers that waste computational resources while contributing little coverage and generating false positive bug reports. To tackle these challenges, we propose a novel automatic library fuzzing technique, Scheduzz, an LLM-based library fuzzing technique. It leverages LLMs to understand rational usage of libraries and extract API combination constraints. To optimize computational resource utilization, a dual scheduling framework is implemented to efficiently manage API combinations and fuzz drivers. The framework models driver generation and the corresponding fuzzing campaign as an online optimization problem. Within the scheduling loop, multiple API combinations are selected to generate fuzz drivers, while simultaneously, various optimized fuzz drivers are scheduled for execution or suspension. We implemented Scheduzz and evaluated it in 33 real-world libraries. Compared to baseline approaches, Scheduzz significantly reduces computational overhead and outperforms UTopia on 16 out of 21 libraries. It achieves 1.62x, 1.50x, and 1.89x higher overall coverage than the state-of-the-art techniques CKGFuzzer, Promptfuzz, and the handcrafted project OSS-Fuzz, respectively. In addition, Scheduzz discovered 33 previously unknown bugs in these well-tested libraries, 3 of which have been assigned CVEs.