FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing Coverage (1709.07101v1)

Published 20 Sep 2017 in cs.SE and cs.CR

Abstract: In recent years, fuzz testing has proven itself to be one of the most effective techniques for finding correctness bugs and security vulnerabilities in practice. One particular fuzz testing tool, American Fuzzy Lop or AFL, has become popular thanks to its ease-of-use and bug-finding power. However, AFL remains limited in the depth of program coverage it achieves, in particular because it does not consider which parts of program inputs should not be mutated in order to maintain deep program coverage. We propose an approach, FairFuzz, that helps alleviate this limitation in two key steps. First, FairFuzz automatically prioritizes inputs exercising rare parts of the program under test. Second, it automatically adjusts the mutation of inputs so that the mutated inputs are more likely to exercise these same rare parts of the program. We conduct evaluation on real-world programs against state-of-the-art versions of AFL, thoroughly repeating experiments to get good measures of variability. We find that on certain benchmarks FairFuzz shows significant coverage increases after 24 hours compared to state-of-the-art versions of AFL, while on others it achieves high program coverage at a significantly faster rate.

Authors (2)

Caroline Lemieux
Koushik Sen

Citations (362)


Summary

The paper introduces FairFuzz, a greybox fuzzing technique that prioritizes inputs exercising rare branches in order to increase coverage.
It computes a branch mask for each selected input and uses it to restrict mutations, so that mutated inputs are more likely to keep hitting the targeted rare branch.
Empirical results on nine benchmarks show FairFuzz outperforming the compared AFL variants on several programs, particularly those with highly structured inputs.

An In-depth Analysis of FairFuzz: Improving Greybox Fuzz Testing by Targeting Rare Branches

The paper "FairFuzz: Targeting Rare Branches to Rapidly Increase Greybox Fuzz Testing Coverage" by Caroline Lemieux and Koushik Sen studies how to improve greybox fuzz testing by directing effort toward rarely executed branches. The widely used fuzz testing tool American Fuzzy Lop (AFL) is known for its ease of use and its effectiveness at finding bugs and security vulnerabilities. However, AFL is limited in the depth of program coverage it achieves, in particular because it mutates program inputs without considering which parts must stay unchanged for the input to keep reaching deep code. FairFuzz addresses this limitation by prioritizing inputs that reach rare branches and by restricting mutations so that the mutated inputs are more likely to reach those same branches.

Methodological Enhancements in FairFuzz

FairFuzz modifies the traditional AFL workflow through two pivotal enhancements:

Rare Branch Identification: FairFuzz automatically identifies and prioritizes inputs that exercise infrequently covered parts of a program. A branch counts as rare if the number of inputs that have hit it is at most a dynamically chosen cutoff: the smallest power of two that is at least the hit count of the least-hit branch. For example, if the least-hit branch has been reached by 3 inputs, the cutoff is 4, and every branch reached by at most 4 inputs is rare.
Targeted Input Mutation: FairFuzz adjusts mutations to preserve the parts of an input that are critical to reaching a rare branch. For each selected input and its target rare branch, it computes a "branch mask" that records the positions at which the input can be mutated (in the paper, by overwriting, deleting, or inserting bytes) while still hitting the target branch. The subsequent deterministic and havoc mutation stages consult this mask, which raises the probability that the resulting inputs still exercise the rare branch.
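
The two steps above can be illustrated with a minimal Python sketch. It is not the FairFuzz implementation, which is built into AFL's C code base. Everything here is an assumption made for illustration: branches are represented as strings, `toy_program` stands in for an instrumented target, and the mask covers only byte overwrites (the paper's mask also covers insertions and deletions).

```python
import random
from typing import Callable, Dict, List, Set

Coverage = Set[str]  # branch names hit by one execution (hypothetical representation)


def rarity_cutoff(hit_counts: Dict[str, int]) -> int:
    """Smallest power of two that is >= the hit count of the least-hit branch.

    Assumes at least one branch has been hit by some input.
    """
    fewest = min(count for count in hit_counts.values() if count > 0)
    cutoff = 1
    while cutoff < fewest:
        cutoff *= 2
    return cutoff


def rare_branches(hit_counts: Dict[str, int]) -> Set[str]:
    """Branches hit by at least one input but by no more inputs than the cutoff."""
    cutoff = rarity_cutoff(hit_counts)
    return {b for b, count in hit_counts.items() if 0 < count <= cutoff}


def overwrite_mask(data: bytes, target: str,
                   run: Callable[[bytes], Coverage]) -> List[bool]:
    """mask[i] is True if flipping byte i still yields an input that hits target."""
    mask = []
    for i in range(len(data)):
        mutated = bytearray(data)
        mutated[i] ^= 0xFF  # flip every bit of byte i
        mask.append(target in run(bytes(mutated)))
    return mask


def mutate_with_mask(data: bytes, mask: List[bool], rng: random.Random) -> bytes:
    """Overwrite one random byte, chosen only among positions the mask allows."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        return data  # no position can change safely; keep the input as is
    out = bytearray(data)
    out[rng.choice(allowed)] = rng.randrange(256)
    return bytes(out)


def toy_program(data: bytes) -> Coverage:
    """Hypothetical stand-in for the instrumented program under test."""
    covered = {"entry"}
    if data[:4] == b"<!AT":
        covered.add("attlist")
    return covered
```

With the seed `b"<!ATxyz"` and target branch `"attlist"`, `overwrite_mask` marks the first four bytes as fixed and the last three as free, so `mutate_with_mask` only ever rewrites the trailing bytes and every mutant still reaches the target branch in `toy_program`.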

Empirical Evaluation and Findings

The authors evaluate FairFuzz on nine benchmarks, comparing it against AFL and two variants, AFLFast and FidgetyAFL. The results indicate that FairFuzz generally reaches higher coverage more quickly, especially on programs with deeply nested conditional logic, where reaching new code depends on keeping earlier conditions satisfied. Notably, FairFuzz outperformed the other techniques on benchmarks with highly structured inputs, such as XML parsing.

The paper also stresses statistical robustness: each experiment is repeated 20 times to account for AFL's inherent non-determinism, and the results are reported with confidence intervals. Because coverage can vary considerably from run to run, this kind of repetition is needed before one fuzzing strategy can be said to outperform another.
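
As a rough illustration of this kind of analysis, the sketch below computes a mean and a 95% confidence interval over repeated runs. It assumes a Student's t interval and hardcodes the two-sided 95% critical value for 19 degrees of freedom (20 runs); the function name and the sample values in the usage note are placeholders, not figures from the paper.

```python
import math
import statistics
from typing import Sequence, Tuple

# Two-sided 95% critical value of Student's t with 19 degrees of freedom,
# i.e. for 20 independent runs. Use a different value for other run counts.
T_CRIT_95_DF19 = 2.093


def mean_with_ci(samples: Sequence[float],
                 t_crit: float = T_CRIT_95_DF19) -> Tuple[float, float, float]:
    """Return (lower bound, mean, upper bound) of the confidence interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Standard error of the mean, using the sample standard deviation.
    half_width = t_crit * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean, mean + half_width
```

For instance, `mean_with_ci([100.0] * 10 + [110.0] * 10)` returns an interval centered on 105. Two techniques whose intervals do not overlap at a given time point can be distinguished with more confidence than two single runs could support.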

Implications and Future Directions

FairFuzz's strategy of concentrating on rare branches has implications for both the theory and the practice of fuzz testing:

Theoretical Contributions: By linking a focus on rare branches to measurable improvements in code coverage, the paper shows how a greybox fuzzer can use lightweight coverage feedback to decide both which inputs to mutate and where to mutate them. This is a more targeted way to explore program paths that would otherwise remain statistically underexplored.
Practical Applications: The findings suggest that tools adopting FairFuzz's methods could find bugs more efficiently, which would strengthen automated testing in secure software engineering.

Furthermore, the methodology has potential extensions beyond rare branch targeting. The branch mask only requires some target branch, so the same machinery could be aimed at branches within specific code sections, such as newly added features or historically error-prone modules.
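
One way to picture this extension is to swap the rarity-based target selection for a location-based one. The sketch below is hypothetical: it assumes each branch can be mapped back to a source file and line, which AFL's hashed branch identifiers do not provide directly, and the `Branch` type and `targets_in_files` function are invented for illustration.

```python
from typing import Dict, NamedTuple, Set


class Branch(NamedTuple):
    """Hypothetical branch identity that carries its source location."""
    file: str
    line: int


def targets_in_files(hit_counts: Dict[Branch, int], files: Set[str]) -> Set[Branch]:
    """Select already-reached branches that lie in the files of interest."""
    return {
        branch
        for branch, count in hit_counts.items()
        if count > 0 and branch.file in files
    }
```

Inputs that hit any branch returned by `targets_in_files` would then be prioritized and mutated under a branch mask, in the same way as inputs that hit rare branches.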

Conclusion

FairFuzz presents an effective enhancement to existing greybox fuzzers by specifically targeting rare branches, thereby achieving more comprehensive and rapid program coverage. Its robust methodology and empirical validation suggest that this approach is not only viable but also advantageous for modern software testing frameworks. Future fuzzing tools stand to benefit greatly from integrating such strategies, tailoring testing processes to be more precise and effective in uncovering hidden vulnerabilities.
