Revisiting Neural Program Smoothing for Fuzzing (2309.16618v1)
Abstract: Testing with randomly generated inputs (fuzzing) has gained significant traction due to its capacity to expose program vulnerabilities automatically. Fuzz testing campaigns generate large amounts of data, making them ideal for the application of machine learning (ML). Neural program smoothing (NPS), a specific family of ML-guided fuzzers, aims to use a neural network as a smooth approximation of the target program for new test case generation. In this paper, we conduct the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers (>11 CPU years and >5.5 GPU years), and make the following contributions: (1) We find that the original performance claims for NPS fuzzers do not hold, a gap we relate to fundamental, implementation, and experimental limitations of prior works. (2) We contribute the first in-depth analysis of the contribution of machine learning and gradient-based mutations in NPS. (3) We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance, but that standard gray-box fuzzers almost always surpass NPS-based fuzzers. (4) As a consequence, we propose new guidelines targeted at benchmarking fuzzing based on machine learning, and present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers. Neuzz++, MLFuzz, and all our data are public.
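The idea behind neural program smoothing and gradient-based mutations can be illustrated with a toy sketch. This is not the Neuzz or Neuzz++ implementation: `toy_target`, `train_surrogate`, `input_gradient`, and `gradient_mutations` are hypothetical names introduced here, the "program" is a three-branch stand-in, and a per-edge logistic unit is swapped in for the multi-layer neural network that NPS fuzzers train on real coverage bitmaps. The sketch only shows the general loop: fit a differentiable surrogate from input bytes to covered edges, then use its gradient to choose which bytes to mutate and in which direction.

```python
import math
import random


def toy_target(data):
    """Hypothetical program under test; returns a 3-edge coverage bitmap."""
    return [
        1 if data[0] > 128 else 0,
        1 if data[2] < 64 else 0,
        1 if data[0] > 128 and data[3] > 200 else 0,
    ]


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def train_surrogate(corpus, epochs=50, lr=0.5):
    """Fit one logistic unit per edge on normalised bytes (the smooth surrogate)."""
    n_in = len(corpus[0])
    samples = [([v / 255 for v in s], toy_target(s)) for s in corpus]
    n_out = len(samples[0][1])
    w = [[0.0] * n_in for _ in range(n_out)]
    b = [0.0] * n_out
    for _ in range(epochs):
        for x, y in samples:
            for e in range(n_out):
                p = sigmoid(sum(wi * xi for wi, xi in zip(w[e], x)) + b[e])
                err = p - y[e]  # gradient of the log loss w.r.t. the logit
                for i in range(n_in):
                    w[e][i] -= lr * err * x[i]
                b[e] -= lr * err
    return w, b


def input_gradient(w, b, seed, edge):
    """Gradient of the surrogate's output for `edge` w.r.t. each input byte."""
    x = [v / 255 for v in seed]
    p = sigmoid(sum(wi * xi for wi, xi in zip(w[edge], x)) + b[edge])
    return [p * (1.0 - p) * wi for wi in w[edge]]


def gradient_mutations(seed, grad, top_k=1, steps=(32, 64, 128, 255)):
    """Move the top_k highest-|gradient| bytes along the gradient sign."""
    ranked = sorted(range(len(seed)), key=lambda i: abs(grad[i]), reverse=True)
    mutants = []
    for step in steps:
        m = bytearray(seed)
        for i in ranked[:top_k]:
            delta = step if grad[i] > 0 else -step
            m[i] = max(0, min(255, m[i] + delta))
        mutants.append(bytes(m))
    return mutants


if __name__ == "__main__":
    rng = random.Random(0)
    corpus = [bytes(rng.randrange(256) for _ in range(4)) for _ in range(200)]
    w, b = train_surrogate(corpus)
    seed = bytes([10, 100, 100, 100])  # does not cover edge 0
    grad = input_gradient(w, b, seed, edge=0)
    for m in gradient_mutations(seed, grad):
        print(list(m), toy_target(m))
```

In this toy setting the surrogate is trivially accurate because each edge depends on one or two bytes; real targets have sparse, highly discontinuous coverage, which is the kind of fundamental limitation the abstract refers to.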
Authors: Maria-Irina Nicolae, Max Eisele, Andreas Zeller