
Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing (2403.03897v1)

Published 6 Mar 2024 in cs.SE and cs.CR

Abstract: BusyBox, an open-source software bundling over 300 essential Linux commands into a single executable, is ubiquitous in Linux-based embedded devices. Vulnerabilities in BusyBox can have far-reaching consequences, affecting a wide array of devices. This research, driven by the extensive use of BusyBox, delved into its analysis. The study revealed the prevalence of older BusyBox versions in real-world embedded products, prompting us to conduct fuzz testing on BusyBox. Fuzzing, a pivotal software testing method, aims to induce crashes that are subsequently scrutinized to uncover vulnerabilities. Within this study, we introduce two techniques to fortify software testing. The first technique enhances fuzzing by leveraging large language models (LLMs) to generate target-specific initial seeds. Our study showed a substantial increase in crashes when using LLM-generated initial seeds, highlighting the potential of LLMs to efficiently tackle the typically labor-intensive task of generating target-specific initial seeds. The second technique involves repurposing previously acquired crash data from similar fuzzed targets before initiating fuzzing on a new target. This approach streamlines the time-consuming fuzz testing process by providing crash data directly to the new target before commencing fuzzing. We successfully identified crashes in the latest BusyBox target without conducting traditional fuzzing, emphasizing the effectiveness of LLM and crash reuse techniques in enhancing software testing and improving vulnerability detection in embedded systems. Additionally, manual triaging was performed to identify the nature of crashes in the latest BusyBox.
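
The crash reuse idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' tooling: it assumes the target reads its input from standard input (the abstract does not say how inputs are delivered), and the function name `replay_crashes`, the directory layout, and the timeout are hypothetical choices made for illustration. It replays crash inputs saved from an earlier target against a new target and reports which ones still end in a fatal signal.

```python
import subprocess
from pathlib import Path


def replay_crashes(target_cmd, crash_dir, timeout=5.0):
    """Feed each saved crash input to target_cmd on stdin.

    target_cmd is an argument list for the new target (hypothetical).
    crash_dir holds crash inputs collected from a previously fuzzed target.
    Returns a list of (file name, signal number) for inputs that
    terminate the new target with a signal.
    """
    reproduced = []
    for crash_file in sorted(Path(crash_dir).iterdir()):
        if not crash_file.is_file():
            continue
        try:
            proc = subprocess.run(
                target_cmd,
                input=crash_file.read_bytes(),
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            # Hangs are skipped rather than counted as crashes in this sketch.
            continue
        # On POSIX, a negative return code means the process was
        # killed by signal number -returncode.
        if proc.returncode < 0:
            reproduced.append((crash_file.name, -proc.returncode))
    return reproduced
```

Inputs reported by such a replay would still need manual triage, as the abstract notes, since a reproduced signal says nothing by itself about the root cause.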
