
When Fuzzing Meets LLMs: Challenges and Opportunities (2404.16297v1)

Published 25 Apr 2024 in cs.SE and cs.AI

Abstract: Fuzzing, a widely used technique for bug detection, has seen advancements through LLMs. Despite their potential, LLMs face specific challenges in fuzzing. In this paper, we identify five major challenges of LLM-assisted fuzzing. To support our findings, we revisited the most recent papers from top-tier conferences, confirming that these challenges are widespread. As a remedy, we propose actionable recommendations to improve the application of LLMs in fuzzing and conduct preliminary evaluations on DBMS fuzzing. The results demonstrate that our recommendations effectively address the identified challenges.
