
LLM Agents can Autonomously Exploit One-day Vulnerabilities (2404.08144v2)

Published 11 Apr 2024 in cs.CR and cs.AI

Abstract: LLMs have become increasingly powerful, both in their benign and malicious uses. With the increase in capabilities, researchers have been increasingly interested in their ability to exploit cybersecurity vulnerabilities. In particular, recent work has conducted preliminary studies on the ability of LLM agents to autonomously hack websites. However, these studies are limited to simple vulnerabilities. In this work, we show that LLM agents can autonomously exploit one-day vulnerabilities in real-world systems. To show this, we collected a dataset of 15 one-day vulnerabilities that include ones categorized as critical severity in the CVE description. When given the CVE description, GPT-4 is capable of exploiting 87% of these vulnerabilities compared to 0% for every other model we test (GPT-3.5, open-source LLMs) and open-source vulnerability scanners (ZAP and Metasploit). Fortunately, our GPT-4 agent requires the CVE description for high performance: without the description, GPT-4 can exploit only 7% of the vulnerabilities. Our findings raise questions around the widespread deployment of highly capable LLM agents.

The paper "LLM Agents can Autonomously Exploit One-day Vulnerabilities" by Richard Fang et al. investigates the capability of LLM agents to autonomously exploit cybersecurity vulnerabilities, particularly one-day vulnerabilities in real-world systems. Here is a detailed synthesis of the paper:

Abstract and Objectives

The authors explore the potential of LLM agents, with a focus on GPT-4, to exploit one-day vulnerabilities: vulnerabilities that have been disclosed but not yet patched. They collected a dataset of 15 such real-world vulnerabilities to test the hypothesis that LLM agents, particularly those built on GPT-4, can exploit them autonomously at a significant success rate.

Key Findings

  1. Exploitation Success: GPT-4 was able to exploit 87% of the tested one-day vulnerabilities when provided with the CVE description. In contrast, other models like GPT-3.5 and several open-source LLMs achieved a 0% success rate, as did open-source vulnerability scanners such as ZAP and Metasploit.
  2. Importance of CVE Descriptions: The presence of CVE descriptions is critical for success. Without them, GPT-4's ability to exploit vulnerabilities drops significantly to 7%, highlighting that identifying vulnerabilities is more challenging than exploiting known ones.
  3. Capabilities of LLM Agents: The paper demonstrates that LLM agents can function autonomously, using a toolset to navigate and interact with target environments, and confirms that LLMs can carry out the complex, multi-step actions required for non-trivial cybersecurity tasks.
  4. Scalability and Cost Efficiency: Using an LLM like GPT-4 for such tasks is cheaper than employing human cybersecurity experts. The paper estimates the cost of using GPT-4 at approximately $8.80 per exploited vulnerability, compared to $25 for half an hour of human labor, roughly a 2.8x difference (see the sketch after this list).
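
As a rough illustration of the reported numbers, the Python sketch below computes the human-to-GPT-4 cost ratio from the $8.80 and $25 figures above and shows how a pass@5 outcome is derived from per-run results. It is only a sketch: the per-run outcomes in the example are hypothetical, not data from the paper.

```python
# Hedged sketch: only the $8.80 and $25 figures come from the paper;
# the per-run outcomes below are hypothetical.

GPT4_COST_PER_EXPLOIT = 8.80    # estimated GPT-4 cost per exploited vulnerability
HUMAN_COST_PER_EXPLOIT = 25.00  # half an hour of penetration-tester labor

print(f"Human / GPT-4 cost ratio: {HUMAN_COST_PER_EXPLOIT / GPT4_COST_PER_EXPLOIT:.1f}x")

def pass_at_5(runs: list) -> bool:
    """Under pass@5, a vulnerability counts as exploited if any of 5 runs succeeds."""
    assert len(runs) == 5
    return any(runs)

# Hypothetical outcomes of five runs against one CVE (not data from the paper).
example_runs = [False, True, False, False, True]
print("pass@5:", pass_at_5(example_runs))
```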

Methodology

  • Dataset Creation: The authors curated a benchmark of 15 real-world one-day vulnerabilities from open sources, focusing on those that could be reproduced in a sandboxed environment.
  • Agent Framework: They implemented the ReAct agent framework and gave the LLM the tools it needs, such as web browsing capabilities, a terminal interface, and a code interpreter (a minimal sketch of such an agent loop follows this list).
  • Evaluation Protocol: The evaluation involved measuring the success rate (pass@5 and pass@1) and the cost efficiency of using GPT-4 to exploit these vulnerabilities.
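
To make the agent setup concrete, here is a minimal sketch of a ReAct-style loop of the kind described above. It is an illustration under stated assumptions, not the authors' implementation: the paper does not release its prompts or tooling, the tool set here is reduced to a single terminal tool, and call_llm() is a placeholder for whatever model API is used.

```python
# Illustrative ReAct-style agent loop (not the paper's released code).
# Assumptions: a single "terminal" tool, and call_llm() as a placeholder that
# would return a dict like {"thought": ..., "action": ..., "input": ...}.
import subprocess

def run_terminal(cmd: str) -> str:
    """Terminal tool: run a shell command in a sandbox and return its output."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True, timeout=60)
    return result.stdout + result.stderr

def call_llm(history: list) -> dict:
    """Placeholder for the model call; wire this to an LLM API of your choice."""
    raise NotImplementedError

TOOLS = {"terminal": run_terminal}  # web browsing and a code interpreter would be registered similarly

def react_loop(task: str, max_steps: int = 20) -> list:
    # The task would be the CVE description plus details of the sandboxed target.
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        step = call_llm(history)                 # model produces a thought and chooses an action
        if step["action"] == "finish":
            break
        observation = TOOLS[step["action"]](step["input"])  # execute the chosen tool
        history.append({"role": "assistant", "content": str(step)})
        history.append({"role": "tool", "content": observation})
    return history
```

The point this cycle captures is that the model sees each tool's output (the observation) before deciding on its next action, which is what lets it chain the multiple steps needed for a real exploit.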

Discussion and Implications

The findings of this paper suggest that while GPT-4 exhibits strong capabilities in exploiting known vulnerabilities, its capacity to discover new ones autonomously remains limited. This differentiation is crucial in understanding the role of LLM agents in cybersecurity, highlighting their potential value in automating defensive measures rather than solely offensive actions.

Ethical Considerations

The paper discusses the moral implications of using LLMs for cybersecurity, emphasizing that although these technologies can be used for malicious purposes, they also hold significant potential for automating threat detection and improving security measures.

Conclusion

The research underscores the capability of GPT-4 in specific exploitation tasks within cybersecurity, suggesting a need for careful management of such tools to prevent misuse while leveraging their strengths in enhancing cybersecurity defenses.

This paper highlights the cutting-edge potential and limitations of LLMs in cybersecurity scenarios, providing a critical evaluation of GPT-4's application in real-world vulnerability exploitation.

Authors (4)
  1. Richard Fang
  2. Rohan Bindu
  3. Akul Gupta
  4. Daniel Kang