
LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward (2401.03374v2)

Published 7 Jan 2024 in cs.SE and cs.AI

Abstract: In software development, the predominant emphasis on functionality often supersedes security concerns, a trend gaining momentum with AI-driven automation tools such as GitHub Copilot. These tools significantly improve developers' efficiency in functional code development. Nevertheless, it remains a notable concern that such tools also produce insecure code, largely because they are pre-trained on publicly available repositories that contain vulnerable code. Moreover, developers are called the "weakest link in the chain" since they have very minimal knowledge of code security. Although existing solutions offer reasonable fixes for vulnerable code, they fail to adequately describe the vulnerabilities and educate developers on code security, so the same security issues tend to be repeated. We therefore introduce SecRepair, a multipurpose code vulnerability analysis system powered by the LLM CodeGen2, which assists developers in identifying vulnerabilities and generating fixed code together with a complete description of the vulnerability as a code comment. Our methodology uses a reinforcement learning paradigm to generate code comments, augmented by a semantic reward mechanism. Inspired by how humans fix code issues, we propose an instruction-based dataset suitable for vulnerability analysis with LLMs. We further identify zero-day and N-day vulnerabilities in 6 open-source IoT operating systems on GitHub. Our findings underscore that coupling reinforcement learning with a semantic reward improves the model's performance, strengthening its capacity to address code vulnerabilities.

Authors (7)
  1. Nafis Tanveer Islam (8 papers)
  2. Joseph Khoury (5 papers)
  3. Andrew Seong (2 papers)
  4. Gonzalo De La Torre Parra (3 papers)
  5. Elias Bou-Harb (15 papers)
  6. Peyman Najafirad (33 papers)
  7. Mohammad Bahrami Karkevandi (4 papers)
Citations (10)

