Automated Repair of AI Code with Large Language Models and Formal Verification (2405.08848v1)
Abstract: The next generation of AI systems requires strong safety guarantees. This report looks at the software implementation of neural networks and the associated memory safety properties, including NULL pointer dereference, out-of-bounds access, double-free, and memory leaks. Our goal is to detect these vulnerabilities and automatically repair them with the help of large language models (LLMs). To this end, we first expand NeuroCodeBench, an existing dataset of neural network code, to about 81k programs via an automated process of program mutation. Then, we verify the memory safety of the mutated neural network implementations with ESBMC, a state-of-the-art software verifier. Whenever ESBMC spots a vulnerability, we invoke an LLM to repair the source code. For the latter task, we compare the performance of several state-of-the-art prompt engineering techniques and an iterative approach that repeatedly calls the LLM.