VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners (2404.18852v2)

Published 29 Apr 2024 in cs.PL and cs.SE

Abstract: Rust is a programming language that combines memory safety and low-level control, providing C-like performance while guaranteeing the absence of undefined behaviors by default. Rust's growing popularity has prompted research on safe and correct transpiling of existing code-bases to Rust. Existing work falls into two categories: rule-based and LLM-based. While rule-based approaches can theoretically produce correct transpilations that maintain input-output equivalence to the original, they often yield unreadable Rust code that uses unsafe subsets of the Rust language. On the other hand, while LLM-based approaches typically produce more readable, maintainable, and safe code, they do not provide any guarantees about correctness. In this work, we present VERT, a tool that can produce readable Rust transpilations with formal guarantees of correctness. VERT's only requirement is that there is a Web Assembly compiler for the source language, which is true for most major languages. VERT first uses the Web Assembly compiler to obtain an oracle Rust program. In parallel, VERT uses an LLM to generate a readable candidate Rust program. This candidate is verified against the oracle, and if verification fails, we regenerate a new candidate transpilation until verification succeeds. We evaluate VERT by transpiling a suite of 1,394 programs taken from competitive programming style benchmarks. Combining Anthropic's Claude-2 and VERT increases Rust transpilations passing property-based testing from 31% to 54% and bounded model-checking from 1% to 42% compared to using Claude alone. In addition, we evaluate VERT's ability to generate non-trivial safe Rust on programs taken from real-world C projects that make significant use of pointers. Our results provide insights into the limitations of LLMs to write safe Rust.


The paper "VERT: Verified Equivalent Rust Transpilation with Large Language Models as Few-Shot Learners" addresses the challenge of transpiling existing code into Rust, a language that combines memory safety with C-like performance. The central difficulty is preserving input-output equivalence with the original program while keeping the generated code readable and free of unsafe Rust.

Key Contributions:

Problem Context:
- Rust's popularity is rising due to its guarantees against undefined behaviors and its performance benefits. As a result, there is significant interest in converting existing codebases into Rust.
- Existing transpilation methods are either rule-based or LLM-based. Rule-based methods can in theory produce correct transpilations, but the output is often unreadable and relies on unsafe subsets of Rust. LLM-based methods typically produce more readable and safer code but offer no correctness guarantees.
VERT Approach:
- VERT introduces a novel approach that leverages both LLM-generated candidates and formal verification to ensure code correctness.
- The process begins by compiling the source program with a Web Assembly (Wasm) compiler for the source language and using the result to obtain an oracle Rust program, which serves as the reference for correctness.
- In parallel, an LLM generates a readable candidate Rust program. The candidate is then verified against the oracle to establish equivalence.
Verification Process:
- If the candidate fails verification against the oracle, VERT regenerates a new candidate and repeats until verification succeeds.
- The readable code comes from the LLM and the correctness check comes from the oracle, so an accepted candidate has both properties, which neither rule-based nor LLM-only approaches provide on their own.
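
The regenerate-until-verified loop described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not VERT's implementation: the `oracle` function stands in for the Rust program obtained through the Web Assembly route, the two `candidate_*` functions stand in for successive LLM outputs, and `verify` is a plain exhaustive comparison over a small input range rather than the verification the paper uses. All names here are hypothetical.

```rust
/// The oracle and every candidate share this signature (an assumption of the sketch).
type Program = fn(u32) -> u64;

/// Hypothetical oracle: low-level in style, but trusted as the reference.
/// It sums the integers in 0..n (n itself excluded).
fn oracle(n: u32) -> u64 {
    let mut total: u64 = 0;
    let mut i: u32 = 0;
    while i < n {
        total += i as u64;
        i += 1;
    }
    total
}

/// First hypothetical LLM candidate: readable but wrong, because it includes `n`.
fn candidate_inclusive(n: u32) -> u64 {
    (0..=u64::from(n)).sum()
}

/// Second hypothetical LLM candidate: readable and equivalent to the oracle.
fn candidate_exclusive(n: u32) -> u64 {
    (0..u64::from(n)).sum()
}

/// Stand-in verifier: compares both programs on every input in 0..=bound
/// and returns the first input on which they disagree.
fn verify(reference: Program, candidate: Program, bound: u32) -> Result<(), u32> {
    for input in 0..=bound {
        if reference(input) != candidate(input) {
            return Err(input);
        }
    }
    Ok(())
}

/// Tries candidates in order and returns the index of the first one that
/// passes verification, mirroring the "regenerate until verified" loop.
fn first_verified(reference: Program, candidates: &[Program], bound: u32) -> Option<usize> {
    candidates
        .iter()
        .position(|&candidate| verify(reference, candidate, bound).is_ok())
}
```

Calling `first_verified(oracle, &[candidate_inclusive, candidate_exclusive], 100)` with the array typed as `[Program; 2]` rejects the first candidate, which already disagrees with the oracle at input 1, and accepts the second.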
Evaluation and Results:
- The tool was evaluated on 1,394 programs taken from competitive-programming-style benchmarks.
- Combining Anthropic's Claude-2 with VERT increased the share of Rust transpilations passing property-based testing from 31% to 54%, and the share passing bounded model-checking from 1% to 42%, compared to using Claude alone.
- The authors also evaluated VERT's ability to generate non-trivial safe Rust on programs taken from real-world C projects that make significant use of pointers.
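
To make the property-based testing figure more concrete, the sketch below checks a candidate against an oracle on randomly generated inputs and reports the first counterexample. It is a hand-rolled illustration, not the tooling used in the paper: the midpoint functions, the `XorShift32` generator, and `property_test` are all hypothetical names chosen for this example. A bounded model checker would instead explore program behaviors exhaustively up to a bound, which this sketch does not attempt.

```rust
/// Tiny deterministic xorshift generator so the sketch needs no external crates.
struct XorShift32(u32);

impl XorShift32 {
    fn next_u32(&mut self) -> u32 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.0 = x;
        x
    }
}

/// Signature shared by the oracle and the candidates (an assumption of the sketch).
type Midpoint = fn(u32, u32) -> u32;

/// Hypothetical oracle: widens to u64 so the addition cannot overflow.
fn oracle_midpoint(a: u32, b: u32) -> u32 {
    ((u64::from(a) + u64::from(b)) / 2) as u32
}

/// Hypothetical candidate with a bug: the 32-bit addition wraps on large inputs.
fn candidate_wrapping(a: u32, b: u32) -> u32 {
    a.wrapping_add(b) / 2
}

/// Hypothetical candidate that is equivalent to the oracle for all inputs.
fn candidate_safe(a: u32, b: u32) -> u32 {
    a / 2 + b / 2 + (a & b & 1)
}

/// Runs `trials` random comparisons and returns the first input pair on which
/// the candidate disagrees with the reference, or `None` if none was found.
fn property_test(
    reference: Midpoint,
    candidate: Midpoint,
    trials: u32,
    seed: u32,
) -> Option<(u32, u32)> {
    // The xorshift state must be non-zero, so a zero seed is replaced by 1.
    let mut rng = XorShift32(seed.max(1));
    for _ in 0..trials {
        let (a, b) = (rng.next_u32(), rng.next_u32());
        if reference(a, b) != candidate(a, b) {
            return Some((a, b));
        }
    }
    None
}
```

A `None` result only means that no counterexample was found among the sampled inputs. That limitation is one reason a passing property-based test is a weaker result than a passing bounded model check.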
Insights and Limitations:
- The paper provides insights into the current limitations of LLMs in generating safe Rust code and highlights the necessity of incorporating formal verification to bridge gaps in safety and correctness.

Overall, VERT pairs LLM-generated Rust with verification against a Wasm-derived oracle, so that accepted transpilations are both readable and checked for equivalence with the original program.