- The paper introduces CURE, an approach that pre-trains a programming language (PL) model to strengthen neural machine translation for automatic program repair.
- It employs a code-aware search strategy, combining valid-identifier checks with length control, to generate semantically and syntactically correct patches.
- Empirical results on Defects4J and QuixBugs benchmarks show that CURE fixes more bugs compared to existing APR techniques.
Insights into "CURE: Code-Aware Neural Machine Translation for Automatic Program Repair"
The study introduces CURE, a neural machine translation (NMT) based approach to Automatic Program Repair (APR). The authors target two weaknesses of existing NMT-based APR techniques: search spaces that are enormous yet frequently miss the correct fix, and a disregard for code-specific knowledge, especially the strict syntax constraints of programming languages.
Key Contributions
- Pre-training with a Programming Language (PL) Model: Central to CURE is the pre-training of a PL model on a large corpus of open-source code. This step lets the system internalize developer-like syntax and semantics before it ever sees a repair task, separating code comprehension from patch learning. The model follows the Generative Pre-trained Transformer (GPT) architecture, an early instance of large pre-trained language models in APR systems (a minimal training sketch follows this list).
- Code-Aware Search Strategy: CURE replaces vanilla beam search with a code-specific strategy that favors compilable patches and keeps patch length close to that of the buggy code. It combines two mechanisms, valid-identifier checking and length control, which together steer the search toward semantically and syntactically plausible candidates (see the beam-search sketch after this list).
- Subword Tokenization: By applying byte-pair encoding (BPE), CURE keeps the vocabulary compact and sidesteps the out-of-vocabulary (OOV) problem that rare identifiers cause in code. This shrinks the search space while raising the likelihood that it still contains the correct fix (see the tokenizer sketch after this list).
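To make the pre-training step concrete, here is a minimal sketch of the kind of next-token (causal) language-model training a GPT-style PL model performs over code corpora. All names, dimensions, and the `pretrain_step` helper are illustrative assumptions, not CURE's actual implementation.

```python
import torch
import torch.nn as nn

class CodeLM(nn.Module):
    """Decoder-only (GPT-style) model: predicts the next code subword."""
    def __init__(self, vocab_size, d_model=256, n_heads=4, n_layers=4, max_len=512):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        t = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(t, device=ids.device))
        # Causal mask: each position may only attend to earlier tokens
        causal = nn.Transformer.generate_square_subsequent_mask(t).to(ids.device)
        return self.head(self.blocks(x, mask=causal))

def pretrain_step(model, optimizer, batch):
    """One step of next-token prediction: inputs and targets shifted by one."""
    logits = model(batch[:, :-1])
    loss = nn.functional.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                       batch[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

After pre-training on raw code, the model is adapted to the repair task on buggy-line/fixed-line pairs; the pre-trained weights supply the general code fluency that task data alone is too small to teach.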
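The search strategy can be pictured as a modified beam search. The sketch below is an assumption-laden rendering of the two ideas: `allowed_fn` stands in for CURE's static valid-identifier analysis, and a simple absolute-difference penalty stands in for the length distribution the paper learns from training data.

```python
import torch

def code_aware_step(beams, step_logits, buggy_len, allowed_fn, width=10, alpha=0.5):
    """One expansion step of a code-aware beam search.

    beams:       list of (token_id_sequence, cumulative_log_prob)
    step_logits: tensor of shape [len(beams), vocab_size]
    allowed_fn:  seq -> token ids that keep the hypothesis syntactically
                 valid (stand-in for CURE's static valid-identifier check)
    """
    candidates = []
    for (seq, logp), logits in zip(beams, step_logits):
        log_probs = torch.log_softmax(logits, dim=-1)
        mask = torch.full_like(log_probs, float("-inf"))
        mask[allowed_fn(seq)] = 0.0              # (1) valid-identifier check
        top = torch.topk(log_probs + mask, width)
        for t_lp, tok in zip(top.values.tolist(), top.indices.tolist()):
            candidates.append((seq + [tok], logp + t_lp))
    # (2) length control: down-rank hypotheses whose length drifts from the
    # buggy line's (the paper learns this distribution; |difference| is a
    # deliberately crude stand-in)
    candidates.sort(key=lambda c: c[1] - alpha * abs(len(c[0]) - buggy_len),
                    reverse=True)
    return candidates[:width]

# Toy usage: 100-token vocabulary, every token allowed.
beams = code_aware_step(beams=[([1], 0.0)], step_logits=torch.randn(1, 100),
                        buggy_len=8, allowed_fn=lambda s: list(range(100)))
```

Masking before `topk` is what makes the search "code-aware": uncompilable continuations are never even ranked, so the beam budget is spent entirely on plausible patches.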
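The subword step itself uses standard BPE machinery. Assuming the Hugging Face `tokenizers` package (the paper's exact tooling is not reproduced here), training a code-level BPE vocabulary looks roughly like this; the corpus path and the example identifier are placeholders.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Train a BPE vocabulary on a code corpus (path is a placeholder).
tokenizer = Tokenizer(BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(vocab_size=50_000, special_tokens=["<unk>", "<pad>"])
tokenizer.train(files=["java_corpus.txt"], trainer=trainer)

# A rare identifier no longer maps to <unk>; it decomposes into subwords
# the model has seen before (output is illustrative).
print(tokenizer.encode("getBinarySearchIndex").tokens)
```

Because every identifier decomposes into known subwords, the decoder can also compose identifiers it never saw whole, which is precisely how the search space shrinks without dropping correct fixes.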
Results and Discussion
The evaluation of CURE on the Defects4J and QuixBugs benchmarks demonstrates strong performance: it fixes 57 and 26 bugs respectively, outperforming existing APR techniques, both NMT-based and pattern-based. The Defects4J result is particularly notable, as CURE surpasses its NMT-based predecessor CoCoNuT, evidence for the efficacy of its code-aware additions.
CURE's integration of a pre-trained model and enhanced search strategies not only yields a higher rate of compilable patches but also speeds up the discovery of correct ones. The smaller yet better-curated search space further underscores the effectiveness of the approach in tackling the challenges of program repair.
Implications and Future Directions
CURE points to a shift in APR: code-aware NMT models, backed by pre-trained PL models and code-specific search strategies, can substantially improve automatic bug fixing. The broader lesson is that deep learning models perform best when deliberately tailored to the structural and syntactic constraints of programming languages.
Looking forward, the application of similar strategies could be extended to other languages and domains within software engineering, potentially addressing tasks such as code synthesis or optimization. This research opens avenues for developing more sophisticated, domain-specific machine translation systems that are well-equipped to handle the intricacies of software development and maintenance tasks.
In conclusion, CURE sets a commendable precedent for integrating advanced machine learning models in APR, offering both a theoretically and practically robust framework for improving software reliability through automatic bug fixes.