
Break-It-Fix-It: Unsupervised Learning for Program Repair (2106.06600v2)

Published 11 Jun 2021 in cs.LG, cs.CL, and cs.SE

Abstract: We consider repair tasks: given a critic (e.g., compiler) that assesses the quality of an input, the goal is to train a fixer that converts a bad example (e.g., code with syntax errors) into a good one (e.g., code with no syntax errors). Existing works create training data consisting of (bad, good) pairs by corrupting good examples using heuristics (e.g., dropping tokens). However, fixers trained on this synthetically-generated data do not extrapolate well to the real distribution of bad inputs. To bridge this gap, we propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas: (i) we use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data, and (ii) we train a breaker to generate realistic bad code from good code. Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data. We evaluate BIFI on two code repair datasets: GitHub-Python, a new dataset we introduce where the goal is to repair Python code with AST parse errors; and DeepFix, where the goal is to repair C code with compiler errors. BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python (+28.5%) and 71.7% on DeepFix (+5.6%). Notably, BIFI does not require any labeled data; we hope it will be a strong starting point for unsupervised learning of various repair tasks.

Citations (100)

Summary

  • The paper introduces BIFI, an algorithm that learns to repair code without labeled data by pairing a breaker that mimics genuine errors with a fixer that corrects them.
  • It leverages a critic to validate repairs, ensuring iterative improvement and significant performance gains on both Python and C datasets.
  • BIFI achieves 90.5% repair accuracy on GitHub-Python and 71.7% on DeepFix, demonstrating robust performance without labeled training data.

Break-It-Fix-It: Unsupervised Learning for Program Repair

The paper "Break-It-Fix-It: Unsupervised Learning for Program Repair" introduces an innovative approach to automatic code repair, a process wherein defective code is converted into functional code without requiring manually labeled data. Traditional methods often leverage synthetic data by introducing errors in otherwise correct code, but these methods struggle with the distribution mismatch between synthesized and actual erroneous data. To overcome this shortfall, the authors propose the Break-It-Fix-It (BIFI) algorithm, which consists of dual components: a breaker that generates realistic bad code from good examples, and a fixer that learns to repair these errors. Both components are iteratively refined using a critic—a code analyzer or compiler—that assesses correctness.

The BIFI approach is evaluated on two datasets: GitHub-Python, a novel dataset targeting Python code with AST parse errors, and DeepFix, which focuses on C code with compiler errors. BIFI outperforms existing state-of-the-art methods, achieving a repair accuracy of 90.5% on GitHub-Python and 71.7% on DeepFix, improvements of +28.5% and +5.6%, respectively.

The paper emphasizes that BIFI does not require any labeled training data, which could significantly broaden the applicability of machine learning to repair tasks across domains. By iteratively refining which samples count as "bad" and "fixed," the algorithm aligns its training data with the realistic distribution of erroneous code.

Core to the method are two principles: the use of a critic to validate fixed outputs and the development of a breaker to simulate human-like code errors. BIFI not only alleviates the distribution mismatch produced by synthetic perturbations but also enables iterative enhancement of both components: the breaker becomes progressively better at mimicking genuine error patterns seen in human-written code, while the fixer becomes correspondingly more adept at repairing these nuanced errors.

The implications of BIFI are twofold. Practically, it offers an automatic and robust method for program repair, potentially enhancing programming productivity by reducing manual debugging effort. Theoretically, it expands the horizons of unsupervised learning in program synthesis and error correction, presenting exciting possibilities for future AI advances.

In conclusion, Break-It-Fix-It presents a compelling unsupervised learning framework for program repair, combining the roles of breaker and fixer with a critic's evaluative capacity to adaptively learn from real-world code errors. The methodology points a way forward not only for program repair but also for broader applications that lack labeled datasets, such as molecular design and neural essay editing. As AI continues to evolve, methodologies like BIFI could fundamentally transform how learning models tackle complex, domain-specific tasks.
