Analyzing "Graph-based, Self-Supervised Program Repair from Diagnostic Feedback"
The paper by Michihiro Yasunaga and Percy Liang from Stanford University presents a novel approach to program repair by leveraging graph-based modeling and self-supervised learning techniques. This essay provides an expert analysis of the methodologies and results presented in the paper, focusing on the two key innovations: the program-feedback graph and the self-supervised learning paradigm.
Program-Feedback Graph
One of the central contributions of the paper is the program-feedback graph. This graph captures the semantic relationships between symbols in the source code and in the diagnostic feedback, such as compiler error messages. By representing both the code and the feedback in a single, unified graph structure, the authors aim to enhance the reasoning capacity of the repair system. The graph is processed with graph neural networks (GNNs), which provide a mechanism for tracking symbols across different lines of code and feedback. Unlike previous approaches that relied heavily on sequence-to-sequence models or Abstract Syntax Tree (AST) representations, the program-feedback graph directly connects relevant symbol occurrences, facilitating efficient information flow between the program and the error message.
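To make this concrete, below is a minimal sketch of how such a graph might be constructed. It is illustrative rather than the authors' implementation: the function name and token filtering are assumptions, and tokenization is naive whitespace splitting, whereas the paper uses proper lexing when connecting occurrences of each symbol.

```python
from collections import defaultdict

def build_program_feedback_graph(code_lines, feedback_line):
    """Link occurrences of the same symbol across code and compiler feedback.

    Nodes are (line index, token index) positions; an edge joins every pair
    of positions that hold the same identifier. A real implementation would
    use a lexer and filter out language keywords.
    """
    lines = code_lines + [feedback_line]          # feedback is the last "line"
    occurrences = defaultdict(list)               # symbol -> [(line, tok), ...]
    for li, line in enumerate(lines):
        for ti, raw in enumerate(line.split()):
            tok = raw.strip("'\"")                # compiler quoting: 'm' -> m
            if tok.isidentifier():
                occurrences[tok].append((li, ti))

    edges = []
    for symbol, locs in occurrences.items():
        for i in range(len(locs)):
            for j in range(i + 1, len(locs)):
                edges.append((locs[i], locs[j], symbol))
    return edges

code = [
    "int n ;",
    "for ( int i = 0 ; i < n ; i ++ )",
    'printf ( "%d" , m ) ;',
]
feedback = "error : 'm' undeclared ( first use in this function )"
edges = build_program_feedback_graph(code, feedback)
# Among others, this links the two uses of `n` across lines 1-2, and the `m`
# in the printf call to the `m` named in the error message, so a GNN can pass
# messages directly between the faulty symbol and the diagnostic that cites it.
```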
The empirical results demonstrate the benefits of this graph-based approach, particularly for errors that necessitate reasoning over multiple lines of code. On the DeepFix dataset, the incorporation of the program-feedback graph led to a significant improvement in full repair rates, suggesting that the graph structure effectively captures the dependencies and long-range interactions necessary for resolving complex programming errors.
Self-Supervised Learning Paradigm
Addressing the challenge of limited labeled data for program repair, the paper proposes a self-supervised learning framework that uses unlabeled programs freely available online to generate synthetic training data. Working programs are corrupted with a randomized procedure that introduces realistic errors, yielding broken-program and fixed-program pairs that resemble real-world repair scenarios. This lets the model be pre-trained on a large volume of diverse synthetic examples before it is fine-tuned on the target repair task.
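As an illustration, a single corruption step might look like the sketch below. The function name, operation mix, and candidate tokens are hypothetical choices made for brevity; the paper's actual procedure is designed to mimic the statistics of errors programmers really make (dropped semicolons and braces, identifier typos, misused types) rather than uniform random noise.

```python
import random

def corrupt_program(tokens, rng):
    """Perturb one token of a working program to synthesize a broken one.

    Illustrative only: the three operations and the replacement token pool
    below are assumptions, not the paper's calibrated error distribution.
    """
    tokens = list(tokens)
    idx = rng.randrange(len(tokens))
    op = rng.choice(["delete", "replace", "insert"])
    if op == "delete":
        del tokens[idx]                            # e.g. drop a ';' or '}'
    elif op == "replace":
        tokens[idx] = rng.choice([";", "}", ")", "int", "n"])
    else:
        tokens.insert(idx, rng.choice([";", "{", "("]))
    return tokens

rng = random.Random(0)
fixed = "int main ( ) { int n = 0 ; return n ; }".split()
broken = corrupt_program(fixed, rng)
# Compiling `broken` produces diagnostic feedback, yielding a
# (broken program, feedback, fixed program) training example with no
# human-written repairs required.
```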
Results show that self-supervised pre-training boosts repair accuracy across error types, particularly for those underrepresented in the original dataset. This underscores the value of leveraging large amounts of unlabeled data to mitigate the data scarcity inherent in purely supervised training.
Experimental Results
The paper evaluates its approach on two distinct datasets: DeepFix, which involves correcting errors in introductory programming assignments, and SPoC, where the task is to repair candidate programs from a pseudocode-to-code synthesizer that fail to compile. In both settings, the proposed system, DrRepair, markedly outperforms previous models, achieving a 68.2% full repair rate on DeepFix (+22.9% over the prior best) and a 48.4% synthesis success rate on SPoC (+3.7% over the prior best).
Implications and Future Directions
The implications of this research are manifold. Practically, an automated program repair tool with the accuracy demonstrated by DrRepair can significantly enhance developer productivity by efficiently diagnosing and fixing code errors. Theoretically, the integration of graph-based reasoning and self-supervised data synthesis opens a promising avenue for other domains that require reasoning over interacting contexts, such as semantic parsing in natural language processing or interactive dialogue systems.
Future development could involve refining the corruption procedure to better mimic real-world errors or extending the graph-based approach to incorporate dynamic program execution data. The combination of high-bandwidth feedback signals and expressive graph representations, as proposed in this paper, points toward robust solutions for intelligent error correction across diverse applications of AI.
In summary, "Graph-based, Self-Supervised Program Repair from Diagnostic Feedback" presents a well-founded approach that could significantly advance automated program repair, underscoring the benefits of integrating graph neural networks with expansive data-driven learning paradigms. As AI continues to intersect more deeply with coding and software development, the innovations presented herein may lay the groundwork for more intelligent and autonomous programming systems.