A Critical Examination of Automatic Patch Generation in Software Repair
The paper "A Critical Review of 'Automatic Patch Generation Learned from Human-Written Patches': Essay on the Problem Statement and the Evaluation of Automatic Software Repair" by Martin Monperrus is a detailed critique of PAR, the automatic software repair approach proposed by Kim et al. The author challenges several foundational assumptions and evaluation choices of the PAR work and draws broader lessons for the field of automatic software repair.
Monperrus opens by acknowledging the significance of the PAR work and states the two goals of his critique: to identify weaknesses in the evaluation of PAR, and to propose foundations for evaluating automatic software repair more generally. A central concern is that existing methodologies lack a clearly defined "defect class" (the set of bug kinds a technique is meant to address), and that this absence undermines the conclusiveness of experimental evaluations.
A significant issue raised is the construction of the evaluation dataset for PAR and its comparison against GenProg. Monperrus argues that without an explicit definition of the target defect classes, comparative evaluations can be misleading, and that principled dataset construction is critical to the validity of empirical studies in this domain. The target defect classes must therefore be characterized explicitly, and evaluation datasets must be shown to be representative of them.
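To make the representativeness argument concrete, the sketch below shows one simple way a dataset builder might audit coverage of declared defect classes. The bug entries and class names are invented for illustration; they are not taken from the PAR or GenProg benchmarks.

```python
from collections import Counter

# Hypothetical benchmark entries: each bug is tagged with the defect
# class it belongs to (class names here are illustrative only).
benchmark = [
    {"bug_id": "math-1",  "defect_class": "null-dereference"},
    {"bug_id": "math-2",  "defect_class": "null-dereference"},
    {"bug_id": "lang-1",  "defect_class": "wrong-conditional"},
    {"bug_id": "chart-1", "defect_class": "off-by-one"},
]

def class_coverage(bugs, target_classes):
    """Count how many benchmark bugs fall into each declared defect class."""
    counts = Counter(b["defect_class"] for b in bugs)
    return {c: counts.get(c, 0) for c in target_classes}

coverage = class_coverage(
    benchmark,
    ["null-dereference", "wrong-conditional", "off-by-one", "memory-leak"],
)
print(coverage)
```

A class with a count of zero (here, "memory-leak") signals that the dataset cannot support any claim about repairing that class, which is precisely the kind of mismatch between stated scope and actual benchmark content that the critique warns about.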
The paper then turns to the criteria by which automatic repairs are judged, such as understandability, correctness, and completeness. Monperrus cautions against treating the human-likeness of patches as the sole goal and argues for evaluation criteria suited to the nature of automatic processes. He notes that "alien" patches, ones no human would have written, may nonetheless be valuable, and that repair approaches should not be constrained by human-centric paradigms.
Monperrus also sharpens the problem statement of automatic software repair itself, distinguishing state repair (fixing the erroneous runtime state of a running system) from behavioral repair (changing the program's behavior, typically by patching its code). This distinction calls for evaluation strategies tailored to each setting, including runtime fixes applied autonomously and offline patch recommendations reviewed by a human.
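The contrast between the two problem settings can be illustrated with a toy example. The buggy function and both repairs below are invented for illustration and do not come from the paper; they only show that a behavioral repair changes the program text, while a state repair leaves the program untouched and corrects the runtime state before the failure triggers.

```python
def buggy_average(values):
    # Invented bug for illustration: crashes on an empty list.
    return sum(values) / len(values)

# Behavioral repair: an offline patch that changes the program text itself.
def patched_average(values):
    if not values:          # the patch adds a guard clause
        return 0.0
    return sum(values) / len(values)

# State repair: keep the original code, but fix the erroneous runtime
# state (the empty list) just before it causes the failure.
def run_with_state_repair(values):
    if not values:          # repair the state, not the program
        values = [0.0]
    return buggy_average(values)
```

Under this framing, PAR and GenProg address behavioral repair, while techniques that patch state at runtime face different constraints (no recompilation, no human review before the fix takes effect) and thus need different evaluation strategies.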
The paper further explores the concept of "fix acceptability" through a thought experiment, arguing that some patch comparisons may be inherently subjective and unanswerable given the lack of a complete definition of software correctness. Through this lens, Monperrus encourages a re-examination of what constitutes a "good" or "acceptable" fix, treating acceptability as a complex, multifaceted question rather than a settled one.
In conclusion, Monperrus's essay encourages the research community to re-examine its practices around defect class definition, evaluation metrics, and the problem statement of automatic repair. Its emphasis on these foundations, in particular defect class clarity and evaluation tailored to the repair setting, offers a useful frame for future work on building and assessing automatic repair systems.