ARJA: Automated Repair of Java Programs via Multi-Objective Genetic Programming (1712.07804v1)

Published 21 Dec 2017 in cs.SE

Abstract: Recent empirical studies show that the performance of GenProg is not satisfactory, particularly for Java. In this paper, we propose ARJA, a new GP based repair approach for automated repair of Java programs. To be specific, we present a novel lower-granularity patch representation that properly decouples the search subspaces of likely-buggy locations, operation types and potential fix ingredients, enabling GP to explore the search space more effectively. Based on this new representation, we formulate automated program repair as a multi-objective search problem and use NSGA-II to look for simpler repairs. To reduce the computational effort and search space, we introduce a test filtering procedure that can speed up the fitness evaluation of GP and three types of rules that can be applied to avoid unnecessary manipulations of the code. Moreover, we also propose a type matching strategy that can create new potential fix ingredients by exploiting the syntactic patterns of the existing statements. We conduct a large-scale empirical evaluation of ARJA along with its variants on both seeded bugs and real-world bugs in comparison with several state-of-the-art repair approaches. Our results verify the effectiveness and efficiency of the search mechanisms employed in ARJA and also show its superiority over the other approaches. In particular, compared to jGenProg (an implementation of GenProg for Java), an ARJA version fully following the redundancy assumption can generate a test-suite adequate patch for more than twice the number of bugs (from 27 to 59), and a correct patch for nearly four times of the number (from 5 to 18), on 224 real-world bugs considered in Defects4J. Furthermore, ARJA is able to correctly fix several real multi-location bugs that are hard to be repaired by most of the existing repair approaches.

Authors (2)
  1. Yuan Yuan (234 papers)
  2. Wolfgang Banzhaf (29 papers)
Citations (190)

Summary

  • The paper introduces ARJA, a framework that leverages multi-objective genetic programming to automatically repair Java programs with simpler patches.
  • It employs innovations like lower-granularity patch representation, test filtering, and scope-based ingredient selection to optimize repair efficiency.
  • Empirical evaluations on seeded bugs and on 224 real-world Defects4J bugs show ARJA outperforming existing methods, producing test-suite adequate patches for 59 bugs, 18 of which are correct.

Automated Repair of Java Programs via Multi-Objective Genetic Programming

The paper "ARJA: Automated Repair of Java Programs via Multi-Objective Genetic Programming" by Yuan Yuan and Wolfgang Banzhaf presents an approach to automated program repair that targets two problems at once: fixing bugs in Java programs efficiently and keeping the resulting patches small. Both matter because debugging is a substantial part of the cost of producing high-quality software.

Summary of the Approach

The paper introduces ARJA, a system that leverages multi-objective genetic programming (GP) to repair Java programs. The authors identify key limitations in existing tools such as GenProg, including high-granularity patch representation, the complexity of generated repairs, and expensive fitness evaluation. ARJA addresses these with the following innovations:

  1. Patch Representation: ARJA introduces a lower-granularity patch representation that decouples likely-buggy locations, operation types, and potential fix ingredients. Because each of the three parts can be varied independently, genetic operators explore the search space more effectively.
  2. Multi-Objective Optimization: The approach formulates repair as a multi-objective problem that minimizes both the weighted failure rate on the test cases and the complexity of the patch. NSGA-II is used to search for a set of Pareto-optimal patches, which favors simpler patches without giving up test-passing behavior.
  3. Test Filtering: A test-filtering procedure speeds up fitness evaluation by discarding tests that do not exercise any of the potentially buggy lines, so each candidate patch is run against fewer tests.
  4. Scope Consideration: The method selects fix ingredients with regard to both variable and method scopes, which increases the likelihood that a generated patch compiles.
  5. Type Matching Strategy: This strategy creates new potential fix ingredients by exploiting the syntactic patterns of existing statements, addressing the limitation of reusing existing code verbatim.
  6. Search Space Reduction: Rules applied at three stages of the repair process (operation initialization, ingredient screening, and solution decoding) prune manipulations that cannot help, so the search concentrates on meaningful transformations.
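
The first and third items above can be illustrated with a minimal Python sketch. It is not ARJA's implementation: the class `Patch`, its field names, the operation names in `OPERATIONS`, and the coverage-map format are all assumptions made for illustration. The sketch shows a patch stored as three independent parts (which locations are edited, which operation is applied, which ingredient is used) and a test filter that keeps only tests covering at least one likely-buggy line.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical operation names; the summary only speaks of "operation types".
OPERATIONS = ("delete", "replace", "insert_before")


@dataclass
class Patch:
    """A candidate repair split into three decoupled parts.

    Index j in every list refers to the j-th likely-buggy location.
    """
    edit_flags: List[int]   # 1 if location j is edited, 0 otherwise
    operations: List[int]   # index into OPERATIONS for location j
    ingredients: List[int]  # index into the ingredient list of location j


def decode(patch: Patch,
           locations: List[str],
           ingredient_pool: List[List[str]]) -> List[Tuple[str, str, Optional[str]]]:
    """Turn the three parts into a list of concrete (location, op, ingredient) edits."""
    edits = []
    for j, flag in enumerate(patch.edit_flags):
        if not flag:
            continue  # location j is left untouched
        op = OPERATIONS[patch.operations[j]]
        # A deletion does not need an ingredient statement.
        ingredient = None if op == "delete" else ingredient_pool[j][patch.ingredients[j]]
        edits.append((locations[j], op, ingredient))
    return edits


def patch_size(patch: Patch) -> int:
    """Number of edited locations, used here as a simple complexity measure."""
    return sum(patch.edit_flags)


def filter_tests(coverage: Dict[str, Set[str]], suspicious: Set[str]) -> List[str]:
    """Keep only tests that cover at least one likely-buggy line."""
    return sorted(test for test, lines in coverage.items() if lines & suspicious)


locations = ["Foo.java:10", "Foo.java:22", "Bar.java:5"]
ingredient_pool = [["x = 0;"], ["return a;", "return b;"], ["i++;"]]
patch = Patch(edit_flags=[0, 1, 1], operations=[0, 1, 0], ingredients=[0, 1, 0])
coverage = {
    "t1": {"Foo.java:22"},
    "t2": {"Baz.java:1"},
    "t3": {"Bar.java:5", "Baz.java:1"},
}

edits = decode(patch, locations, ingredient_pool)
kept = filter_tests(coverage, set(locations))
```

Because the three lists are independent, a crossover or mutation operator can change the operation at one location without disturbing the chosen ingredient or the set of edited locations, which is the decoupling the representation is meant to provide.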

Empirical Evaluation

The empirical evaluation covers both seeded bugs and real-world Java bugs from Defects4J. On seeded bugs, ARJA shows a clear advantage over both random search and single-objective genetic search, with higher success rates and the ability to handle multi-location bugs. On the 224 Defects4J bugs considered, ARJA outperforms several state-of-the-art repair approaches: it generates test-suite adequate patches for 59 bugs, of which manual verification identifies 18 as semantically correct. For comparison, jGenProg reaches 27 and 5, respectively.
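
The difference between single-objective and multi-objective search mentioned above can be made concrete with a small Python sketch of Pareto dominance over two minimized objectives, patch size and weighted failure rate. It is a simplified illustration, not NSGA-II itself (which additionally ranks solutions into successive fronts and uses crowding distance), and the exact form of `weighted_failure_rate`, including the weight `w` and its placement, is an assumption rather than the paper's stated formula.

```python
from typing import List, Tuple

# (patch_size, weighted_failure_rate); both objectives are minimized.
Objectives = Tuple[float, float]


def weighted_failure_rate(failed_failing: int, total_failing: int,
                          failed_passing: int, total_passing: int,
                          w: float = 0.5) -> float:
    """Assumed form: failure rate on originally failing tests plus a
    down-weighted failure rate on originally passing tests."""
    return failed_failing / total_failing + w * failed_passing / total_passing


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [(1, 0.5), (2, 0.25), (3, 0.25), (2, 0.0), (4, 0.0)]
front = pareto_front(candidates)
```

In this example a search driven only by failure rate has no reason to prefer `(2, 0.0)` over `(4, 0.0)`, since both pass every test, whereas the dominance check discards the larger patch.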

Implications and Future Work

ARJA is a step toward cheaper and simpler program repair, and it demonstrates the potential of GP when combined with multi-objective optimization and test filtering. The results indicate that an improved GP framework can exploit the redundancy assumption more effectively, producing patches that are not only test-suite adequate but, in a number of cases, semantically correct. Future research may focus on increasing the expressive power of GP-based repair, for example by incorporating machine learning insights or domain-specific heuristics to improve patch quality and correctness rates. Evaluations on more diverse datasets, and a closer look at the trade-off between expressive patch generation and patches that remain easy to understand, are further directions.