Learning How to Mutate Source Code from Bug-Fixes
The paper "Learning How to Mutate Source Code from Bug-Fixes" addresses the challenge of developing effective mutation operators by leveraging deep learning techniques. Traditional mutation testing involves injecting artificial faults into the source code to simulate defects, serving purposes such as guiding test suite generation and assessing test suite effectiveness. While empirical studies have demonstrated that certain mutants can effectively mimic real faults, the creation of tailored mutation operators remains a challenging and error-prone task. This work proposes a novel approach to automatically learn mutation operators from historical bug-fixing activities in software repositories.
Methodology
The approach involves several key steps:
- Data Collection: The approach mines 787,178 bug-fixing commits from GitHub, restricted to Java projects. From these commits, method-level pairs of buggy and fixed code are extracted, referred to as Transformation Pairs (TPs).
- Abstraction and Clustering: Abstracted representations of the TPs are generated to shrink the vocabulary and make learning tractable: a Java lexer and parser identify identifiers and literals and replace them with abstract tokens. The TPs are then clustered by the similarity of their abstract syntax tree (AST) edit actions, using doc2vec embeddings and k-means, so that each cluster captures a recurring transformation pattern.
- Model Training: A Recurrent Neural Network (RNN) Encoder-Decoder architecture with an attention mechanism is trained on the abstracted code representations to learn the mutations. Several configurations were evaluated, and the best-performing architecture was selected for training.
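The abstraction step above can be sketched in a few lines. This is a toy stand-in, not the paper's tooling: it uses a regex tokenizer instead of a real Java lexer/parser, and the placeholder names (`VAR_#`, `METHOD_#`, `INT_#`, `STRING_#`) and the keyword list are illustrative.

```python
import re

# Illustrative subset of Java keywords kept verbatim (the real tool uses a full lexer).
JAVA_KEYWORDS = {"if", "else", "return", "int", "for", "while", "void", "public",
                 "private", "static", "new", "null", "true", "false", "class"}

# Order matters: string literals, then integer literals, then identifiers, then any symbol.
TOKEN_RE = re.compile(r'"[^"]*"|\d+|[A-Za-z_]\w*|\S')

def abstract_method(source):
    """Replace identifiers and literals with typed placeholder IDs, keeping
    keywords and punctuation intact. Returns the abstract token stream and
    the mapping needed to re-concretize a generated mutant later."""
    ids = {}  # concrete token -> abstract token
    counters = {"VAR": 0, "METHOD": 0, "INT": 0, "STRING": 0}

    def fresh(kind, tok):
        if tok not in ids:
            counters[kind] += 1
            ids[tok] = f"{kind}_{counters[kind]}"
        return ids[tok]

    tokens = TOKEN_RE.findall(source)
    out = []
    for i, tok in enumerate(tokens):
        if tok in JAVA_KEYWORDS or not (tok[0].isalpha() or tok[0] in '_"' or tok[0].isdigit()):
            out.append(tok)  # keyword or punctuation: keep as-is
        elif tok.startswith('"'):
            out.append(fresh("STRING", tok))
        elif tok[0].isdigit():
            out.append(fresh("INT", tok))
        elif i + 1 < len(tokens) and tokens[i + 1] == "(":
            out.append(fresh("METHOD", tok))  # identifier followed by '(' -> method name
        else:
            out.append(fresh("VAR", tok))
    return out, ids

buggy = 'if (count > 0) return compute(count); else return 0;'
print(abstract_method(buggy)[0])
```

Note how repeated occurrences of `count` map to the same `VAR_1` token, which is what lets a model learn transformations that generalize across concrete identifier names.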
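The encoder-decoder training step can be illustrated with a minimal PyTorch sketch. This is not the authors' selected configuration: the GRU cells, dot-product (Luong-style) attention, and all sizes here are assumptions chosen for brevity. The model reads an abstracted fixed-code sequence and is trained to emit the corresponding abstracted buggy sequence.

```python
import torch
import torch.nn as nn

class Seq2SeqMutator(nn.Module):
    """Toy RNN encoder-decoder with dot-product attention, in the spirit of
    the paper's architecture; hyperparameters are illustrative only."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.decoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim * 2, vocab_size)

    def forward(self, src, tgt):
        enc_out, h = self.encoder(self.embed(src))      # (B, S, H)
        dec_out, _ = self.decoder(self.embed(tgt), h)   # (B, T, H)
        # Attention: each decoder state attends over all encoder states.
        scores = torch.bmm(dec_out, enc_out.transpose(1, 2))  # (B, T, S)
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_out)  # (B, T, H)
        return self.out(torch.cat([dec_out, context], dim=-1))  # (B, T, V)

# One training step on random token IDs standing in for abstracted code.
model = Seq2SeqMutator(vocab_size=200)
src = torch.randint(0, 200, (4, 12))   # abstracted fixed-code tokens (input)
tgt = torch.randint(0, 200, (4, 10))   # abstracted buggy-code tokens (target)
logits = model(src, tgt)
loss = nn.functional.cross_entropy(logits.reshape(-1, 200), tgt.reshape(-1))
loss.backward()
print(logits.shape)  # torch.Size([4, 10, 200])
```

At inference time one would decode token by token from a start symbol instead of feeding the target sequence; the teacher-forced setup above is only the training step.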
Results
The models trained on this data show promising results:
- Performance Metrics: The models achieved BLEU scores significantly higher than a baseline, indicating that the generated mutants closely resembled actual buggy code.
- Prediction Quality: Between 9% and 45% of the generated mutants perfectly matched the original buggy code, depending on the specific mutation model cluster. Additionally, more than 98% of predictions were syntactically correct.
- Variation in Mutants: The different mutation models demonstrated varied abilities to generate unique and meaningful mutants; the clustering step helped by tailoring each model to a specific transformation pattern.
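The BLEU comparison used in the first bullet can be sketched with a small stdlib-only implementation on token lists. This is a generic sentence-level BLEU with uniform n-gram weights and naive smoothing, not the exact evaluation script from the paper.

```python
import math
from collections import Counter

def bleu(reference, candidate, max_n=4):
    """Sentence-level BLEU over token lists: geometric mean of clipped
    n-gram precisions times a brevity penalty. Naively smoothed so that
    zero-overlap n-grams do not collapse the score to exactly zero."""
    precisions = []
    for n in range(1, max_n + 1):
        ref_ngrams = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
        cand_ngrams = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped counts
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(max(overlap, 1e-9) / total)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * geo_mean

# A prediction identical to the real buggy code scores 1.0.
buggy = "return METHOD_1 ( VAR_1 )".split()
predicted = "return METHOD_1 ( VAR_1 )".split()
print(round(bleu(buggy, predicted), 2))  # 1.0
```

A perfect prediction (a mutant identical to the real buggy code) scores 1.0, which is why BLEU against the actual buggy version is a sensible proxy for how realistic the generated mutants are.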
Implications and Future Work
The potential implications of these findings are multi-faceted. Practically, a tool implementing these learned mutation operators could substantially improve automated software testing by providing a reusable pipeline for seeding realistic faults. Theoretically, this research contributes to the understanding of how machine learning can be applied to software engineering tasks traditionally driven by manual processes or heuristic-based methods.
Future developments could focus on fine-tuning the RNN architecture further, expanding the methodology to other programming languages, and integrating this approach into a comprehensive, automated mutation testing framework. By doing so, researchers and practitioners can leverage a deeper understanding of buggy code to drive improved software quality assurance activities.
Overall, this research reflects an innovative step toward automating the traditionally manual and heuristic-driven task of mutation testing, highlighting the synergy between machine learning applications and software engineering practices.