On (Mis)perceptions of testing effectiveness: an empirical study

Published 11 Feb 2024 in cs.SE (arXiv:2402.07222v1)

Abstract: A recurring problem in software development is incorrect decision making on the techniques, methods and tools to be used. Mostly, these decisions are based on developers' perceptions about them. A factor influencing people's perceptions is past experience, but it is not the only one. In this research, we aim to discover how well the perceptions of the defect detection effectiveness of different techniques match their real effectiveness in the absence of prior experience. To do this, we conduct an empirical study plus a replication. During the original study, we conduct a controlled experiment with students applying two testing techniques and a code review technique. At the end of the experiment, they take a survey to find out which technique they perceive to be most effective. The results show that participants' perceptions are wrong and that this mismatch is costly in terms of quality. In order to gain further insight into the results, we replicate the controlled experiment and extend the survey to include questions about participants' opinions on the techniques and programs. The results of the replicated study confirm the findings of the original study and suggest that participants' perceptions might be based not on their opinions about complexity or preferences for techniques but on how well they think that they have applied the techniques.


Summary

  • The paper reveals a 31 percentage point defect detection gap due to misaligned perceptions of testing techniques among developers.
  • The study employs a controlled experiment with students using Equivalence Partitioning, Branch Testing, and Code Reading to compare perceived versus actual effectiveness.
  • The findings advocate for integrating empirical feedback tools and enhanced training to mitigate bias in selecting testing techniques.

An Empirical Study on Developers' Perceptions and the Reality of Testing Techniques

Introduction

The paper "On (Mis)perceptions of testing effectiveness: an empirical study" (2402.07222) addresses a critical issue in software development: the accuracy of developers' perceptions of how effective testing techniques are. By examining the extent to which these perceptions align with actual efficacy, the study aims to mitigate decision-making errors that compromise software quality. A controlled experiment with students, followed by a replication, investigates how preconceived notions of testing effectiveness affect the application of defect detection techniques.

Methodology

The study employs a controlled experiment with a cohort of computer science students, chosen to avoid bias from prior professional experience. Participants applied two testing techniques (Equivalence Partitioning and Branch Testing) and a code review technique (Code Reading by Stepwise Abstraction) to software artifacts. Perceived effectiveness was gauged through a post-experiment survey and compared against actual effectiveness, measured as the defect detection rate. A crossover design controlled for ordering effects and individual variance.
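As a reminder of what the first of these techniques involves, Equivalence Partitioning divides a program's input domain into classes whose members are expected to behave alike, so one representative test per class suffices. The sketch below is a generic illustration, not taken from the paper's experimental materials; the `classify_triangle` function and its partitions are hypothetical:

```python
def classify_triangle(a, b, c):
    """Classify a triangle by its side lengths."""
    if a <= 0 or b <= 0 or c <= 0:
        return "invalid"
    if a + b <= c or a + c <= b or b + c <= a:
        return "invalid"  # fails the triangle inequality
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"

# One representative test case per equivalence class of the input domain.
partition_tests = {
    "non-positive side":       ((0, 3, 4), "invalid"),
    "violates inequality":     ((1, 2, 10), "invalid"),
    "all sides equal":         ((5, 5, 5), "equilateral"),
    "exactly two sides equal": ((5, 5, 8), "isosceles"),
    "all sides different":     ((3, 4, 5), "scalene"),
}

for name, (args, expected) in partition_tests.items():
    actual = classify_triangle(*args)
    assert actual == expected, f"{name}: got {actual}"
```

Branch Testing, by contrast, would derive test cases from the code itself, requiring inputs that exercise both outcomes of every `if` above; the two techniques can therefore find different defects on the same program.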

Findings

Perceptions and Reality

The results reveal a significant disconnect between perceived and actual technique effectiveness. Strikingly, roughly 50% of participants held incorrect perceptions of which technique was most effective for them. This misalignment carries a tangible cost: relying on the perceived-best rather than the actual-best technique reduced defect detection by an average of 31 percentage points. Notably, the misperceptions show no systematic bias toward any single technique, indicating that perceptions are broadly unreliable and vary from one individual to the next.
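The percentage-point cost can be understood as a per-participant comparison: take the effectiveness of the technique each participant perceived as best, subtract it from the effectiveness of the technique that actually detected the most defects for them, and average. A minimal sketch of this calculation follows; all scores below are invented for illustration and are not the paper's data:

```python
# Each participant: defect-detection effectiveness (%) per technique
# (EP = Equivalence Partitioning, BT = Branch Testing, CR = Code Reading),
# plus the technique they perceived as most effective.
# All numbers are synthetic, for illustration only.
participants = [
    {"scores": {"EP": 80, "BT": 55, "CR": 40}, "perceived_best": "CR"},
    {"scores": {"EP": 45, "BT": 70, "CR": 30}, "perceived_best": "BT"},
    {"scores": {"EP": 60, "BT": 85, "CR": 50}, "perceived_best": "EP"},
]

losses = []
for p in participants:
    actual_best = max(p["scores"].values())
    chosen = p["scores"][p["perceived_best"]]
    losses.append(actual_best - chosen)  # 0 when the perception is correct

avg_loss = sum(losses) / len(losses)
print(f"Average cost of misperception: {avg_loss:.1f} percentage points")
```

In this toy data one participant's perception is correct (loss 0) while the other two pay 40 and 25 points respectively; the paper reports an average cost of this kind of 31 percentage points across its participants.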

Opinions and Bias

Further analyses explored potential drivers of these misperceptions by examining participants' opinions. Surprisingly, neither technique preference (Equivalence Partitioning was the clear favorite) nor perceived complexity accounted for perceptions of effectiveness. Instead, perceptions appear to be driven by participants' self-assessed performance, underscoring a psychological tendency to equate how thoroughly one believes one applied a technique with how effective it actually was. An opinion bias was identifiable only for Equivalence Partitioning, a technique participants favored despite it not consistently being the most effective.

Implications and Recommendations

This study highlights critical implications for both novice developers and the broader software engineering community. Developers should be cautious of relying on personal judgment to select testing techniques, as perceptions are not reliable indicators of technique effectiveness. The findings suggest several strategic actions to ameliorate misperceptions: develop tools to provide immediate feedback to developers on technique effectiveness, enhance access to empirical evidence from studies, and further investigate the specific conditions under which different techniques perform optimally.

Conclusion

This research provides foundational evidence that developers' perceptions of testing effectiveness are often misaligned with real performance. This insight has considerable implications for training and technique selection within the industry. By identifying this gap, the study advocates for systematic integration of empirical evidence into development workflows, ultimately enhancing software quality. Future work should aim to refine the profiling of effective testing strategies depending on code characteristics and defect types, enriching the decision-making toolkit available to practitioners.
