- The paper reviews statistical analysis in ESE research, identifying the dominant methods and trends, and proposes a conceptual model to improve the statistical analysis workflow.
- The study highlights a lack of standardization in how practical significance is reported in ESE research, recommending that both statistical rigor and the practitioner's context be considered.
- Researchers should adopt a holistic statistical approach, integrating tests with context to fully establish empirical validity, as recommended by the paper's conceptual model.
Statistical Analysis Practices in Empirical Software Engineering
The paper titled "Evolution of Statistical Analysis in Empirical Software Engineering Research: Current State and Steps Forward" presents a thorough investigation into the practices and trends of statistical analysis within empirical software engineering (ESE). The authors conduct a detailed review of scholarly works, analyzing both prevalent methodologies and emerging patterns across a substantial corpus of research publications: an initial manual review of 161 papers, subsequently expanded to a semi-automated classification covering 5,196 papers published between 2001 and 2015.
A key contribution of this paper lies in identifying the dominant statistical practices in ESE, such as t-tests and ANOVA, alongside trends in the use of nonparametric tests and effect size measures. Moreover, the authors develop a conceptual model that aims to streamline the statistical analysis workflow, offering structured guidance on applying various statistical methods while flagging common pitfalls. This model serves not only as an analytical framework but also as a source of practical recommendations for enhancing the robustness and interpretability of empirical studies.
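One workflow step of the kind such a model can codify is pairing a test statistic with a matching effect size rather than reporting a p-value alone. The sketch below (illustrative only; the data and function names are made up, not taken from the paper) computes a Mann-Whitney U statistic by direct pair counting and derives the rank-biserial correlation from it:

```python
# Illustrative sketch: pairing a nonparametric test statistic with an
# effect size, reflecting the trends the review identifies.
# All data below are hypothetical.

def mann_whitney_u(a, b):
    """U statistic for sample a vs sample b (ties count as half)."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

def rank_biserial(a, b):
    """Effect size in [-1, 1]; 0 means no stochastic dominance."""
    u = mann_whitney_u(a, b)
    return 2.0 * u / (len(a) * len(b)) - 1.0

# Hypothetical defect-fix times (hours) under two review practices.
treatment = [2.1, 2.4, 1.9, 2.0, 2.2]
control = [2.8, 3.1, 2.6, 3.0, 2.7]

u = mann_whitney_u(treatment, control)
r = rank_biserial(treatment, control)
print(f"U = {u}, rank-biserial r = {r:.2f}")  # every treatment value is
# below every control value, so U = 0 and r = -1 (maximal effect)
```

Reporting the rank-biserial value alongside U gives readers a direct, scale-free sense of how strongly one condition dominates the other, which a p-value by itself cannot convey.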
One of the notable claims made by the authors is the lack of standardization in reporting practical significance in ESE research. Despite the frequent application of statistical tests, practical significance is often left undiscussed, leaving a gap between statistical outcomes and their relevance to real-world software engineering. The paper advocates a dual treatment of practical significance: one grounded in statistical rigor and another that factors in the practitioner's context, thereby bridging the theoretical and practical facets of ESE research.
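The gap the authors describe is easy to demonstrate numerically: with a large enough sample, a trivially small difference becomes statistically significant. The sketch below (hypothetical numbers, not from the paper) uses only the standard library to compute a two-sample z-test p-value next to Cohen's d:

```python
# Illustrative sketch: statistically significant yet practically
# negligible. All numbers below are hypothetical.
import math

def two_sample_z(mean1, mean2, sd1, sd2, n1, n2):
    """Two-sided p-value for a two-sample z-test (large samples)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

def cohens_d(mean1, mean2, sd1, sd2):
    """Standardized mean difference (pooled SD, equal-n form)."""
    pooled = math.sqrt((sd1**2 + sd2**2) / 2)
    return (mean1 - mean2) / pooled

# Hypothetical build times: a 0.5 s difference on a ~10 s mean,
# measured over 50,000 builds per configuration.
z, p = two_sample_z(10.5, 10.0, 4.0, 4.0, 50_000, 50_000)
d = cohens_d(10.5, 10.0, 4.0, 4.0)
print(f"p = {p:.2e}, Cohen's d = {d:.3f}")
```

Here p is vanishingly small while d = 0.125, below the conventional 0.2 threshold for even a "small" effect: exactly the situation where a practitioner-facing report of practical significance matters.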
The implications of this paper are multifaceted. Practically, it urges researchers in the domain to adopt a more holistic approach to statistical analysis, one that integrates statistical tests with contextual understanding to establish empirical validity fully. Theoretically, adoption of the conceptual model and the refined workflow could lead to more standardized practices in future software engineering research, easing collaboration and comparison across studies. It could also motivate the development of AI-driven tools for automated, context-aware statistical analysis, in line with the incremental methodological advances the field has seen.
In conclusion, while the paper does not make claims of introducing novel statistical techniques, its comprehensive review and structured model provide valuable insights and actionable guidelines for empirical researchers in software engineering. Potential future work could explore deeper integration of AI methodologies to further automate the classification and analysis processes in ESE, thereby refining the precision and applicability of statistical evaluations in this evolving field.