
Evaluation and Measurement of Software Process Improvement -- A Systematic Literature Review (2307.13143v1)

Published 24 Jul 2023 in cs.SE

Abstract: BACKGROUND: Software Process Improvement (SPI) is a systematic approach to increase the efficiency and effectiveness of a software development organization and to enhance software products. OBJECTIVE: This paper aims to identify and characterize evaluation strategies and measurements used to assess the impact of different SPI initiatives. METHOD: The systematic literature review includes 148 papers published between 1991 and 2008. The selected papers were classified according to SPI initiative, applied evaluation strategies, and measurement perspectives. Potential confounding factors interfering with the evaluation of the improvement effort were assessed. RESULTS: Seven distinct evaluation strategies were identified, wherein the most common one, "Pre-Post Comparison" was applied in 49 percent of the inspected papers. Quality was the most measured attribute (62 percent), followed by Cost (41 percent), and Schedule (18 percent). Looking at measurement perspectives, "Project" represents the majority with 66 percent. CONCLUSION: The evaluation validity of SPI initiatives is challenged by the scarce consideration of potential confounding factors, particularly given that "Pre-Post Comparison" was identified as the most common evaluation strategy, and the inaccurate descriptions of the evaluation context. Measurements to assess the short and mid-term impact of SPI initiatives prevail, whereas long-term measurements in terms of customer satisfaction and return on investment tend to be less used.

Citations (282)

Summary

  • The paper finds that pre-post comparison is the dominant evaluation strategy (49% of the reviewed papers); the strategy depends on a baseline against which improvement is measured.
  • Quality is the most frequently measured attribute (62% of papers), followed by cost (41%) and schedule (18%).
  • The review highlights gaps in long-term ROI and confounding factor analyses, suggesting a need for more comprehensive frameworks.

Evaluation and Measurement of Software Process Improvement: A Systematic Literature Review

Process improvement has long been a focal point of software engineering for both researchers and practitioners. The paper "Evaluation and Measurement of Software Process Improvement -- A Systematic Literature Review" by Unterkalmsteiner et al. examines the methods and metrics used to evaluate the outcomes of software process improvement (SPI) initiatives. The review covers 148 papers published between 1991 and 2008 and characterizes the evaluation strategies and metrics adopted in the field.

The paper reveals that "Pre-Post Comparison" is the most frequently used evaluation strategy, applied in 49% of the papers. The strategy requires a baseline against which improvement is measured, and a suitable baseline is hard to establish when historical data is absent. The predominance of this strategy indicates heavy reliance on before-and-after comparisons, with little discussion of the confounding factors that could obscure the causal link between the initiative and the observed outcomes.
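The before-and-after logic can be made concrete with a small sketch. The function name `pre_post_change`, the choice of defect density as the metric, and all numbers are hypothetical illustrations, not data or a procedure taken from the paper.

```python
from statistics import mean


def pre_post_change(baseline, post):
    """Relative change of the post-initiative mean against the baseline mean."""
    if not baseline or not post:
        raise ValueError("both periods need at least one observation")
    base = mean(baseline)
    if base == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(post) - base) / base


# Hypothetical defect densities (defects per KLOC), measured on projects
# completed before and after an SPI initiative was introduced.
baseline_density = [4.0, 5.0, 6.0]
post_density = [3.0, 4.0, 5.0]

change = pre_post_change(baseline_density, post_density)
print(f"Relative change in defect density: {change:.1%}")
```

Without historical measurements there is nothing to pass as `baseline`, which is the difficulty the review points out. The result also says nothing about why the metric changed.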

The research identifies various success indicators, with quality (both process and product) as the most measured attribute, reported in 62% of studies. Cost and schedule follow with 41% and 18%, respectively. This distribution points to quality as the main concern when SPI initiatives are evaluated. The review also finds that the ISO 9126-1 product quality attributes are measured unevenly, with reliability measured most often, which suggests that evaluations concentrate on qualities that customers experience directly.

The paper emphasizes a gap in measuring long-term impacts such as return on investment (ROI) and customer satisfaction, which appear less often in the reviewed literature. ROI is reported in only 15% of instances, suggesting that long-term financial metrics take a back seat to more immediate, technical improvements. The paper advocates integrating such measurements to give a more complete view of an SPI initiative's success, aligning technical performance improvements with organizational and financial objectives.
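For readers unfamiliar with the metric, ROI is conventionally computed as net benefit divided by cost. The sketch below uses that common definition; the function name `roi` and the monetary figures are hypothetical, and the reviewed studies may define benefit and cost differently.

```python
def roi(benefit, cost):
    """Return on investment: net benefit expressed as a fraction of cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


# Hypothetical figures: an initiative that cost 50,000 and is estimated to
# have saved 80,000 in rework over the observation period.
spi_cost = 50_000
spi_benefit = 80_000

spi_roi = roi(spi_benefit, spi_cost)
print(f"ROI: {spi_roi:.0%}")
```

The difficult part in practice is estimating `benefit`, since savings from an SPI initiative may only materialize well after the initiative ends.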

The paper also draws attention to how rarely confounding factors are considered when SPI initiatives are evaluated. Few of the reviewed studies discuss them, so improvements may be attributed to an SPI initiative without external influences having been ruled out. This deficiency underscores the need for more robust methodological frameworks that account for these influences, so that evaluations accurately reflect the benefits attributed to SPI efforts.
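One way to see why confounding matters is to compare a naive before-and-after difference with one adjusted by a comparison group. The adjustment shown is a difference-in-differences calculation, a general technique chosen here for illustration and not a method the paper prescribes. All names and numbers are hypothetical.

```python
from statistics import mean


def mean_change(before, after):
    """Difference between the mean after and the mean before."""
    return mean(after) - mean(before)


def adjusted_effect(treated_before, treated_after, control_before, control_after):
    """Change in the treated group minus the change in the comparison group."""
    return (mean_change(treated_before, treated_after)
            - mean_change(control_before, control_after))


# Hypothetical defect densities (defects per KLOC).
# "Treated" projects adopted the SPI initiative; "control" projects did not,
# but both were exposed to the same external change (for example, new tooling).
treated_before, treated_after = [5.0, 5.0], [4.0, 4.0]
control_before, control_after = [5.0, 5.0], [4.5, 4.5]

naive = mean_change(treated_before, treated_after)
adjusted = adjusted_effect(treated_before, treated_after,
                           control_before, control_after)
print(f"Naive pre-post change: {naive}")
print(f"Change adjusted by comparison group: {adjusted}")
```

In this made-up case half of the apparent improvement also occurs in projects that never adopted the initiative, so a plain pre-post comparison would overstate its effect.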

Moreover, the paper reports that measurement is predominantly taken from the project perspective (66%), with fewer organizational-level assessments, which may limit wider organizational learning and improvement. This gap suggests a need for approaches that align project outcomes with broader strategic objectives.

This systematic review has implications for both academia and practice. For researchers, the paper highlights areas where SPI evaluation can be more comprehensive and robust by addressing long-term impacts and potential confounding factors. Practitioners can use these insights to implement SPI evaluations that not only quantify immediate post-initiative outcomes but also align with longer-term strategic goals.

In conclusion, this systematic literature review provides a foundational understanding of current SPI evaluation practices and identifies critical gaps that need filling for more effective organizational learning and improvement. Future research should focus on developing more comprehensive evaluation frameworks that incorporate organizational-, product-, and project-level metrics, alongside detailed confounding factor analyses, to better capture the true impact of SPI initiatives.