Continuous Integration and Software Quality: A Causal Explanatory Study (2309.10205v1)
Abstract: Continuous Integration (CI) is a software engineering practice that aims to reduce the cost and risk of code integration among teams. Recent empirical studies have confirmed associations between CI and software quality (SQ). However, no existing study investigates causal relationships between CI and SQ. This paper investigates such relationships by applying the causal Directed Acyclic Graph (DAG) technique. We combine two other strategies to support this technique: a literature review and a Mining Software Repositories (MSR) study. In the first stage, we review the literature to discover existing associations between CI and SQ, which help us create a "literature-based causal DAG" in the second stage. This DAG encapsulates the literature assumptions regarding CI and its influence on SQ. In the third stage, we mine software repositories to analyze 12 months of activity for 70 open-source projects: 35 CI and 35 no-CI projects. This MSR study is not a typical "correlation is not causation" study because it is used to verify the relationships encoded in the causal DAG produced in the earlier stages. The fourth stage consists of testing the statistical implications of the "literature-based causal DAG" on our dataset. Finally, in the fifth stage, we build a DAG with observations from both the literature and the dataset, the "literature-data DAG". In addition to the direct causal effect of CI on SQ, we find evidence of indirect effects of CI. For example, CI affects teams' communication, which positively impacts SQ. We also highlight the confounding effect of project age.
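To make "testing the statistical implications" of a causal DAG concrete, the sketch below checks one implied conditional independence on simulated data. It is only an illustration under stated assumptions. The three-variable chain `upstream -> mediator -> downstream`, the coefficients, and the helper names are hypothetical and are not the paper's DAG, dataset, or test procedure. Partial correlation is used as one simple linear test, which the abstract does not name.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """First-order partial correlation of xs and ys given zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_chain(n, seed=0):
    """Draw data from a hypothetical linear chain: upstream -> mediator -> downstream."""
    rng = random.Random(seed)
    upstream = [rng.gauss(0, 1) for _ in range(n)]
    mediator = [0.8 * u + rng.gauss(0, 1) for u in upstream]
    downstream = [0.8 * m + rng.gauss(0, 1) for m in mediator]
    return upstream, mediator, downstream


if __name__ == "__main__":
    u, m, d = simulate_chain(2000)
    # The chain implies: upstream and downstream are associated marginally,
    # but independent once the mediator is conditioned on.
    print("marginal corr(upstream, downstream):", round(pearson(u, d), 3))
    print("partial corr given mediator:", round(partial_corr(u, d, m), 3))
```

If the partial correlation stayed far from zero on real data, that would count as evidence against the assumed chain, for example a missing direct edge or an unmeasured confounder. This is the kind of mismatch that would motivate revising a literature-based DAG in light of the dataset.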
Authors: Eliezio Soares, Daniel Alencar da Costa, Uirá Kulesza