Does Using Bazel Help Speed Up Continuous Integration Builds? (2405.00796v1)
Abstract: A long continuous integration (CI) build forces developers to wait for CI feedback before starting subsequent development activities, which wastes time. In addition to the build scheduling and test selection heuristics studied in the past, new artifact-based build technologies like Bazel have built-in support for advanced performance optimizations such as parallel builds and incremental builds (caching of build results). However, little is known about the extent to which new build technologies like Bazel deliver on their promised benefits, especially for long-build duration projects. In this study, we collected 383 Bazel projects from GitHub, studied their use of Bazel's parallel and incremental builds in 4 popular CI services, and compared the results with Maven projects. We conducted 3,500 experiments on the 383 Bazel projects and analyzed the build logs of a subset of 70 buildable projects to evaluate the performance impact of Bazel's parallel builds. Additionally, we performed 102,232 experiments on the last 100 commits of the 70 buildable projects to evaluate Bazel's incremental build performance. Our results show that 31.23% of Bazel projects adopt a CI service but do not use Bazel in the CI service, while among those that do use Bazel in CI, 27.76% use other tools to facilitate Bazel's execution. Compared to sequential builds, the median speedups for long-build duration projects are 2.00x, 3.84x, 7.36x, and 12.80x at parallelism degrees 2, 4, 8, and 16, respectively. Compared to a clean build, applying incremental builds achieves a median speedup of 4.22x (with a build system tool-independent CI cache) and 4.71x (with a build system tool-specific cache) for long-build duration projects. Our results provide guidance for developers to improve the usage of Bazel in their projects.
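To put the reported parallel speedups in perspective, the following minimal Python sketch computes the parallel efficiency (speedup divided by parallelism degree) for each figure quoted in the abstract. It also computes the parallelizable fraction that Amdahl's law would imply if a single build behaved like the reported median. The function names `efficiency` and `amdahl_parallel_fraction` are hypothetical helpers introduced here for illustration, and applying Amdahl's law to medians taken across many projects is an assumption of this sketch, not an analysis drawn from the abstract.

```python
# Median speedups over sequential builds, as reported in the abstract
# for long-build duration projects (parallelism degree -> speedup).
reported = {2: 2.00, 4: 3.84, 8: 7.36, 16: 12.80}


def efficiency(speedup: float, degree: int) -> float:
    """Parallel efficiency: fraction of the ideal linear speedup achieved."""
    return speedup / degree


def amdahl_parallel_fraction(speedup: float, degree: int) -> float:
    """Solve Amdahl's law S = 1 / ((1 - p) + p / n) for p.

    Illustrative only: this treats the cross-project median as if it
    were the speedup of one build, which is an assumption.
    """
    return (1 - 1 / speedup) / (1 - 1 / degree)


if __name__ == "__main__":
    for degree, speedup in reported.items():
        eff = efficiency(speedup, degree)
        frac = amdahl_parallel_fraction(speedup, degree)
        print(f"degree={degree:2d} speedup={speedup:5.2f}x "
              f"efficiency={eff:.2f} implied_parallel_fraction={frac:.3f}")
```

Under these assumptions, efficiency falls from 1.00 at degree 2 to 0.80 at degree 16, which is consistent with diminishing returns as parallelism increases.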