An Empirical Study of Token-based Micro Commits (2405.09165v1)
Abstract: In software development, developers frequently perform maintenance activities that change only a few lines of source code in a single commit. A good understanding of the characteristics of such small changes can support quality assurance approaches (e.g., automated program repair), because small changes likely address deficiencies in other changes; understanding why small changes are created can therefore help identify the types of errors introduced. These reasons and error types can in turn be used to enhance quality assurance approaches and improve code quality. Prior studies used code churn to characterize and investigate small changes, but such a line-based definition has a critical limitation: it loses the information about which tokens changed within a line. For example, it fails to distinguish the following two one-line changes: (1) changing a string literal to fix a displayed message and (2) changing a function call and adding a new parameter. Both are maintenance activities, but we deduce that researchers and practitioners are more interested in supporting the latter. To address this limitation, in this paper we define micro commits, a type of small change based on changed tokens. Our goal is to quantify small changes using changed tokens, which allow us to identify small changes more precisely; in particular, this token-level definition can distinguish the two changes in the example above. As the first empirical study on token-based micro commits, we investigate micro commits in four OSS projects to understand their characteristics. We find that micro commits mainly replace a single name or literal token and that micro commits are more likely to be used to fix bugs. Additionally, we propose using token-based information to support software engineering approaches whose effectiveness is significantly affected by very small changes.
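To make the contrast between line-level and token-level views concrete, the following is a minimal sketch in Python, not the authors' tooling. The regex tokenizer, the `changed_tokens` helper, and the two sample one-line changes are all assumptions made for illustration; the paper's actual tokenizer and its exact definition of a micro commit are not reproduced here.

```python
import difflib
import re

# Hypothetical, simplified tokenizer (an assumption for this sketch):
# double-quoted string literals, identifiers, integers, then any other
# single non-whitespace character.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|[A-Za-z_]\w*|\d+|\S')


def tokenize(line):
    """Split one line of source code into a list of token strings."""
    return TOKEN_RE.findall(line)


def changed_tokens(old_line, new_line):
    """Return (removed, added) token lists for a one-line change."""
    old, new = tokenize(old_line), tokenize(new_line)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(old[i1:i2])
            added.extend(new[j1:j2])
    return removed, added


# (1) A string literal is changed to fix a displayed message.
message_fix = changed_tokens('print("Fiel not found");',
                             'print("File not found");')

# (2) A function call is changed and a new parameter is added.
call_change = changed_tokens('total = compute(price);',
                             'total = calculate(price, tax);')

if __name__ == "__main__":
    print("message fix:", message_fix)
    print("call change:", call_change)
```

A line-based measure counts both samples as one line removed and one line added. The token-level diff separates them: the first replaces a single literal token, while the second replaces a name token and adds a separator and a new argument.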
Authors: Masanari Kondo, Daniel M. German, Yasutaka Kamei, Naoyasu Ubayashi, Osamu Mizuno