
TRIAD: Automated Traceability Recovery based on Biterm-enhanced Deduction of Transitive Links among Artifacts (2312.16854v2)

Published 28 Dec 2023 in cs.SE

Abstract: Traceability allows stakeholders to extract and comprehend the trace links among software artifacts introduced across the software life cycle, providing significant support for software engineering tasks. Despite its proven benefits, software traceability is challenging to recover and maintain manually. Hence, many approaches for automated traceability have been proposed. Most rely on textual similarities among software artifacts, such as those based on Information Retrieval (IR). However, artifacts at different abstraction levels usually have different textual descriptions, which can greatly hinder the performance of IR-based approaches (e.g., a requirement in natural language may have a small textual similarity to a Java class). In this work, we leverage consensual biterms and transitive relationships (i.e., inner- and outer-transitive links) based on intermediate artifacts to improve IR-based traceability recovery. We first extract and filter biterms from all source, intermediate, and target artifacts. We then use the consensual biterms from the intermediate artifacts to extend the biterms of both source and target artifacts, and finally deduce outer- and inner-transitive links to adjust text similarities between source and target artifacts. We conducted a comprehensive empirical evaluation on five systems widely used in the literature to show that our approach can outperform four state-of-the-art approaches, and to examine how its performance is affected by different conditions of source, intermediate, and target artifacts. The results indicate that our approach outperforms the baseline approaches by over 15% in AP and over 10% in MAP on average.
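
The three steps named in the abstract (biterm extraction, enrichment with consensual biterms, and similarity adjustment through intermediate artifacts) can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every function name here is hypothetical, biterms are simplified to unordered pairs of distinct whitespace-separated terms, "consensual" is assumed to mean a biterm shared with an intermediate artifact, and the adjustment rule (take the stronger of the direct score and the best product of scores along a path through an intermediate artifact) is an assumption chosen for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def biterms(text):
    """Return unordered pairs of distinct terms in one artifact (simplified assumption)."""
    terms = sorted(set(text.lower().split()))
    return set(combinations(terms, 2))


def enrich(tokens, own_biterms, intermediate_biterms):
    """Append biterms shared with the intermediate artifacts as extra tokens.

    Treating 'shared with an intermediate artifact' as 'consensual' is an
    assumption of this sketch, not the paper's definition.
    """
    shared = own_biterms & intermediate_biterms
    return tokens + [a + "_" + b for a, b in sorted(shared)]


def cosine(tokens_a, tokens_b):
    """Cosine similarity over raw term counts (a stand-in for an IR model)."""
    ca, cb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def transitive_adjust(direct, source_to_mid, mid_to_target):
    """Raise a direct score to the best source -> intermediate -> target path score.

    The product-and-max rule is a hypothetical adjustment used for illustration.
    """
    best_path = max(
        (s * t for s, t in zip(source_to_mid, mid_to_target)), default=0.0
    )
    return max(direct, best_path)


# Toy artifacts (invented for this example).
requirement = "user uploads flight route"
design = "flight route upload handler"
code = "route handler upload"

req_tokens = enrich(requirement.split(), biterms(requirement), biterms(design))
code_tokens = enrich(code.split(), biterms(code), biterms(design))
design_tokens = design.split()

direct = cosine(req_tokens, code_tokens)
adjusted = transitive_adjust(
    direct,
    [cosine(req_tokens, design_tokens)],
    [cosine(design_tokens, code_tokens)],
)
```

In this sketch the adjusted score can never fall below the direct one, so an intermediate artifact can only promote a candidate link. That is a design choice of the example, not a property claimed in the abstract.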
