Revisiting and Improving Retrieval-Augmented Deep Assertion Generation (2309.10264v1)
Abstract: Unit testing validates the correctness of the unit under test and has become an essential activity in software development process. A unit test consists of a test prefix that drives the unit under test into a particular state, and a test oracle (e.g., assertion), which specifies the behavior in that state. To reduce manual efforts in conducting unit testing, Yu et al. proposed an integrated approach (integration for short), combining information retrieval (IR) with a deep learning-based approach, to generate assertions for a unit test. Despite promising, there is still a knowledge gap as to why or where integration works or does not work. In this paper, we describe an in-depth analysis of the effectiveness of integration. Our analysis shows that: 1) The overall performance of integration is mainly due to its success in retrieving assertions. 2) integration struggles to understand the semantic differences between the retrieved focal-test (focal-test includes a test prefix and a unit under test) and the input focal-test; 3) integration is limited to specific types of edit operations and cannot handle token addition or deletion. To improve the effectiveness of assertion generation, this paper proposes a novel retrieve-and-edit approach named EditAS. Specifically, EditAS first retrieves a similar focal-test from a pre-defined corpus and treats its assertion as a prototype. Then, EditAS reuses the information in the prototype and edits the prototype automatically. EditAS is more generalizable than integration. We conduct experiments on two large-scale datasets and experimental results demonstrate that EditAS outperforms the state-of-the-art approaches, with an average improvement of 10.00%-87.48% and 3.30%-42.65% in accuracy and BLEU score, respectively.
- A. Hartman, “Is ISSTA research relevant to industry?” in Proceedings of the International Symposium on Software Testing and Analysis, ISSTA 2002, Roma, Italy, July 22-24, 2002. ACM, 2002, pp. 205–206. [Online]. Available: https://doi.org/10.1145/566172.566207
- S. Planning, “The economic impacts of inadequate infrastructure for software testing,” National Institute of Standards and Technology, vol. 1, 2002.
- E. Dinella, G. Ryan, T. Mytkowicz, and S. K. Lahiri, “TOGA: A neural method for test oracle generation,” in 44th IEEE/ACM 44th International Conference on Software Engineering, ICSE 2022, Pittsburgh, PA, USA, May 25-27, 2022. ACM, 2022, pp. 2130–2141. [Online]. Available: https://doi.org/10.1145/3510003.3510141
- E. Daka and G. Fraser, “A survey on unit testing practices and problems,” in 25th IEEE International Symposium on Software Reliability Engineering, ISSRE 2014, Naples, Italy, November 3-6, 2014. IEEE Computer Society, 2014, pp. 201–211. [Online]. Available: https://doi.org/10.1109/ISSRE.2014.11
- C. Pacheco and M. D. Ernst, “Randoop: feedback-directed random testing for java,” in Companion to the 22nd Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2007, October 21-25, 2007, Montreal, Quebec, Canada. ACM, 2007, pp. 815–816. [Online]. Available: https://doi.org/10.1145/1297846.1297902
- G. Fraser and A. Arcuri, “Evosuite: automatic test suite generation for object-oriented software,” in SIGSOFT/FSE’11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC’11: 13th European Software Engineering Conference (ESEC-13), Szeged, Hungary, September 5-9, 2011. ACM, 2011, pp. 416–419. [Online]. Available: https://doi.org/10.1145/2025113.2025179
- M. M. Almasi, H. Hemmati, G. Fraser, A. Arcuri, and J. Benefelds, “An industrial evaluation of unit test generation: Finding real faults in a financial application,” in 39th IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice Track, ICSE-SEIP 2017, Buenos Aires, Argentina, May 20-28, 2017. IEEE Computer Society, 2017, pp. 263–272. [Online]. Available: https://doi.org/10.1109/ICSE-SEIP.2017.27
- C. Watson, M. Tufano, K. Moran, G. Bavota, and D. Poshyvanyk, “On learning meaningful assert statements for unit test cases,” in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, ser. ICSE ’20. New York, NY, USA: Association for Computing Machinery, 2020, p. 1398–1409. [Online]. Available: https://doi.org/10.1145/3377811.3380429
- H. Yu, Y. Lou, K. Sun, D. Ran, T. Xie, D. Hao, Y. Li, G. Li, and Q. Wang, “Automated assertion generation via information retrieval and its integration with deep learning,” in Proceedings of the 44th International Conference on Software Engineering, ser. ICSE ’22. New York, NY, USA: Association for Computing Machinery, 2022, p. 163–174. [Online]. Available: https://doi.org/10.1145/3510003.3510149
- X. Gu, H. Zhang, D. Zhang, and S. Kim, “Deep api learning,” in Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, 2016, pp. 631–642.
- X. Gu, H. Zhang, and S. Kim, “Deep code search,” in Proceedings of the 40th International Conference on Software Engineering, 2018, pp. 933–944.
- Z. Chen, S. Kommrusch, M. Tufano, L. Pouchet, D. Poshyvanyk, and M. Monperrus, “Sequencer: Sequence-to-sequence learning for end-to-end program repair,” IEEE Trans. Software Eng., vol. 47, no. 9, pp. 1943–1959, 2021. [Online]. Available: https://doi.org/10.1109/TSE.2019.2940179
- H. Hata, E. Shihab, and G. Neubig, “Learning to generate corrective patches using neural machine translation,” CoRR, vol. abs/1812.07170, 2018. [Online]. Available: http://arxiv.org/abs/1812.07170
- A. Mesbah, A. Rice, E. Johnston, N. Glorioso, and E. Aftandilian, “Deepdelta: learning to repair compilation errors,” in Proceedings of the ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2019, Tallinn, Estonia, August 26-30, 2019. ACM, 2019, pp. 925–936. [Online]. Available: https://doi.org/10.1145/3338906.3340455
- M. Tufano, C. Watson, G. Bavota, M. D. Penta, M. White, and D. Poshyvanyk, “An empirical investigation into learning bug-fixing patches in the wild via neural machine translation,” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, Montpellier, France, September 3-7, 2018. ACM, 2018, pp. 832–837. [Online]. Available: https://doi.org/10.1145/3238147.3240732
- Q. Zhang, C. Fang, Y. Ma, W. Sun, and Z. Chen, “A survey of learning-based automated program repair,” CoRR, vol. abs/2301.03270, 2023. [Online]. Available: https://doi.org/10.48550/arXiv.2301.03270
- Y. Lou, Q. Zhu, J. Dong, X. Li, Z. Sun, D. Hao, L. Zhang, and L. Zhang, “Boosting coverage-based fault localization via graph-based representation learning,” in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ser. ESEC/FSE 2021. New York, NY, USA: Association for Computing Machinery, 2021, p. 664–676. [Online]. Available: https://doi.org/10.1145/3468264.3468580
- X. Hu, G. Li, X. Xia, D. Lo, and Z. Jin, “Deep code comment generation,” in Proceedings of the 26th Conference on Program Comprehension, ICPC 2018, Gothenburg, Sweden, May 27-28, 2018. ACM, 2018, pp. 200–210. [Online]. Available: https://doi.org/10.1145/3196321.3196334
- B. Li, M. Yan, X. Xia, X. Hu, G. Li, and D. Lo, “Deepcommenter: a deep code comment generation tool with hybrid lexical and syntactical information,” in ESEC/FSE ’20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Virtual Event, USA, November 8-13, 2020. ACM, 2020, pp. 1571–1575. [Online]. Available: https://doi.org/10.1145/3368089.3417926
- X. Hu, G. Li, X. Xia, D. Lo, S. Lu, and Z. Jin, “Summarizing source code with transferred API knowledge,” in Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden. ijcai.org, 2018, pp. 2269–2275. [Online]. Available: https://doi.org/10.24963/ijcai.2018/314
- J. Zhang, X. Wang, H. Zhang, H. Sun, and X. Liu, “Retrieval-based neural source code summarization,” in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, pp. 1385–1397.
- W. Wang, G. Li, B. Ma, X. Xia, and Z. Jin, “Detecting code clones with graph neural network and flow-augmented abstract syntax tree,” in 27th IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2020, London, ON, Canada, February 18-21, 2020. IEEE, 2020, pp. 261–271. [Online]. Available: https://doi.org/10.1109/SANER48275.2020.9054857
- H. Wei and M. Li, “Supervised deep features for software functional clone detection by exploiting lexical and syntactical information in source code,” in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017. ijcai.org, 2017, pp. 3034–3040. [Online]. Available: https://doi.org/10.24963/ijcai.2017/423
- H. Yu, W. Lam, L. Chen, G. Li, T. Xie, and Q. Wang, “Neural detection of semantic code clones via tree-based convolution,” in Proceedings of the 27th International Conference on Program Comprehension, ICPC 2019, Montreal, QC, Canada, May 25-31, 2019, Y. Guéhéneuc, F. Khomh, and F. Sarro, Eds. IEEE / ACM, 2019, pp. 70–80. [Online]. Available: https://doi.org/10.1109/ICPC.2019.00021
- J. Zhang, X. Wang, H. Zhang, H. Sun, K. Wang, and X. Liu, “A novel neural source code representation based on abstract syntax tree,” in Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019. IEEE / ACM, 2019, pp. 783–794. [Online]. Available: https://doi.org/10.1109/ICSE.2019.00086
- G. Zhao and J. Huang, “Deepsim: deep learning code functional similarity,” in Proceedings of the 2018 ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/SIGSOFT FSE 2018, Lake Buena Vista, FL, USA, November 04-09, 2018. ACM, 2018, pp. 141–151. [Online]. Available: https://doi.org/10.1145/3236024.3236068
- M. Tufano, D. Drain, A. Svyatkovskiy, and N. Sundaresan, “Generating accurate assert statements for unit test cases using pretrained transformers,” in IEEE/ACM International Conference on Automation of Software Test, AST@ICSE 2022, Pittsburgh, PA, USA, May 21-22, 2022. ACM/IEEE, 2022, pp. 54–64. [Online]. Available: https://doi.org/10.1145/3524481.3527220
- A. Mastropaolo, S. Scalabrino, N. Cooper, D. Nader-Palacio, D. Poshyvanyk, R. Oliveto, and G. Bavota, “Studying the usage of text-to-text transfer transformer to support code-related tasks,” in 43rd IEEE/ACM International Conference on Software Engineering, ICSE 2021, Madrid, Spain, 22-30 May 2021. IEEE, 2021, pp. 336–347. [Online]. Available: https://doi.org/10.1109/ICSE43902.2021.00041
- A. Mastropaolo, N. Cooper, D. Nader-Palacio, S. Scalabrino, D. Poshyvanyk, R. Oliveto, and G. Bavota, “Using transfer learning for code-related tasks,” IEEE Trans. Software Eng., vol. 49, no. 4, pp. 1580–1598, 2023. [Online]. Available: https://doi.org/10.1109/TSE.2022.3183297
- “ToGA Artifact.” https://github.com/microsoft/toga, 2022.
- “A issue bug of ToGA Artifact.” https://github.com/microsoft/toga/issues/3, 2022.
- Z. Liu, X. Xia, A. E. Hassan, D. Lo, Z. Xing, and X. Wang, “Neural-machine-translation-based commit message generation: how far are we?” in Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018, Montpellier, France, September 3-7, 2018. ACM, 2018, pp. 373–384. [Online]. Available: https://doi.org/10.1145/3238147.3238190
- “Integration Artifact.” https://github.com/yh1105/Artifact-of-Assertion-ICSE22, 2022.
- S. Merity, N. S. Keskar, and R. Socher, “Regularizing and optimizing LSTM language models,” in 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018. [Online]. Available: https://openreview.net/forum?id=SyyGPP0TZ
- W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization,” CoRR, vol. abs/1409.2329, 2014. [Online]. Available: http://arxiv.org/abs/1409.2329
- C. Thunes, “Javalang.” https://github.com/c2nes/javalang, 2019.
- J. Li, Y. Li, G. Li, X. Hu, X. Xia, and Z. Jin, “Editsum: A retrieve-and-edit framework for source code summarization,” in 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, Melbourne, Australia, November 15-19, 2021. IEEE, 2021, pp. 155–166. [Online]. Available: https://doi.org/10.1109/ASE51524.2021.9678724
- Z. Liu, X. Xia, M. Yan, and S. Li, “Automating just-in-time comment updating,” in 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020, Melbourne, Australia, September 21-25, 2020. IEEE, 2020, pp. 585–597. [Online]. Available: https://doi.org/10.1145/3324884.3416581
- Z. Liu, X. Xia, D. Lo, M. Yan, and S. Li, “Just-in-time obsolete comment detection and update,” IEEE Trans. Software Eng., vol. 49, no. 1, pp. 1–23, 2023. [Online]. Available: https://doi.org/10.1109/TSE.2021.3138909
- E. Grave, P. Bojanowski, P. Gupta, A. Joulin, and T. Mikolov, “Learning word vectors for 157 languages,” in Proceedings of the Eleventh International Conference on Language Resources and Evaluation, LREC 2018, Miyazaki, Japan, May 7-12, 2018. European Language Resources Association (ELRA), 2018. [Online]. Available: http://www.lrec-conf.org/proceedings/lrec2018/summaries/627.html
- S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
- T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015. The Association for Computational Linguistics, 2015, pp. 1412–1421. [Online]. Available: https://doi.org/10.18653/v1/d15-1166
- “Word vectors for 157 languages.” https://fasttext.cc/docs/en/crawl-vectors.html, 2023.
- D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. [Online]. Available: http://arxiv.org/abs/1412.6980
- N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, 2014. [Online]. Available: https://dl.acm.org/doi/10.5555/2627435.2670313
- https://pytorch.org/.
- L. R. Dice, “Measures of the amount of ecologic association between species,” Ecology, vol. 26, no. 3, pp. 297–302, 1945.
- W. contributors., “Overlap—Wikipedia.” https://en.wikipedia.org/w/index.php?title=Overlap&oldid=1061948530, 2021.
- F. Hassan and X. Wang, “Hirebuild: an automatic approach to history-driven repair of build scripts,” in Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27 - June 03, 2018, M. Chaudron, I. Crnkovic, M. Chechik, and M. Harman, Eds. ACM, 2018, pp. 1078–1089. [Online]. Available: https://doi.org/10.1145/3180155.3180181
- Y. Lou, Z. Chen, Y. Cao, D. Hao, and L. Zhang, “Understanding build issue resolution in practice: symptoms and fix patterns,” in ESEC/FSE ’20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Virtual Event, USA, November 8-13, 2020, P. Devanbu, M. B. Cohen, and T. Zimmermann, Eds. ACM, 2020, pp. 617–628. [Online]. Available: https://doi.org/10.1145/3368089.3409760