
PreciseBugCollector: Extensible, Executable and Precise Bug-fix Collection (2309.06229v4)

Published 12 Sep 2023 in cs.SE and cs.PL

Abstract: Bug datasets are vital for enabling deep learning techniques to address software maintenance tasks related to bugs. However, existing bug datasets suffer from limitations in precision and scale: they are either small-scale but precise, thanks to manual validation, or large-scale but imprecise, relying on simple commit-message processing. In this paper, we introduce PreciseBugCollector, a precise, multi-language bug collection approach that overcomes both limitations. PreciseBugCollector is based on two novel components: a) a bug tracker that maps codebase repositories to external bug repositories to trace bug type information, and b) a bug injector that generates project-specific bugs by injecting noise into correct codebases and then executing them against their test suites to obtain test failure messages. We implement PreciseBugCollector against three sources: 1) a bug tracker linked to the National Vulnerability Database (NVD) to collect general vulnerabilities, 2) a bug tracker linked to OSS-Fuzz to collect general bugs, and 3) a bug injector based on 16 injection rules to generate project-specific bugs. To date, PreciseBugCollector comprises 1,057,818 bugs extracted from 2,968 open-source projects. Of these, 12,602 bugs are sourced from bug repositories (NVD and OSS-Fuzz), while the remaining 1,045,216 project-specific bugs are generated by the bug injector. Considering the challenge objectives, we argue that a bug injection approach is highly valuable in industrial settings, since project-specific bugs align with domain knowledge, share the same codebase, and adhere to the coding style employed in industrial projects.

Authors (3)
  1. He Ye (16 papers)
  2. Zimin Chen (13 papers)
  3. Claire Le Goues (34 papers)
Citations (1)

