Enhancing Redundancy-based Automated Program Repair by Fine-grained Pattern Mining (2312.15955v1)

Published 26 Dec 2023 in cs.SE

Abstract: Redundancy-based automated program repair (APR), which generates patches by referencing existing source code, has gained much attention because it is effective in repairing real-world bugs and offers good interpretability. However, existing approaches either require multi-line similar code to exist or reference existing code at random, so they repair only a small number of bugs while producing many incorrect patches, which hinders their wide application in practice. In this work, we aim to improve the effectiveness of redundancy-based APR by exploring more effective source code reuse methods that increase the number of correct patches and reduce the number of incorrect ones. Specifically, we propose a new repair technique named Repatt, which incorporates a two-level pattern mining process (at the token and expression levels) to guide patch generation. We conducted an extensive experiment on the widely used Defects4J benchmark and compared Repatt with eight state-of-the-art APR approaches. The results show that, given perfect fault localization, our approach complements existing approaches by repairing 15 unique bugs compared with the latest deep learning-based methods and 19 unique bugs compared with traditional repair methods. In addition, when perfect fault localization is unavailable, as is the case in real practice, Repatt significantly outperforms the baseline approaches by achieving much higher patch precision, i.e., 83.8%. Moreover, we propose an effective patch ranking strategy that combines the strengths of Repatt and the baseline methods. The combined approach repairs 124 bugs when only the Top-1 patches are considered, which is 39 more bugs than the best-performing repair method. The results demonstrate the effectiveness of our approach for practical use.
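
As a rough illustration of the patch precision metric mentioned in the abstract, the sketch below assumes the definition commonly used in APR evaluations: the fraction of plausible patches (those that pass the test suite) that are also correct. The abstract does not spell out this definition, so it is an assumption here, and the function name `patch_precision` and the counts in the usage line are hypothetical, not figures from the paper.

```python
def patch_precision(num_correct: int, num_plausible: int) -> float:
    """Return the share of plausible patches that are correct.

    Assumption: "plausible" means the patch passes all available tests,
    and "correct" means it is also judged semantically correct.
    """
    if num_plausible <= 0:
        raise ValueError("num_plausible must be positive")
    if not 0 <= num_correct <= num_plausible:
        raise ValueError("num_correct must lie between 0 and num_plausible")
    return num_correct / num_plausible


# Hypothetical counts for illustration only (not taken from the paper).
example = patch_precision(num_correct=8, num_plausible=10)
print(f"precision = {example:.1%}")
```

Under this reading, a higher precision means fewer test-passing but incorrect patches reach the developer.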

Authors (6)
  1. Jiajun Jiang (15 papers)
  2. Zijie Zhao (9 papers)
  3. Zhirui Ye (3 papers)
  4. Bo Wang (823 papers)
  5. Hongyu Zhang (147 papers)
  6. Junjie Chen (89 papers)