IPSynth: Interprocedural Program Synthesis for Software Security Implementation (2403.10836v1)
Abstract: To implement important quality attributes of software, such as architectural security tactics, developers use the APIs of software frameworks as building blocks, which avoids re-inventing the wheel and improves productivity. However, this is a challenging and error-prone task, especially for novice programmers. Despite advances in API-based program synthesis, the state of the art has two shortcomings when applied to architectural tactic implementation. First, the specification of the desired tactic must be stated explicitly, which is beyond the knowledge of such programmers. Second, these approaches synthesize a single block of code and leave the rest to the programmer: breaking the block into smaller pieces, adding each piece at the proper location in the code, and establishing correct dependencies between each piece, its surrounding environment, and the other pieces. To mitigate these challenges, we introduce IPSynth, a novel inter-procedural program synthesis approach that automatically learns the specification of the tactic, synthesizes the tactic as inter-related code snippets, and adds them to an existing code base. This paper extends our extended abstract, which won first place in the research competition track of the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE'21). We provide the details of the approach, present the results of the experimental evaluation of IPSynth, and offer analyses and insights for a more comprehensive exploration of the research topic. We also compare our approach against ChatGPT, one of the most powerful code generation tools. Our results show that our approach can accurately locate the corresponding spots in the program, synthesize the needed code snippets, add them to the program, and outperform ChatGPT in inter-procedural tactic synthesis tasks.
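To make the second shortcoming concrete, the following toy sketch shows what it means for a tactic to be a set of inter-related snippets rather than one block. It is not IPSynth's algorithm. The `Snippet` class, the `weave` function, the method names, and the dictionary-based program model are all hypothetical and introduced only for illustration. The code strings borrow JAAS's `LoginContext` calls as plain text and are never executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snippet:
    """One piece of a tactic, destined for one method (hypothetical model)."""
    target_method: str                 # existing method that receives the code
    code: str                          # statement to add, kept as plain text
    defines: frozenset = frozenset()   # names this snippet makes available
    uses: frozenset = frozenset()      # names this snippet needs beforehand


def weave(program, snippets, call_order):
    """Append each snippet to its target method in a copy of `program`.

    `call_order` lists method names in the order they run (an assumption of
    this toy model). A snippet is rejected if it uses a name that no earlier
    snippet has defined, which mimics a broken inter-procedural dependency.
    """
    woven = {name: list(body) for name, body in program.items()}
    rank = {name: i for i, name in enumerate(call_order)}
    available = set()
    # sorted() is stable, so snippets for the same method keep their order.
    for snippet in sorted(snippets, key=lambda s: rank[s.target_method]):
        missing = snippet.uses - available
        if missing:
            raise ValueError(
                f"{snippet.target_method}: undefined {sorted(missing)}"
            )
        woven[snippet.target_method].append(snippet.code)
        available |= snippet.defines
    return woven


# A three-method program and an authentication tactic split across it.
program = {
    "main": ["startServer();"],
    "handleLogin": ["readCredentials();"],
    "handleRequest": ["serve();"],
}
tactic = [
    Snippet("handleRequest", "subject = loginContext.getSubject();",
            defines=frozenset({"subject"}), uses=frozenset({"loginContext"})),
    Snippet("main", 'loginContext = new LoginContext("App", handler);',
            defines=frozenset({"loginContext"})),
    Snippet("handleLogin", "loginContext.login();",
            uses=frozenset({"loginContext"})),
]
woven = weave(program, tactic, ["main", "handleLogin", "handleRequest"])
```

In this model, a single-block synthesizer corresponds to emitting all three statements into one place; the manual work the abstract describes is exactly the splitting, placement, and dependency checking that `weave` stands in for.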