An Investigation into Misuse of Java Security APIs by Large Language Models (2404.03823v1)

Published 4 Apr 2024 in cs.CR, cs.CL, and cs.CY

Abstract: The increasing trend of using Large Language Models (LLMs) for code generation raises the question of their capability to generate trustworthy code. While many researchers are exploring the utility of code generation for uncovering software vulnerabilities, one crucial but often overlooked aspect is the use of security Application Programming Interfaces (APIs). APIs play an integral role in upholding software security, yet effectively integrating security APIs presents substantial challenges. This leads to inadvertent misuse by developers, thereby exposing software to vulnerabilities. To overcome these challenges, developers may seek assistance from LLMs. In this paper, we systematically assess ChatGPT's trustworthiness in code generation for security API use cases in Java. To conduct a thorough evaluation, we compile an extensive collection of 48 programming tasks for 5 widely used security APIs. We employ both automated and manual approaches to effectively detect security API misuse in the code generated by ChatGPT for these tasks. Our findings are concerning: around 70% of the code instances across 30 attempts per task contain security API misuse, with 20 distinct misuse types identified. Moreover, for roughly half of the tasks, this rate reaches 100%, indicating that there is a long way to go before developers can rely on ChatGPT to securely implement security API code.
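
To make "security API misuse" concrete, the following is a minimal Java sketch of one commonly cited misuse of the Java cryptography API: encrypting with AES in ECB mode, contrasted with AES-GCM using a random IV. The abstract does not name the 20 misuse types the authors identified, so the choice of ECB is an assumption made purely for illustration and is not taken from the paper's results. The class name `MisuseSketch` and all of its helper methods are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical illustration class; not code from the paper.
final class MisuseSketch {

    // Generates a fresh 128-bit AES key.
    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Misuse (illustrative assumption): ECB mode maps identical plaintext
    // blocks to identical ciphertext blocks, leaking plaintext structure.
    static byte[] encryptEcb(SecretKey key, byte[] plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/ECB/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, key);
            return cipher.doFinal(plaintext);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Safer alternative: AES-GCM with a random 12-byte IV and a 128-bit tag.
    // The IV is prepended to the returned ciphertext.
    static byte[] encryptGcm(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns true when the two 16-byte blocks starting at offset are equal.
    static boolean repeatsFirstBlock(byte[] data, int offset) {
        byte[] first = Arrays.copyOfRange(data, offset, offset + 16);
        byte[] second = Arrays.copyOfRange(data, offset + 16, offset + 32);
        return Arrays.equals(first, second);
    }
}
```

Encrypting a 32-byte all-zero plaintext with `encryptEcb` produces two identical leading ciphertext blocks, which `repeatsFirstBlock(ciphertext, 0)` detects. The same check on the `encryptGcm` output, starting at offset 12 to skip the IV, is not expected to find a repeat.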

Authors (5)
  1. Zahra Mousavi (6 papers)
  2. Chadni Islam (10 papers)
  3. Kristen Moore (36 papers)
  4. Alsharif Abuadbba (48 papers)
  5. Muhammad Ali Babar (35 papers)
