FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs (2403.18403v2)
Abstract: Analyzing the behavior of cryptographic functions in stripped binaries is a challenging but essential task. Cryptographic algorithms are logically more complex than typical code, yet analyzing them is unavoidable in areas such as virus analysis and legacy code inspection. Existing methods often rely on data or structural pattern matching, which generalizes poorly and requires substantial manual effort. In this paper, we propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries. In FoC, we first build a binary LLM (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language. The predictions of FoC-BinLLM are insensitive to minor changes, such as vulnerability patches. To compensate for this, we further build a binary code similarity model (FoC-Sim) on top of FoC-BinLLM to create change-sensitive representations, and use it to retrieve similar implementations of unknown cryptographic functions from a database. In addition, we construct a cryptographic binary dataset for evaluation and to facilitate further research in this domain, and we devise an automated method to create semantic labels for a large number of binary functions. Evaluation results demonstrate that FoC-BinLLM outperforms ChatGPT by 14.61% on the ROUGE-L score, and that FoC-Sim outperforms the previous best methods with a 52% higher Recall@1. Furthermore, our method shows practical ability in virus analysis and 1-day vulnerability detection.
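To make the retrieval setting and the Recall@1 metric concrete, the following is a minimal sketch of similarity-based lookup over function embeddings. It is not the FoC-Sim implementation: the cosine-similarity ranking, the three-dimensional toy vectors, the function labels, and the helper names `cosine`, `retrieve`, and `recall_at_k` are all assumptions chosen for illustration, and the paper's actual evaluation protocol may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_vec, database, k=1):
    """Return the labels of the k database entries most similar to the query."""
    ranked = sorted(database, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [label for label, _ in ranked[:k]]

def recall_at_k(queries, database, k=1):
    """Fraction of queries whose true label appears among the top-k results."""
    hits = sum(1 for true_label, vec in queries if true_label in retrieve(vec, database, k))
    return hits / len(queries)

# Hypothetical embeddings of known functions (labels and values are made up).
DATABASE = [
    ("aes_encrypt", [0.9, 0.1, 0.0]),
    ("sha256_update", [0.1, 0.9, 0.1]),
    ("rsa_sign", [0.0, 0.2, 0.9]),
]

# Hypothetical embeddings of unknown functions, paired with their true labels.
QUERIES = [
    ("aes_encrypt", [0.8, 0.2, 0.1]),
    ("sha256_update", [0.2, 0.7, 0.2]),
    ("rsa_sign", [0.6, 0.1, 0.5]),  # deliberately ambiguous query
]

if __name__ == "__main__":
    print("Recall@1:", recall_at_k(QUERIES, DATABASE, k=1))
    print("Recall@2:", recall_at_k(QUERIES, DATABASE, k=2))
```

In this toy setup the third query ranks `aes_encrypt` above its true label, so it counts as a miss at k=1 and a hit at k=2. Recall@1 only rewards a model when the correct implementation is the single top-ranked candidate.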