BinGo: Identifying Security Patches in Binary Code with Graph Representation Learning (2312.07921v1)

Published 13 Dec 2023 in cs.CR and cs.SE

Abstract: Timely software updates are vital to combat the growing number of security vulnerabilities. However, some software vendors patch vulnerabilities silently, without creating CVE entries or even describing the security issue in their change logs. It is therefore critical to identify these hidden security patches and defeat potential N-day attacks. Researchers have employed various machine learning techniques to identify security patches in open-source software, leveraging the syntactic and semantic features of the software changes and commit messages. However, none of these solutions can be directly applied to binary code, whose instructions and program flow may vary dramatically under different compilation configurations. In this paper, we propose BinGo, a new security patch detection system for binary code. The main idea is to represent the binary code as code property graphs, which enables a comprehensive understanding of program flow, and to apply a large language model (LLM) to each basic block of binary code to capture the instruction semantics. BinGo consists of four phases: patch data pre-processing, graph extraction, embedding generation, and graph representation learning. Because no binary security patch dataset exists, we construct one by compiling the pre-patch and post-patch source code of the Linux kernel. Our experimental results show that BinGo can achieve up to 80.77% accuracy in identifying security patches between two neighboring versions of binary code. Moreover, BinGo can effectively reduce the false positives and false negatives caused by different compilers and optimization levels.
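The pipeline described in the abstract (graphs of basic blocks, a per-block embedding, then learning over the graph) can be illustrated with a toy sketch. This is not BinGo's implementation: the names `BlockGraph`, `embed_block`, `message_pass`, `graph_vector`, and `patch_features` are hypothetical, a hashed bag of opcodes stands in for the language-model embedding, a single round of mean aggregation stands in for graph representation learning, and the two small graphs are made-up inputs rather than data from the paper.

```python
from dataclasses import dataclass, field
import hashlib

DIM = 8  # toy embedding width (an assumption, not a value from the paper)


@dataclass
class BlockGraph:
    """Basic blocks (lists of instruction strings) plus directed flow edges."""
    blocks: dict                                 # block id -> list of instructions
    edges: list = field(default_factory=list)    # (source id, destination id)


def embed_block(instructions):
    """Stand-in for a language-model embedding: a hashed bag of opcodes."""
    vec = [0.0] * DIM
    for ins in instructions:
        opcode = ins.split()[0]
        digest = hashlib.sha256(opcode.encode()).hexdigest()
        vec[int(digest, 16) % DIM] += 1.0
    return vec


def message_pass(graph, feats):
    """One round of mean aggregation over each block and its predecessors."""
    out = {}
    for node in graph.blocks:
        group = [feats[node]] + [feats[s] for s, d in graph.edges if d == node]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out


def graph_vector(graph):
    """Embed every block, propagate once, then mean-pool into one vector."""
    feats = {n: embed_block(ins) for n, ins in graph.blocks.items()}
    feats = message_pass(graph, feats)
    return [sum(col) / len(feats) for col in zip(*feats.values())]


def patch_features(pre, post):
    """Difference between post-patch and pre-patch graph vectors."""
    return [b - a for a, b in zip(graph_vector(pre), graph_vector(post))]


# Hypothetical pre-patch function: dereferences a pointer without a check.
pre = BlockGraph({0: ["mov eax, [rdi]", "ret"]})
# Hypothetical post-patch function: adds a null check before the dereference.
post = BlockGraph(
    {
        0: ["test rdi, rdi", "je 2"],
        1: ["mov eax, [rdi]", "ret"],
        2: ["xor eax, eax", "ret"],
    },
    [(0, 1), (0, 2)],
)

features = patch_features(pre, post)
```

In a real system, a vector like `features` would be fed to a trained classifier that labels the change as a security or non-security patch; that step is omitted here because the abstract does not describe the model in detail.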


Authors (7)

Xu He
Shu Wang
Pengbin Feng
Xinda Wang
Shiyu Sun
Qi Li
Kun Sun
