
Dynamic Neural Control Flow Execution: An Agent-Based Deep Equilibrium Approach for Binary Vulnerability Detection (2404.08562v1)

Published 3 Apr 2024 in cs.CR, cs.AI, and cs.LG

Abstract: Software vulnerabilities are a challenge in cybersecurity. Manual security patches are often difficult and slow to deploy, while new vulnerabilities continue to be created. Binary code vulnerability detection is less studied and more complex than source code vulnerability detection, and this has important practical implications. Deep learning has become an efficient and powerful tool in the security domain, where it provides accurate end-to-end prediction. Modern deep learning approaches learn program semantics through sequence and graph neural networks, using various intermediate representations of programs, such as abstract syntax trees (AST) or control flow graphs (CFG). Due to the complex nature of program execution, the output of an execution depends on many program states and inputs. In addition, a CFG generated from static analysis can overestimate the true program flow. Moreover, the size of programs often does not allow a graph neural network with a fixed number of layers to aggregate global information. To address these issues, we propose DeepEXE, an agent-based implicit neural network that mimics the execution path of a program. We use reinforcement learning to enhance the branching decision at every program state transition and create a dynamic environment to learn the dependency between a vulnerability and certain program states. An implicitly defined neural network enables nearly infinite state transitions until convergence, which captures structural information at a higher level. The experiments are conducted on two semi-synthetic and two real-world datasets. We show that DeepEXE is an accurate and efficient method that outperforms state-of-the-art vulnerability detection methods.
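
The idea of an implicitly defined network that runs state transitions "until convergence" can be illustrated with a small fixed-point iteration over a toy control flow graph. This is only a sketch of the general deep-equilibrium idea, not DeepEXE itself: the update rule `z = tanh(gamma * P z + b)`, the damping factor `gamma`, the scalar node states, and the four-block graph are all assumptions chosen for illustration. With a row-normalized transition matrix `P` and `gamma < 1`, the update is a contraction, so the iteration settles on a unique equilibrium state regardless of how many steps a fixed-depth network would have allowed.

```python
import math

def row_normalize(adj):
    """Divide each row by its sum so non-empty rows sum to 1 (all-zero rows stay zero)."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([v / s if s else 0.0 for v in row])
    return out

def fixed_point(adj, features, gamma=0.8, tol=1e-8, max_iter=1000):
    """Iterate z <- tanh(gamma * P z + b) until the largest change falls below tol.

    adj      : adjacency matrix of a (hypothetical) control flow graph
    features : one scalar input feature b_i per basic block (illustrative)
    gamma    : damping factor; gamma < 1 makes the update a contraction
    Returns the equilibrium states and the number of iterations used.
    """
    p = row_normalize(adj)
    z = [0.0] * len(features)
    for step in range(1, max_iter + 1):
        new_z = [
            math.tanh(gamma * sum(p_ij * z_j for p_ij, z_j in zip(row, z)) + b)
            for row, b in zip(p, features)
        ]
        residual = max(abs(a - c) for a, c in zip(new_z, z))
        z = new_z
        if residual < tol:
            return z, step
    return z, max_iter

# Toy CFG with a loop: block 0 -> 1, block 1 -> 2 or 3, block 2 -> 1, block 3 is the exit.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
features = [0.5, -0.2, 0.1, 0.3]

z_star, steps = fixed_point(adj, features)
print("equilibrium states:", [round(v, 4) for v in z_star])
print("iterations:", steps)
```

The number of iterations here is determined by the tolerance rather than by a layer count, which is the property the abstract appeals to when it contrasts implicit networks with fixed-layer graph neural networks.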

