
Beyond Fidelity: Explaining Vulnerability Localization of Learning-based Detectors (2401.02686v2)

Published 5 Jan 2024 in cs.CR, cs.LG, and cs.SE

Abstract: Vulnerability detectors based on deep learning (DL) models have proven effective in recent years. However, the opacity of these detectors' decision-making process makes their predictions difficult for security analysts to comprehend. To address this, various explanation approaches have been proposed that explain predictions by highlighting important features, and they have been shown effective in other domains such as computer vision and natural language processing. Unfortunately, an in-depth evaluation of the vulnerability-critical features these explanation approaches learn and surface, such as fine-grained vulnerability-related code lines, is still lacking. In this study, we first evaluate the performance of ten explanation approaches on vulnerability detectors based on graph and sequence representations, measured by two quantitative metrics: fidelity and vulnerability line coverage rate. Our results show that fidelity alone is insufficient for evaluating these approaches, as it fluctuates significantly across datasets and detectors. We then examine the precision of the vulnerability-related code lines reported by the explanation approaches and find that all of them perform poorly on this task. We attribute this to the inefficiency of explainers in selecting important features and to the irrelevant artifacts learned by DL-based detectors.
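The two metrics named in the abstract can be sketched as follows. The function names and exact formulations below are illustrative assumptions, not the paper's definitions: line coverage rate is taken here as the fraction of ground-truth vulnerable lines that appear among the explainer's reported lines, and fidelity as the drop in the detector's prediction score after masking the features an explainer marked important.

```python
# Hypothetical sketch of the two evaluation metrics; the paper's
# precise definitions may differ in normalization and direction.

def line_coverage_rate(reported_lines, vulnerable_lines):
    """Fraction of ground-truth vulnerable lines covered by the
    lines an explainer reports (assumed definition)."""
    if not vulnerable_lines:
        return 0.0
    hits = set(reported_lines) & set(vulnerable_lines)
    return len(hits) / len(vulnerable_lines)

def fidelity(original_score, masked_score):
    """Drop in the detector's prediction score once the features
    the explainer marked important are removed (assumed definition)."""
    return original_score - masked_score

# Example: an explainer reports lines 3, 7, 12; lines 7 and 9 are vulnerable.
cov = line_coverage_rate([3, 7, 12], [7, 9])  # covers 1 of 2 vulnerable lines
fid = fidelity(0.92, 0.40)                    # score drop after masking
```

Under these assumed definitions, a high fidelity score only shows the explainer found features the model relies on; it says nothing about whether those features coincide with the actual vulnerable lines, which is why the paper argues for reporting coverage as well.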

Authors (9)
  1. Baijun Cheng
  2. Shengming Zhao
  3. Kailong Wang
  4. Meizhen Wang
  5. Guangdong Bai
  6. Ruitao Feng
  7. Yao Guo
  8. Lei Ma
  9. Haoyu Wang
Citations (2)

