
TREC: APT Tactic / Technique Recognition via Few-Shot Provenance Subgraph Learning (2402.15147v2)

Published 23 Feb 2024 in cs.CR and cs.LG

Abstract: Advanced Persistent Threats (APTs), characterized by persistence, stealth, and diversity, are among the greatest threats to cyber-infrastructure. As a countermeasure, existing studies leverage provenance graphs to capture the complex relations between system entities in a host for effective APT detection. Beyond detecting single attack events, as most existing work does, understanding the tactics / techniques (e.g., Kill-Chain, ATT&CK) used to organize and accomplish an APT campaign is more important for security operations. Existing studies manually design sets of rules to map low-level system events to high-level APT tactics / techniques. However, rule-based methods are coarse-grained and generalize poorly, so they can recognize only APT tactics and cannot identify fine-grained APT techniques or mutant APT attacks. In this paper, we propose TREC, the first attempt to recognize APT tactics / techniques from provenance graphs by exploiting deep learning techniques. To address the "needle in a haystack" problem, TREC segments small, compact subgraphs covering individual APT technique instances from a large provenance graph, based on a malicious node detection model and a subgraph sampling algorithm. To address the "training sample scarcity" problem, TREC trains the APT tactic / technique recognition model in a few-shot learning manner by adopting a Siamese neural network. We evaluate TREC on a customized dataset collected and made public by our team. The experimental results show that TREC significantly outperforms state-of-the-art systems in APT tactic recognition and can also effectively identify APT techniques.
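
The few-shot idea mentioned in the abstract can be illustrated with a minimal sketch. This is not TREC's model: it assumes each sampled subgraph has already been reduced to a small numeric feature vector, and it uses plain L2 normalization as a stand-in for a learned encoder. The names `embed`, `similarity`, `classify`, `SUPPORT`, and the labels `technique_A` / `technique_B` are all hypothetical. What it keeps from the Siamese setup is only the core pattern: one shared encoder is applied to both inputs, and a query is labeled by its similarity to a handful of labeled support examples.

```python
import math


def embed(features):
    """Shared encoder applied to both inputs of a pair.

    Assumption: L2 normalization stands in for a learned graph encoder.
    """
    norm = math.sqrt(sum(x * x for x in features)) or 1.0
    return [x / norm for x in features]


def similarity(a, b):
    """Cosine similarity between two inputs, both passed through the same encoder."""
    ea, eb = embed(a), embed(b)
    return sum(x * y for x, y in zip(ea, eb))


def classify(query, support):
    """Return the label of the most similar support example and its score.

    support: dict mapping a label to a short list of example feature vectors.
    """
    best_label, best_score = None, -math.inf
    for label, examples in support.items():
        score = max(similarity(query, ex) for ex in examples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical support set: one or two labeled examples per class.
SUPPORT = {
    "technique_A": [[5.0, 1.0, 0.0], [4.0, 2.0, 0.0]],
    "technique_B": [[0.0, 1.0, 6.0]],
}

label, score = classify([3.0, 1.0, 0.5], SUPPORT)
print(label, round(score, 3))
```

Because classification reduces to comparing pairs, a new class can be added by supplying a few support examples, without retraining a fixed-size output layer. That property is what makes pairwise comparison a natural fit when labeled samples are scarce.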

