Knowledge-Informed Auto-Penetration Testing Based on Reinforcement Learning with Reward Machine (2405.15908v1)
Abstract: Automated penetration testing (AutoPT) based on reinforcement learning (RL) has been shown to improve the efficiency of vulnerability identification in information systems. However, RL-based PT still faces several challenges, including poor sampling efficiency, intricate reward specification, and limited interpretability. To address these issues, we propose a knowledge-informed AutoPT framework called DRLRM-PT, which leverages reward machines (RMs) to encode domain knowledge as guidelines for training a PT policy. In this study, we focus on lateral movement as a PT case study and formulate it as a partially observable Markov decision process (POMDP) guided by RMs. We design two RMs based on the MITRE ATT&CK knowledge base for lateral movement. To solve the POMDP and optimize the PT policy, we employ the deep Q-learning algorithm with RM (DQRM). The experimental results demonstrate that the DQRM agent achieves higher training efficiency in PT than agents without knowledge embedding. Moreover, RMs encoding more detailed domain knowledge achieve better PT performance than RMs encoding simpler knowledge.
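To make the reward-machine idea concrete, below is a minimal sketch of an RM as a finite-state machine whose transitions fire on high-level events and emit shaped rewards. The event labels, state names, and reward values are hypothetical placeholders loosely inspired by lateral-movement tactics in MITRE ATT&CK; they are not the exact RM designs used in the paper.

```python
# Minimal reward machine (RM) sketch for a lateral-movement scenario.
# States, events, and reward values below are illustrative assumptions,
# not the paper's actual RM specification.

class RewardMachine:
    def __init__(self, transitions, initial_state, terminal_states):
        # transitions: {(rm_state, event): (next_rm_state, reward)}
        self.transitions = transitions
        self.state = initial_state
        self.terminal_states = terminal_states

    def step(self, event):
        # Advance the RM on an observed event; unrecognized events
        # leave the RM state unchanged and yield zero reward.
        next_state, reward = self.transitions.get(
            (self.state, event), (self.state, 0.0)
        )
        self.state = next_state
        return reward

    def is_done(self):
        return self.state in self.terminal_states


# Hypothetical RM: discover a remote host, harvest credentials,
# then complete a lateral move onto the new host.
rm = RewardMachine(
    transitions={
        ("u0", "host_discovered"):      ("u1", 0.1),
        ("u1", "credential_harvested"): ("u2", 0.3),
        ("u2", "lateral_move_success"): ("u3", 1.0),
    },
    initial_state="u0",
    terminal_states={"u3"},
)

for event in ["host_discovered", "credential_harvested", "lateral_move_success"]:
    print(rm.state, "--", event, "-> reward", rm.step(event))
print("done:", rm.is_done())
```

In a DQRM-style setup, the agent conditions its Q-function on both the environment observation and the current RM state, so the RM both shapes rewards and decomposes the task into interpretable stages.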