Adversarial Attacks on Reinforcement Learning Agents for Command and Control (2405.01693v2)

Published 2 May 2024 in cs.CR

Abstract: Given the recent impact of Deep Reinforcement Learning in training agents to win complex games like StarCraft and DotA (Defense of the Ancients), there has been a surge in research exploiting learning-based techniques for professional wargaming, battlefield simulation, and modeling. Real-time strategy games and simulators have become a valuable resource for operational planning and military research. However, recent work has shown that such learning-based approaches are highly susceptible to adversarial perturbations. In this paper, we investigate the robustness of an agent trained for a Command and Control (C2) task in an environment controlled by an active adversary. The C2 agent is trained on custom StarCraft II maps using the state-of-the-art RL algorithms A3C and PPO. We empirically show that an agent trained using these algorithms is highly susceptible to noise injected by the adversary, and we investigate the effects these perturbations have on the performance of the trained agent. Our work highlights the urgent need to develop more robust training algorithms, especially for critical arenas like the battlefield.
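The abstract does not spell out the attack mechanics beyond "noise injected by the adversary." One standard way such observation attacks are realized in the adversarial-RL literature is a gradient-based perturbation such as FGSM applied to the agent's inputs. The sketch below is illustrative only: the policy interface, the function name, the epsilon budget, and the choice of FGSM itself are assumptions for exposition, not the paper's confirmed method.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb_observation(policy, obs, epsilon=0.01):
    """FGSM-style observation attack on a trained RL policy (illustrative).

    policy:  callable mapping an observation tensor to action logits (assumed interface).
    obs:     observation tensor of shape (batch, ...) matching the policy input.
    epsilon: L-infinity budget of the perturbation.
    """
    obs = obs.clone().detach().requires_grad_(True)
    logits = policy(obs)
    # Use the agent's own preferred actions as labels, then *increase* the loss,
    # pushing the policy away from the action it would otherwise take.
    preferred = logits.argmax(dim=-1).detach()
    loss = F.cross_entropy(logits, preferred)
    loss.backward()
    # One signed-gradient step bounded by epsilon, as in FGSM.
    adv_obs = obs + epsilon * obs.grad.sign()
    return adv_obs.detach()
```

In a hypothetical evaluation loop, the adversary would intercept each environment observation and hand the agent `fgsm_perturb_observation(policy, obs)` in place of the clean observation, then measure the drop in episode return relative to unperturbed play.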
