Robustness Testing for Multi-Agent Reinforcement Learning: State Perturbations on Critical Agents (2306.06136v1)
Abstract: Multi-Agent Reinforcement Learning (MARL) has been widely applied in fields such as smart traffic and unmanned aerial vehicles. However, most MARL algorithms are vulnerable to adversarial perturbations on agent states. Robustness testing of a trained model is an essential step for confirming its trustworthiness against unexpected perturbations. This work proposes a novel Robustness Testing framework for MARL that attacks the states of Critical Agents (RTCA). RTCA has two innovations: 1) a Differential Evolution (DE) based method that selects critical agents as victims and determines the worst-case joint actions for them; and 2) a team-cooperation policy evaluation method used as the objective function of the DE optimization. Adversarial state perturbations for the critical agents are then generated based on the worst-case joint actions. This is the first robustness testing framework with varying victim agents. RTCA demonstrates outstanding performance in terms of the number of victim agents required and its effectiveness in destroying cooperation policies.
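As a reading aid, the sketch below illustrates the two-stage structure the abstract describes: differential evolution searches over which agents to victimize and which worst-case joint actions to force, scored by a team-cooperation value, and the chosen actions then guide targeted state perturbations. This is only a minimal toy mock-up under stated assumptions, not the paper's implementation; the policy, team value, perturbation routine, and all names (`policy`, `q_tot`, `craft_perturbation`, problem sizes) are illustrative stand-ins.

```python
"""Minimal sketch of the two-stage attack structure described in the abstract
(not the authors' exact RTCA algorithm). All components here are toy
stand-ins chosen for illustration."""
import numpy as np
from scipy.optimize import differential_evolution

N_AGENTS, N_ACTIONS, STATE_DIM = 5, 6, 8   # toy problem sizes (assumed)
N_VICTIMS = 2                              # attack budget: agents that may be perturbed
rng = np.random.default_rng(0)
W_PI = rng.standard_normal((STATE_DIM, N_ACTIONS))  # toy per-agent policy weights
W_Q = rng.standard_normal((N_AGENTS, N_ACTIONS))    # toy team-value weights

def policy(states):
    """Stand-in for the trained agents: greedy action from a linear policy."""
    return (states @ W_PI).argmax(axis=1)

def q_tot(joint_actions):
    """Stand-in for a team-cooperation value (e.g. a QMIX-style mixing
    network); state-dependence is omitted for brevity."""
    return float(sum(W_Q[i, a] for i, a in enumerate(joint_actions)))

def decode(x):
    """Map a continuous DE candidate to (victim indices, forced discrete actions)."""
    victims = np.unique(np.floor(x[:N_VICTIMS]).astype(int) % N_AGENTS)
    forced = np.floor(x[N_VICTIMS:]).astype(int) % N_ACTIONS
    return victims, forced

def de_objective(x, states):
    """Team value after forcing the candidate's actions on its victims; DE
    minimizes this, i.e. it searches for the most damaging combination."""
    victims, forced = decode(x)
    actions = policy(states)
    actions[victims] = forced[: len(victims)]
    return q_tot(actions)

def craft_perturbation(state, target_action, eps=0.05, trials=50):
    """Toy targeted perturbation: random search in an L-inf ball for a state
    that makes the victim's policy output the worst-case (target) action."""
    for _ in range(trials):
        cand = state + rng.uniform(-eps, eps, size=state.shape)
        if policy(cand[None, :])[0] == target_action:
            return cand
    return state  # fall back to the clean state if no perturbation succeeds

states = rng.standard_normal((N_AGENTS, STATE_DIM))
bounds = [(0, N_AGENTS)] * N_VICTIMS + [(0, N_ACTIONS)] * N_VICTIMS
result = differential_evolution(de_objective, bounds, args=(states,),
                                maxiter=30, seed=0, polish=False)
victims, worst_actions = decode(result.x)
adv_states = states.copy()
for k, v in enumerate(victims):
    adv_states[v] = craft_perturbation(states[v], worst_actions[k])
print("victims:", victims, "worst-case actions:", worst_actions[: len(victims)])
```

In this sketch the DE candidate jointly encodes victim selection and their forced actions, mirroring the abstract's idea of optimizing both with a single team-value objective; a real instantiation would replace the linear policy, the toy value, and the random-search perturbation with the trained MARL policies, a learned mixing network, and a gradient-based targeted attack.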
Authors: Ziyuan Zhou, Guanjun Liu