Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL (2305.17342v3)

Published 27 May 2023 in cs.LG and cs.AI

Abstract: Most existing works focus on direct perturbations to the victim's state/action or the underlying transition dynamics to demonstrate the vulnerability of reinforcement learning agents to adversarial attacks. However, such direct manipulations may not always be realizable. In this paper, we consider a multi-agent setting where a well-trained victim agent $\nu$ is exploited by an attacker controlling another agent $\alpha$ with an \textit{adversarial policy}. Previous models do not account for the possibility that the attacker may only have partial control over $\alpha$ or that the attack may produce easily detectable "abnormal" behaviors. Furthermore, there is a lack of provably efficient defenses against these adversarial policies. To address these limitations, we introduce a generalized attack framework that can flexibly model the extent to which the adversary controls the agent, and that allows the attacker to regulate the state distribution shift and produce stealthier adversarial policies. Moreover, we offer a provably efficient defense with polynomial convergence to the most robust victim policy through adversarial training with timescale separation. This stands in sharp contrast to supervised learning, where adversarial training typically provides only \textit{empirical} defenses. In experiments on the Robosumo competition, we show that our generalized attack formulation results in much stealthier adversarial policies while maintaining the same winning rate as baselines. Additionally, our adversarial training approach yields stable learning dynamics and less exploitable victim policies.
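The defense sketched in the abstract trains the victim against an adversarial policy using two different update rates. As a rough illustration of the timescale-separation idea only (not the paper's actual algorithm), the sketch below runs alternating gradient descent-ascent on a toy zero-sum objective in which the attacker (fast player) takes larger steps than the victim (slow player); the objective, step sizes, and variable names are all illustrative assumptions, not quantities taken from the paper.

```python
import numpy as np

# Illustrative sketch only -- not the paper's algorithm.  It shows the
# timescale-separation idea on a toy zero-sum objective
#   f(v, a) = 0.5*||v||^2 + v^T M a - 0.5*||a||^2,
# where the "victim" v minimizes and the "attacker" a maximizes.  The unique
# saddle point is (v, a) = (0, 0).  The attacker uses a larger step size
# (fast timescale), so it roughly best-responds between slow victim updates.

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3))      # toy coupling matrix (assumption)

v = rng.normal(size=3)           # victim parameters (slow player)
a = rng.normal(size=3)           # attacker parameters (fast player)

eta_v = 0.01                     # slow step size for the victim
eta_a = 0.1                      # fast step size for the attacker

for _ in range(5000):
    # Attacker ascends on f:  grad_a f = M^T v - a
    a = a + eta_a * (M.T @ v - a)
    # Victim descends on f:   grad_v f = v + M a
    v = v - eta_v * (v + M @ a)

print("||v|| =", np.linalg.norm(v), " ||a|| =", np.linalg.norm(a))
# Both norms approach 0, the saddle point of this toy objective.
```

The paper's defense operates on RL policies rather than parameter vectors, but the principle illustrated here is the same: the fast player tracks its (approximate) best response closely enough that the slow victim is effectively trained against its worst case.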

Authors (4)
  1. Xiangyu Liu (47 papers)
  2. Souradip Chakraborty (36 papers)
  3. Yanchao Sun (32 papers)
  4. Furong Huang (150 papers)
Citations (2)
