Regret-Based Defense in Adversarial Reinforcement Learning (2302.06912v4)
Abstract: Deep Reinforcement Learning (DRL) policies have been shown to be vulnerable to small adversarial perturbations of their observations. Such adversarial noise can have disastrous consequences in safety-critical environments. For instance, a self-driving car receiving adversarially perturbed sensory observations about nearby signs (e.g., a stop sign physically altered to be perceived as a speed-limit sign) or objects (e.g., cars altered to be recognized as trees) can be fatal. Existing approaches for making RL algorithms robust to an observation-perturbing adversary are reactive: they iteratively improve against adversarial examples generated at each iteration. While such approaches have been shown to improve over regular RL methods, they can fare significantly worse if certain categories of adversarial examples are never generated during training. We instead pursue a more proactive approach that directly optimizes a well-studied robustness measure, regret, rather than expected value. We provide a principled approach that minimizes the maximum regret over a neighborhood of the received observation. Our regret criterion can be used to modify existing value- and policy-based Deep RL methods. We demonstrate that our approaches provide a significant improvement in performance across a wide variety of benchmarks against leading approaches for robust Deep RL.
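The abstract does not state the formal objective, but a plausible reading of "minimizing maximum regret over a neighborhood of the received observation" is the following minimax-regret criterion. The notation here is ours, not necessarily the paper's: $\tilde{s}$ is the received (possibly perturbed) observation, $B_{\epsilon}(\tilde{s})$ is an $\ell_\infty$ ball of radius $\epsilon$ of candidate true observations, and $Q^{*}$ is the optimal action-value function.

```latex
% Sketch of a minimax-regret objective (our notation, not the paper's):
% choose the policy whose worst-case value loss over the neighborhood
% of the received observation is smallest.
\[
  \pi^{*} \;\in\; \arg\min_{\pi}\;
  \max_{s \,\in\, B_{\epsilon}(\tilde{s})}
  \Big[\, \max_{a} Q^{*}(s, a) \;-\; Q^{*}\!\big(s,\, \pi(\tilde{s})\big) \Big]
\]
```

Since the abstract notes the criterion can modify existing value-based methods, below is a minimal sketch of how such a regret term might be estimated as an auxiliary loss in a DQN-style learner. Everything here is an assumption for illustration: the names `q_net` and `sampled_regret_loss` are hypothetical, the inner maximization is approximated by random sampling, and the paper's actual estimator may differ (e.g., it may use gradient-based inner maximization instead).

```python
import torch

def sampled_regret_loss(q_net, obs, epsilon, n_samples=10):
    """Hypothetical sketch: approximate the worst-case regret over an
    l-infinity ball of radius `epsilon` around the received observation
    by sampling candidate true observations. Not the paper's estimator."""
    # Greedy action the agent would take given the received observation.
    with torch.no_grad():
        chosen = q_net(obs).argmax(dim=-1, keepdim=True)  # shape (B, 1)

    worst_regret = torch.zeros(obs.shape[0], device=obs.device)
    for _ in range(n_samples):
        # Candidate "true" observation drawn uniformly from the ball.
        candidate = obs + (torch.rand_like(obs) * 2 - 1) * epsilon
        q = q_net(candidate)  # shape (B, num_actions)
        # Regret: value forgone by playing the action chosen for `obs`
        # when the underlying observation was actually `candidate`.
        regret = q.max(dim=-1).values - q.gather(-1, chosen).squeeze(-1)
        worst_regret = torch.maximum(worst_regret, regret)
    return worst_regret.mean()
```

In a value-based trainer, this term would presumably be added to the standard TD loss with a weighting coefficient, so the network is penalized both for value errors and for actions that incur high regret under nearby observations.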
Authors: Roman Belaire, Pradeep Varakantham, Thanh Nguyen, David Lo