Adaptive Discounting of Training Time Attacks (2401.02652v1)

Published 5 Jan 2024 in cs.LG, cs.AI, and cs.CR

Abstract: Among the most insidious attacks on Reinforcement Learning (RL) solutions are training-time attacks (TTAs) that create loopholes and backdoors in the learned behaviour. Not limited to a simple disruption, constructive TTAs (C-TTAs) are now available, where the attacker forces a specific, target behaviour upon a training RL agent (victim). However, even state-of-the-art C-TTAs focus on target behaviours that could be naturally adopted by the victim if not for a particular feature of the environment dynamics, which C-TTAs exploit. In this work, we show that a C-TTA is possible even when the target behaviour is un-adoptable due to both environment dynamics as well as non-optimality with respect to the victim objective(s). To find efficient attacks in this context, we develop a specialised flavour of the DDPG algorithm, which we term gammaDDPG, that learns this stronger version of C-TTA. gammaDDPG dynamically alters the attack policy planning horizon based on the victim's current behaviour. This improves effort distribution throughout the attack timeline and reduces the effect of uncertainty the attacker has about the victim. To demonstrate the features of our method and better relate the results to prior research, we borrow a 3D grid domain from a state-of-the-art C-TTA for our experiments. Code is available at "bit.ly/github-rb-gDDPG".
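
The abstract's central mechanism is a DDPG variant whose planning horizon (discount factor) adapts on the fly to the victim's current behaviour. The sketch below shows one way such a per-transition discount could enter the critic's TD target; it is an illustrative assumption, not the paper's implementation. The exponential `adaptive_gamma` mapping, its bounds, and the `behaviour_distance` signal (for instance, a divergence between the victim's observed policy and the attacker's target behaviour) are hypothetical stand-ins; the authors' actual gammaDDPG scheme is in the linked repository.

```python
import torch

def adaptive_gamma(behaviour_distance: torch.Tensor,
                   gamma_min: float = 0.5,
                   gamma_max: float = 0.99) -> torch.Tensor:
    # Map a non-negative victim-behaviour distance to a discount factor in
    # [gamma_min, gamma_max]. The exponential form, the bounds, and the
    # direction of adaptation are illustrative assumptions, not taken from
    # the paper.
    closeness = torch.exp(-behaviour_distance)  # 1.0 when aligned, -> 0 when far
    return gamma_min + (gamma_max - gamma_min) * closeness

def critic_td_target(reward: torch.Tensor,
                     next_q: torch.Tensor,
                     behaviour_distance: torch.Tensor,
                     done: torch.Tensor) -> torch.Tensor:
    # Standard DDPG TD target, except that gamma varies per transition with
    # the victim's current behaviour instead of being a fixed hyperparameter.
    gamma = adaptive_gamma(behaviour_distance)
    return reward + (1.0 - done) * gamma * next_q

# Toy usage: two transitions, victim close to vs. far from the target behaviour.
reward = torch.tensor([1.0, 1.0])
next_q = torch.tensor([5.0, 5.0])
dist = torch.tensor([0.1, 3.0])
done = torch.tensor([0.0, 0.0])
print(critic_td_target(reward, next_q, dist, done))  # larger target for the close victim
```

Under this (assumed) mapping, a victim far from the target behaviour yields a small gamma, concentrating the attacker's effort on short-horizon corrections, while a converging victim yields a larger gamma and longer-horizon planning; whether the paper adapts gamma in this direction is not stated in the abstract.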
