Local Environment Poisoning Attacks on Federated Reinforcement Learning (2303.02725v4)
Abstract: Federated learning (FL) has become a popular tool for solving traditional Reinforcement Learning (RL) tasks. The multi-agent structure mitigates the data hunger of traditional RL, while the federated mechanism protects the data privacy of individual agents. However, the federated mechanism also exposes the system to poisoning by malicious agents that can mislead the trained policy. Despite the advantages brought by FL, the vulnerability of Federated Reinforcement Learning (FRL) has not been well studied before. In this work, we propose a general framework that characterizes FRL poisoning as an optimization problem and design a poisoning protocol applicable to policy-based FRL. Our framework also extends to FRL with actor-critic as the local RL algorithm by training a pair of private and public critics. We prove that our method can strictly hurt the global objective. We verify the effectiveness of our poisoning by conducting extensive experiments targeting mainstream RL algorithms across various OpenAI Gym environments covering a wide range of difficulty levels. In these experiments, we compare clean and baseline poisoning methods against our proposed framework. The results show that the proposed framework successfully poisons FRL systems, reducing performance across various environments, and does so more effectively than baseline methods. Our work provides new insights into the vulnerability of FL in RL training and poses new challenges for designing robust FRL algorithms.
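To make the threat model concrete, the toy sketch below shows how a single malicious agent in a FedAvg-style training loop can degrade the global objective. This is an illustrative simplification, not the paper's actual protocol: the objective, learning rate, agent count, and the attacker's "boost" factor are all assumed values, and the attack shown (a scaled descent step reported in place of an ascent step) is a generic gradient-poisoning strategy rather than the paper's optimization-based one.

```python
# Hypothetical setup: 5 agents jointly maximize a toy concave objective
# J(theta) = -(theta - TARGET)^2 via averaged gradient-ascent updates.
TARGET = 1.5   # assumed optimum of the toy objective
LR = 0.1       # assumed local learning rate

def grad(theta):
    return -2.0 * (theta - TARGET)           # gradient of J at theta

def honest_update(theta):
    return theta + LR * grad(theta)          # honest agent: gradient ascent

def poisoned_update(theta, boost=5.0):
    # Malicious agent: reports a scaled *descent* step, steering the
    # averaged parameters away from the optimum. `boost` is illustrative.
    return theta - boost * LR * grad(theta)

def federated_round(theta, agents):
    updates = [agent(theta) for agent in agents]
    return sum(updates) / len(updates)       # server averages the updates

def J(theta):
    return -(theta - TARGET) ** 2

clean_agents = [honest_update] * 5
attacked_agents = [honest_update] * 4 + [poisoned_update]

theta_clean = theta_pois = 0.0
for _ in range(50):
    theta_clean = federated_round(theta_clean, clean_agents)
    theta_pois = federated_round(theta_pois, attacked_agents)

print(f"clean J: {J(theta_clean):.6f}  poisoned J: {J(theta_pois):.6f}")
```

In this sketch the clean run converges to the optimum, while the single attacker makes each averaged round move slightly away from it, so the global objective after poisoned training ends up strictly below even its initial value. This mirrors, in miniature, the paper's claim that a local poisoner can provably hurt the global objective.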
Authors: Evelyn Ma, Praneet Rathi, S. Rasoul Etesami