
Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence (2401.03489v1)

Published 7 Jan 2024 in cs.LG, cs.AI, cs.DC, and cs.MA

Abstract: In Federated Reinforcement Learning (FRL), agents collaboratively learn a common task while each acts in its own local environment, without exchanging raw trajectories. Existing approaches for FRL either (a) provide no fault-tolerance guarantees against misbehaving agents, or (b) rely on a trusted central agent (a single point of failure) to aggregate updates. We provide the first decentralized Byzantine fault-tolerant FRL method. Towards this end, we first propose a new centralized Byzantine fault-tolerant policy gradient (PG) algorithm that improves over existing methods by relying only on assumptions standard for non-fault-tolerant PG. Then, as our main contribution, we show how a combination of robust aggregation and Byzantine-resilient agreement methods can be leveraged to eliminate the need for a trusted central entity. Since our results represent the first sample-complexity analysis for Byzantine fault-tolerant decentralized federated non-convex optimization, our technical contributions may be of independent interest. Finally, we corroborate our theoretical results experimentally on common RL environments, demonstrating the speed-up of decentralized federations with respect to the number of participating agents and their resilience against various Byzantine attacks.
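The robust-aggregation idea the abstract refers to can be illustrated with a geometric median computed via Weiszfeld iterations, one standard Byzantine-resilient aggregator. The sketch below is a toy illustration, not the paper's actual algorithm: the agent counts, noise levels, and the `geometric_median` helper are all assumptions made for the example. It shows how a robust aggregate of per-agent gradient estimates stays near the honest agents' consensus even when a minority of agents report arbitrary vectors, whereas a plain average does not.

```python
import numpy as np

def geometric_median(points, iters=100, eps=1e-8):
    """Approximate the geometric median of row vectors via Weiszfeld iterations."""
    z = points.mean(axis=0)  # initialize at the plain average
    for _ in range(iters):
        d = np.maximum(np.linalg.norm(points - z, axis=1), eps)  # guard /0
        w = 1.0 / d
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

# Toy setting: 7 honest agents report noisy copies of the true policy gradient,
# 3 Byzantine agents report arbitrary large vectors.
rng = np.random.default_rng(0)
true_grad = np.array([1.0, -2.0, 0.5])
honest = true_grad + 0.01 * rng.standard_normal((7, 3))
byzantine = 100.0 * rng.standard_normal((3, 3))
updates = np.vstack([honest, byzantine])

naive = updates.mean(axis=0)        # pulled far from true_grad by the attackers
robust = geometric_median(updates)  # stays close to the honest cluster
```

Because the geometric median has breakdown point 1/2, it tolerates any minority of Byzantine agents; the paper combines aggregators of this kind with Byzantine-resilient agreement so that no single trusted aggregator is needed.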

Authors (4)
  1. Philip Jordan
  2. Florian Grötschla
  3. Flint Xiaofeng Fan
  4. Roger Wattenhofer