An Algorithm For Adversary Aware Decentralized Networked MARL (2305.05573v2)
Abstract: Decentralized multi-agent reinforcement learning (MARL) algorithms have become popular in the literature because they allow heterogeneous agents to have their own reward functions, in contrast to canonical multi-agent Markov Decision Process (MDP) settings that assume a common reward function shared by all agents. In this work, we follow the existing line of work on collaborative MARL in which agents in a connected, time-varying network can exchange information with one another in order to reach a consensus. We introduce vulnerabilities into the consensus updates of existing MARL algorithms, whereby some agents, whom we term adversarial agents, can deviate from their usual consensus update. We then provide an algorithm that allows non-adversarial agents to reach a consensus in the presence of adversaries under a constrained setting.
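The abstract does not spell out the paper's consensus rule, so the following is only a minimal sketch of the kind of resilient consensus primitive such a setting typically builds on: a trimmed-mean style update (in the spirit of W-MSR-type rules), where each agent discards the most extreme neighbor reports before averaging so that up to f adversarial neighbors cannot pull its estimate arbitrarily far. The function name, the parameter f, and the example values are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def trimmed_mean_consensus(own_value, neighbor_values, f):
    """One illustrative resilient consensus step for a single agent.

    Hypothetical trimmed-mean update (W-MSR-flavoured sketch, not the
    paper's method): drop the f largest and f smallest neighbor reports,
    then average the survivors together with the agent's own estimate.
    """
    values = np.asarray(neighbor_values, dtype=float)
    if len(values) <= 2 * f:
        # Too few neighbors to tolerate f adversaries; keep own estimate.
        return float(own_value)
    # Sort the reported values and trim f from each extreme.
    kept = np.sort(values)[f:len(values) - f]
    # Average the agent's own value with the surviving neighbor values.
    return float(np.mean(np.concatenate(([own_value], kept))))

# Example: one adversary reporting an extreme value is trimmed away,
# so the honest agents' estimates stay near 1.0.
honest = [0.9, 1.1, 1.0, 0.95]
adversarial = [100.0]
print(trimmed_mean_consensus(1.0, honest + adversarial, f=1))  # ~1.01
```

The design intuition is that a plain averaging consensus is linear in every neighbor's report, so a single deviating agent can bias all downstream value or policy estimates; trimming bounds that influence at the cost of slower information mixing.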
Author: Soumajyoti Sarkar