ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling (2404.03984v1)
Abstract: Effective multi-agent collaboration is imperative for solving complex, distributed problems. In this context, two key challenges must be addressed: first, autonomously identifying optimal objectives for collective outcomes; second, aligning these objectives among agents. Traditional frameworks, often reliant on centralized learning, struggle with scalability and efficiency in large multi-agent systems. To overcome these issues, we introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states. Furthermore, we propose a novel mechanism for multi-agent interaction, wherein less proficient agents follow and adopt policies from more experienced ones, thereby indirectly guiding their learning process. Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy. Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies by effectively identifying and aligning optimal objectives.
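To make the two ingredients of the abstract concrete, here is a minimal toy sketch in Python. It is not the paper's implementation: the chain environment, the TD(0)-style state-value update, and the `follow` blending rule are all illustrative assumptions, meant only to show the general shape of state-based value learning plus a less-proficient agent adopting a more experienced agent's state preferences.

```python
import random

random.seed(0)  # deterministic toy run

class Agent:
    """Hypothetical agent that learns a state-value table V(s)."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9):
        self.V = [0.0] * n_states  # state-based values V(s)
        self.alpha = alpha         # learning rate
        self.gamma = gamma         # discount factor

    def update(self, s, reward, s_next):
        # TD(0)-style update toward the bootstrapped target r + gamma * V(s')
        target = reward + self.gamma * self.V[s_next]
        self.V[s] += self.alpha * (target - self.V[s])

    def follow(self, mentor):
        # Illustrative "follow" step: a less proficient agent blends its
        # values with a more experienced agent's, indirectly adopting the
        # mentor's preferences over states.
        self.V = [0.5 * v + 0.5 * m for v, m in zip(self.V, mentor.V)]

def step(s, move):
    """Toy chain environment: states 0..4, reward 1 only at state 4."""
    s_next = max(0, min(4, s + move))
    return s_next, (1.0 if s_next == 4 else 0.0)

mentor, novice = Agent(5), Agent(5)
for _ in range(200):            # the experienced agent trains alone
    s = random.randrange(4)
    s_next, r = step(s, +1)
    mentor.update(s, r, s_next)

novice.follow(mentor)           # the novice adopts the mentor's values
# States closer to the goal now carry higher value for the novice as well.
```

In this sketch the mentor's learned values rank states by proximity to the goal, and a single `follow` call transfers that ranking to the novice; the actual paper's mechanism and convergence guarantees are, of course, more involved.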
- Chi-Hui Lin
- Joewie J. Koh
- Alessandro Roncone
- Lijun Chen