Efficient Learning in Chinese Checkers: Comparing Parameter Sharing in Multi-Agent Reinforcement Learning (2405.18733v1)
Abstract: We show that multi-agent reinforcement learning (MARL) with full parameter sharing outperforms independent and partially shared architectures in the competitive, perfect-information, homogeneous game of Chinese Checkers. To run our experiments, we develop a new MARL environment: variable-size, six-player Chinese Checkers. This custom environment was developed in PettingZoo and supports all traditional rules of the game, including chained jumps. To the best of our knowledge, this is the first implementation of Chinese Checkers that remains faithful to the true game. Chinese Checkers is difficult to learn due to its large branching factor and potentially infinite horizons. We borrow the concept of branching actions (submoves) from complex action spaces in other RL domains, where a submove may not end a player's turn immediately; this drastically reduces the dimensionality of the action space. Our observation space is inspired by AlphaGo: many binary game boards are stacked in a 3D array to encode the game state. The PettingZoo environment, training and evaluation logic, and analysis scripts can be found on GitHub: https://github.com/noahadhikari/pettingzoo-chinese-checkers
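To make the setup concrete, here is a minimal sketch of how full parameter sharing looks at play time: a single policy object selects submoves for every agent inside PettingZoo's standard agent-iteration loop. The `agent_iter`/`last`/`step` calls are PettingZoo's documented AEC interface; the `SharedPolicy` stub and the dict observation with an `action_mask` key are illustrative assumptions, not necessarily the paper's exact implementation.

```python
import numpy as np

class SharedPolicy:
    """Stand-in for one trained network. Under full parameter sharing,
    this single set of weights chooses submoves for all six players."""

    def __call__(self, observation, action_mask):
        # Placeholder decision rule: pick uniformly among legal submoves.
        legal = np.flatnonzero(action_mask)
        return int(np.random.choice(legal))

def play_episode(env, policy):
    """Run one episode of an AEC-style PettingZoo environment, with the
    same policy acting for every agent (full parameter sharing)."""
    env.reset(seed=0)
    for agent in env.agent_iter():  # PettingZoo's turn-taking loop
        obs, reward, terminated, truncated, info = env.last()
        if terminated or truncated:
            env.step(None)  # finished agents must step with a None action
            continue
        # Assumed observation layout: stacked binary board planes plus a
        # legal-submove mask (this dict structure is an assumption here).
        action = policy(obs["observation"], obs["action_mask"])
        env.step(action)
    env.close()
```

The independent and partially shared baselines compared in the paper would differ only in the lookup: instead of one `SharedPolicy` serving all agents, each agent (or each group of agents) would hold its own copy of the network weights.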
- S. Gronauer and K. Diepold, “Multi-agent deep reinforcement learning: a survey,” Artif Intell Rev, vol. 55, pp. 895–943, 2022.
- A. Oroojlooy and D. Hajinezhad, “A review of cooperative multi-agent deep reinforcement learning,” Appl Intell, vol. 53, pp. 13677–13722, 2023. https://doi.org/10.1007/s10489-022-04105-y.
- J. K. Terry, B. Black, A. Hari, L. S. Santos, C. Dieffendahl, N. L. Williams, Y. Lokesh, C. Horsch, and P. Ravi, “PettingZoo: Gym for multi-agent reinforcement learning,” CoRR, vol. abs/2009.14471, 2020.
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, and W. Zaremba, “OpenAI Gym,” CoRR, vol. abs/1606.01540, 2016.
- Z. Liu, M. Zhou, W. Cao, Q. Qu, H. W. F. Yeung, and V. Y. Y. Chung, “Towards understanding Chinese Checkers with heuristics, Monte Carlo tree search, and deep reinforcement learning,” CoRR, vol. abs/1903.01747, 2019.
- S. He, W. Hu, and H. Yin, “Playing Chinese Checkers with reinforcement learning,” 2016.
- Masters Traditional Games, “The rules of Chinese Checkers,” 2023. https://www.mastersofgames.com/rules/chinese-checkers-rules.htm.
- Red Blob Games, “Hexagonal grids,” 2021. https://www.redblobgames.com/grids/hexagons/.
- J. Schulman, F. Wolski, P. Dhariwal, A. Radford, and O. Klimov, “Proximal policy optimization algorithms,” CoRR, vol. abs/1707.06347, 2017.
- FIDE, “Laws of Chess,” p. 13, 2008.
- A. Tavakoli, F. Pardo, and P. Kormushev, “Action branching architectures for deep reinforcement learning,” CoRR, vol. abs/1711.08946, 2017.
- Y. Du, P. Abbeel, and A. Grover, “It takes four to tango: Multiagent selfplay for automatic curriculum generation,” CoRR, 2022.
- Noah Adhikari
- Allen Gu