
Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding (2312.15908v1)

Published 26 Dec 2023 in cs.AI, cs.LG, and cs.MA

Abstract: The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph. In typical MAPF scenarios, the graph and the agents' starting and ending vertices are known beforehand, allowing the use of centralized planning algorithms. However, in this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally and are restricted in their communication with each other. Specifically, we investigate the lifelong variant of MAPF, where new goals are continually assigned to the agents upon completion of previous ones. Drawing inspiration from the successful AlphaZero approach, we propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks. Our approach utilizes the agent's observations to recreate the intrinsic Markov decision process, which is then used for planning with a version of neural MCTS tailored for multi-agent tasks. The experimental results show that our approach outperforms state-of-the-art learnable MAPF solvers. The source code is available at https://github.com/AIRI-Institute/mats-lp.
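The AlphaZero-style neural MCTS mentioned in the abstract balances a learned policy prior against visit counts when descending the search tree. The sketch below shows only the generic PUCT selection step of that family of methods; the class and function names are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import math

class Node:
    """One node of an AlphaZero-style search tree (illustrative only)."""
    def __init__(self, prior):
        self.prior = prior          # policy-network prior P(s, a)
        self.visit_count = 0
        self.value_sum = 0.0
        self.children = {}          # action -> Node

    def value(self):
        # Mean backed-up value Q(s, a); 0 for unvisited nodes.
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def select_child(node, c_puct=1.5):
    """Pick the action maximizing the PUCT score Q(s,a) + U(s,a)."""
    total = sum(ch.visit_count for ch in node.children.values())
    best_action, best_score = None, -float("inf")
    for action, child in node.children.items():
        # Exploration bonus: high for high-prior, rarely visited children.
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visit_count)
        score = child.value() + u
        if score > best_score:
            best_action, best_score = action, score
    return best_action

# Toy usage: two candidate moves, one with a higher prior.
root = Node(prior=1.0)
root.children = {"up": Node(prior=0.7), "wait": Node(prior=0.3)}
print(select_child(root))  # the unvisited high-prior action wins: "up"
```

As visit counts on "up" grow, its exploration bonus shrinks and the search eventually tries "wait" as well, which is the exploration/exploitation trade-off the method relies on.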

References (33)
  1. The complexity of decentralized control of Markov decision processes. Mathematics of operations research, 27(4): 819–840.
  2. Dec-MCTS: Decentralized planning for multi-robot active perception. The International Journal of Robotics Research, 38(2-3): 316–337.
  3. A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1): 1–43.
  4. Monte-Carlo robot path planning. IEEE Robotics and Automation Letters, 7(4): 11213–11220.
  5. PRIMAL₂: Pathfinding via reinforcement and imitation multi-agent learning-lifelong. IEEE Robotics and Automation Letters, 6(2): 2666–2673.
  6. Discovering faster matrix multiplication algorithms with reinforcement learning. Nature, 610(7930): 47–53.
  7. Conflict-based search with optimal task assignment. In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018), 757–765.
  8. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1-2): 99–134.
  9. Hypertree proof search for neural theorem proving. Advances in Neural Information Processing Systems, 35: 26337–26349.
  10. Lifelong multi-agent path finding in large-scale warehouses. In Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021), 11272–11281.
  11. Graph neural networks for decentralized multi-robot path planning. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020), 11785–11792. IEEE.
  12. Multi-Agent Path Finding with Prioritized Communication Learning. 2022 International Conference on Robotics and Automation (ICRA), 10695–10701.
  13. MAPPER: Multi-Agent Path Planning with Evolutionary Reinforcement Learning in Mixed Dynamic Environments. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 11748–11754.
  14. Searching with consistent prioritization for multi-agent path finding. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, 7643–7650.
  15. Optimal Target Assignment and Path Finding for Teams of Agents. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), 1144–1152.
  16. Distributed heuristic multi-agent path finding with communication. In 2021 IEEE International Conference on Robotics and Automation (ICRA), 8699–8705. IEEE.
  17. FACMAC: Factored multi-agent centralised policy gradients. Advances in Neural Information Processing Systems, 34: 12208–12221.
  18. Glas: Global-to-local safe autonomy synthesis for multi-robot motion planning with end-to-end learning. IEEE Robotics and Automation Letters, 5(3): 4249–4256.
  19. Rosin, C. D. 2011. Multi-armed bandits with episode context. Annals of Mathematics and Artificial Intelligence, 61(3): 203–230.
  20. Primal: Pathfinding via reinforcement and imitation multi-agent learning. IEEE Robotics and Automation Letters, 4(3): 2378–2385.
  21. Mastering Atari, Go, chess and shogi by planning with a learned model. Nature, 588(7839): 604–609.
  22. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347.
  23. Conflict-based search for optimal multi-agent pathfinding. Artificial Intelligence, 219: 40–66.
  24. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587): 484–489.
  25. Mastering the game of Go without human knowledge. Nature, 550(7676): 354–359.
  26. Hybrid Policy Learning for Multi-Agent Pathfinding. IEEE Access, 9: 126034–126047.
  27. Multi-agent pathfinding: Definitions, variants, and benchmarks. In Proceedings of the International Symposium on Combinatorial Search, volume 10, 151–158.
  28. M*: A complete multirobot path planning algorithm with performance bounds. In Proceedings of The 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2011), 3260–3267.
  29. Mobile robot path planning in dynamic environments through globally guided reinforcement learning. IEEE Robotics and Automation Letters, 5(4): 6932–6939.
  30. SCRIMP: Scalable Communication for Reinforcement- and Imitation-Learning-Based Multi-Agent Pathfinding. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2598–2600.
  31. Mastering Atari games with limited data. Advances in Neural Information Processing Systems, 34.
  32. The surprising effectiveness of PPO in cooperative multi-agent games. Advances in Neural Information Processing Systems, 35: 24611–24624.
  33. Multiagent Monte Carlo tree search. In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2309–2311.
Authors (4)
  1. Alexey Skrynnik (21 papers)
  2. Anton Andreychuk (22 papers)
  3. Konstantin Yakovlev (62 papers)
  4. Aleksandr Panov (25 papers)
Citations (7)