Grasper: A Generalist Pursuer for Pursuit-Evasion Problems (2404.12626v1)

Published 19 Apr 2024 in cs.AI, cs.GT, and cs.MA

Abstract: Pursuit-evasion games (PEGs) model interactions between a team of pursuers and an evader in graph-based environments such as urban street networks. Recent advancements have demonstrated the effectiveness of the pre-training and fine-tuning paradigm in policy-space response oracles (PSRO) to improve scalability in solving large-scale PEGs. However, these methods primarily focus on specific PEGs with fixed initial conditions, whereas initial conditions may vary substantially in real-world scenarios, which significantly limits the applicability of such methods. To address this issue, we introduce Grasper, a GeneRAlist purSuer for Pursuit-Evasion pRoblems, capable of efficiently generating pursuer policies tailored to specific PEGs. Our contributions are threefold: First, we present a novel architecture that offers high-quality solutions for diverse PEGs, comprising critical components such as (i) a graph neural network (GNN) to encode PEGs into hidden vectors, and (ii) a hypernetwork to generate pursuer policies based on these hidden vectors. Second, we develop an efficient three-stage training method involving (i) a pre-pretraining stage for learning robust PEG representations through self-supervised graph learning techniques like GraphMAE, (ii) a pre-training stage utilizing heuristic-guided multi-task pre-training (HMP), where heuristic-derived reference policies (e.g., computed via Dijkstra's algorithm) regularize pursuer policies, and (iii) a fine-tuning stage that employs PSRO to generate pursuer policies for designated PEGs. Finally, we perform extensive experiments on synthetic and real-world maps, showcasing Grasper's significant superiority over baselines in terms of solution quality and generalizability. We demonstrate that Grasper provides a versatile approach for solving pursuit-evasion problems across a broad range of scenarios, enabling practical deployment in real-world situations.
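The abstract describes a concrete pipeline: a GNN encodes a PEG instance into a hidden vector, a hypernetwork maps that vector to the weights of a per-game pursuer policy, and heuristic reference policies regularize the generated policies during pre-training. The sketch below illustrates that wiring in PyTorch. It is a minimal illustration under assumed details, not the paper's implementation: GNNEncoder, HyperPolicy, the layer sizes, the mean readout, and the uniform reference policy are all placeholders chosen to keep the example self-contained and runnable.

```python
import torch
import torch.nn as nn

class GNNEncoder(nn.Module):
    """Encode a PEG instance (node features + adjacency) into one hidden vector."""
    def __init__(self, in_dim, hid_dim, n_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * n_layers
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(n_layers)
        )

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) row-normalized adjacency
        for layer in self.layers:
            x = torch.relu(layer(adj @ x))  # neighborhood aggregation + transform
        return x.mean(dim=0)                # graph-level readout -> (hid_dim,)

class HyperPolicy(nn.Module):
    """Hypernetwork: map the graph embedding to the weights of a small
    per-game policy head that scores the pursuer's actions."""
    def __init__(self, hid_dim, obs_dim, n_actions):
        super().__init__()
        self.obs_dim, self.n_actions = obs_dim, n_actions
        n_params = obs_dim * n_actions + n_actions  # weight + bias of the head
        self.hyper = nn.Sequential(
            nn.Linear(hid_dim, 128), nn.ReLU(), nn.Linear(128, n_params)
        )

    def forward(self, graph_emb, obs):
        p = self.hyper(graph_emb)
        W = p[: self.obs_dim * self.n_actions].view(self.n_actions, self.obs_dim)
        b = p[self.obs_dim * self.n_actions:]
        return torch.softmax(obs @ W.t() + b, dim=-1)  # per-game policy

# Toy usage on a random 10-node graph.
N, in_dim, hid_dim, obs_dim, n_actions = 10, 8, 64, 16, 4
adj = torch.rand(N, N)
adj = adj / adj.sum(-1, keepdim=True)
enc, pol = GNNEncoder(in_dim, hid_dim), HyperPolicy(hid_dim, obs_dim, n_actions)
pi = pol(enc(torch.randn(N, in_dim), adj), torch.randn(3, obs_dim))  # (3, 4)

# HMP-style regularization (sketch): penalize divergence from a heuristic
# reference policy; a real reference would come from shortest-path structure
# (e.g., Dijkstra), here it is just uniform as a placeholder.
ref = torch.full((3, n_actions), 1.0 / n_actions)
kl_penalty = (pi * (pi / ref).log()).sum(-1).mean()
```

The point of the hypernetwork in this design is that a single trained model can emit a different policy head for each game instance, which is what lets one generalist pursuer cover PEGs whose initial conditions vary, rather than retraining a policy per game.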
