Searching for Programmatic Policies in Semantic Spaces (2405.05431v2)
Abstract: Syntax-guided synthesis is commonly used to generate programs encoding policies. In this approach, the set of programs that can be written in a domain-specific language defines the search space, and an algorithm searches within this space for programs that encode strong policies. In this paper, we propose an alternative method for synthesizing programmatic policies, where we search within an approximation of the language's semantic space. We hypothesize that searching in semantic spaces is more sample-efficient than searching in syntax-based spaces. Our rationale is that the search is more efficient if the algorithm evaluates different agent behaviors as it searches through the space, a feature often missing in syntax-based spaces, because small changes in the syntax of a program often do not result in different agent behaviors. We define semantic spaces by learning a library of programs that exhibit different agent behaviors. Then, we approximate the semantic space by defining a neighborhood function for local search algorithms, where we replace parts of the current candidate program with programs from the library. We evaluated our hypothesis in a real-time strategy game called MicroRTS. Empirical results support our hypothesis that searching in semantic spaces can be more sample-efficient than searching in syntax-based spaces.
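The abstract describes a neighborhood function for local search in which subprograms of the current candidate are replaced with programs from a learned behavior library. The sketch below illustrates this idea under simplifying assumptions; the `Node` tree, `semantic_neighbors`, and `hill_climb` names are illustrative and not the paper's actual DSL, library-learning procedure, or search algorithm.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Node:
    """A node in a toy program tree (DSL term); purely illustrative."""
    label: str
    children: List["Node"]

    def copy(self) -> "Node":
        return Node(self.label, [c.copy() for c in self.children])

def all_subtrees(root: Node) -> List[Node]:
    """Collect every node so a replacement point can be chosen."""
    nodes = [root]
    for child in root.children:
        nodes.extend(all_subtrees(child))
    return nodes

def semantic_neighbors(candidate: Node, library: List[Node], k: int = 10) -> List[Node]:
    """Generate k neighbors by replacing a random subtree of the candidate
    with a copy of a program drawn from the behavior library, so each
    neighbor tends to encode a different agent behavior."""
    neighbors = []
    for _ in range(k):
        neighbor = candidate.copy()
        target = random.choice(all_subtrees(neighbor))
        replacement = random.choice(library).copy()
        # Overwrite the chosen subtree in place with the library program.
        target.label, target.children = replacement.label, replacement.children
        neighbors.append(neighbor)
    return neighbors

def hill_climb(initial: Node,
               library: List[Node],
               evaluate: Callable[[Node], float],
               budget: int = 1000) -> Node:
    """Simple hill climbing over the semantic neighborhood; `evaluate` would
    score a program by running it as a policy (e.g., win rate in MicroRTS)."""
    best, best_score = initial, evaluate(initial)
    evaluations = 1
    while evaluations < budget:
        for neighbor in semantic_neighbors(best, library):
            score = evaluate(neighbor)
            evaluations += 1
            if score > best_score:
                best, best_score = neighbor, score
            if evaluations >= budget:
                break
    return best
```

In this sketch, the library plays the role of the approximated semantic space: instead of mutating individual syntax tokens, each search step swaps in a whole subprogram with known, distinct behavior, which is the property the paper argues makes the search more sample-efficient.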