Feudal Graph Reinforcement Learning (2304.05099v6)
Abstract: Graph-based representations and message-passing modular policies constitute prominent approaches to tackling composable control problems in reinforcement learning (RL). However, as shown by recent graph deep learning literature, such local message-passing operators can create information bottlenecks and hinder global coordination. The issue becomes more serious in tasks requiring high-level planning. In this work, we propose a novel methodology, named Feudal Graph Reinforcement Learning (FGRL), that addresses such challenges by relying on hierarchical RL and a pyramidal message-passing architecture. In particular, FGRL defines a hierarchy of policies where high-level commands are propagated from the top of the hierarchy down through a layered graph structure. The bottom layers mimic the morphology of the physical system, while the upper layers correspond to higher-order sub-modules. The resulting agents are then characterized by a committee of policies where actions at a certain level set goals for the level below, yielding a hierarchical decision-making structure that naturally supports task decomposition. We evaluate the proposed framework on a graph clustering problem and MuJoCo locomotion tasks; simulation results show that FGRL compares favorably against relevant baselines. Furthermore, an in-depth analysis of the command propagation mechanism provides evidence that the introduced message-passing scheme favors learning hierarchical decision-making policies.
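The top-down command flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the mean pooling used to build upper-level observations, the fixed rule `toy_policy`, and the node names are all assumptions made for illustration, and learning is omitted entirely. The sketch only shows how an output at one level becomes the goal received by the level below, with the bottom level producing the actions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """One node of the hierarchy graph (hypothetical structure for this sketch)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def aggregate_up(node: Node, leaf_obs: Dict[str, float], obs: Dict[str, float]) -> float:
    """Bottom-up pass: a non-leaf node observes the mean of its children.
    Mean pooling is an assumption of this sketch, not taken from the paper."""
    if not node.children:
        obs[node.name] = leaf_obs[node.name]
    else:
        values = [aggregate_up(child, leaf_obs, obs) for child in node.children]
        obs[node.name] = sum(values) / len(values)
    return obs[node.name]


def propagate_down(
    node: Node,
    obs: Dict[str, float],
    policy: Callable[[float, float], float],
    command: float,
    actions: Dict[str, float],
) -> None:
    """Top-down pass: each node maps (observation, received command) to an output.
    A non-leaf output is the command (goal) sent to its children;
    a leaf output is the action applied to the physical system."""
    output = policy(obs[node.name], command)
    if not node.children:
        actions[node.name] = output
    for child in node.children:
        propagate_down(child, obs, policy, output, actions)


def toy_policy(observation: float, command: float) -> float:
    """Stand-in for a learned policy: a fixed linear rule (assumption)."""
    return observation + 0.5 * command


def act(root: Node, leaf_obs: Dict[str, float]) -> Dict[str, float]:
    """Run one bottom-up aggregation followed by one top-down command pass."""
    obs: Dict[str, float] = {}
    aggregate_up(root, leaf_obs, obs)
    actions: Dict[str, float] = {}
    # The top-level node receives no command from above, so it starts at 0.0.
    propagate_down(root, obs, toy_policy, 0.0, actions)
    return actions


# Three-level pyramid: one manager, two sub-managers, four actuator-level workers.
left = Node("left", [Node("hip_l"), Node("knee_l")])
right = Node("right", [Node("hip_r"), Node("knee_r")])
root = Node("manager", [left, right])

actions = act(root, {"hip_l": 1.0, "knee_l": 3.0, "hip_r": 2.0, "knee_r": 4.0})
print(actions)
```

In this toy pyramid the bottom level stands for the actuators of the physical system, and the two intermediate nodes stand for higher-order sub-modules. A single shared rule is used at every level only to keep the sketch short.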