
Heterogeneous Knowledge for Augmented Modular Reinforcement Learning (2306.01158v3)

Published 1 Jun 2023 in cs.LG and cs.AI

Abstract: Existing modular Reinforcement Learning (RL) architectures are generally based on reusable components that allow "plug-and-play" integration. However, these modules are homogeneous in nature; in essence, they provide policies obtained via RL through the maximization of individual reward functions. Consequently, such solutions still lack the ability to integrate and process multiple types of information (i.e., heterogeneous knowledge representations), such as rules, sub-goals, and skills from various sources. In this paper, we discuss several practical examples of heterogeneous knowledge and propose Augmented Modular Reinforcement Learning (AMRL) to address these limitations. Our framework uses a selector to combine heterogeneous modules and to seamlessly incorporate different types of knowledge representations and processing mechanisms. Our results demonstrate the performance, efficiency, and generalization improvements that can be achieved by augmenting traditional modular RL with heterogeneous knowledge sources and processing mechanisms. Finally, we examine the safety, robustness, and interpretability issues stemming from the introduction of knowledge heterogeneity.
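
The abstract's description of AMRL, a selector arbitrating over heterogeneous modules such as rules, sub-goals, and skills, lends itself to a short illustration. The Python sketch below is a minimal, hypothetical rendering of that idea, not the authors' implementation: every name in it (KnowledgeModule, RuleModule, PolicyModule, Selector, propose) is invented for this example, and the confidence-based arbitration is just one plausible selector; the paper's actual mechanism may differ (e.g., a learned gating function).

```python
# Illustrative sketch of a selector over heterogeneous knowledge modules,
# in the spirit of the AMRL abstract. All names are hypothetical, not the
# paper's API; the arbitration rule is an assumption made for this example.
from abc import ABC, abstractmethod
import random


class KnowledgeModule(ABC):
    """Common interface so rules, skills, and learned policies are interchangeable."""

    @abstractmethod
    def propose(self, obs):
        """Return (action, confidence) for the current observation, or None."""


class RuleModule(KnowledgeModule):
    """Wraps a hand-written rule, e.g. 'if the key is visible, go to it'."""

    def __init__(self, condition, action):
        self.condition = condition
        self.action = action

    def propose(self, obs):
        if self.condition(obs):
            return self.action, 1.0  # a firing rule proposes with full confidence
        return None


class PolicyModule(KnowledgeModule):
    """Wraps a learned RL policy; stubbed here with a random action."""

    def __init__(self, actions):
        self.actions = actions

    def propose(self, obs):
        return random.choice(self.actions), 0.5  # stand-in for a trained policy


class Selector:
    """Arbitrates among heterogeneous modules; here, highest confidence wins."""

    def __init__(self, modules):
        self.modules = modules

    def act(self, obs):
        proposals = [p for m in self.modules if (p := m.propose(obs)) is not None]
        # PolicyModule always proposes, so the list is never empty.
        action, _ = max(proposals, key=lambda p: p[1])
        return action


# Usage: the rule takes precedence whenever it applies; the policy acts otherwise.
selector = Selector([
    RuleModule(condition=lambda obs: obs.get("key_visible", False), action="go_to_key"),
    PolicyModule(actions=["left", "right", "forward"]),
])
print(selector.act({"key_visible": True}))   # -> "go_to_key"
print(selector.act({"key_visible": False}))  # -> a policy-chosen action
```

The point the sketch tries to capture is the shared propose interface: once rules and learned policies expose the same contract, a selector can combine them without caring which kind of knowledge each module encodes, which is the heterogeneity the abstract emphasizes.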

