Multi-granularity Knowledge Transfer for Continual Reinforcement Learning

Published 25 Jan 2024 in cs.LG and cs.AI | arXiv:2401.15098v3

Abstract: Continual reinforcement learning (CRL) empowers RL agents to learn a sequence of tasks, accumulating knowledge from past tasks and reusing it for problem-solving and future task learning. However, existing methods often focus on transferring fine-grained knowledge across similar tasks, neglecting the multi-granularity structure of human cognitive control and resulting in insufficient knowledge transfer across diverse tasks. To enhance coarse-grained knowledge transfer, we propose a novel framework called MT-Core (shorthand for Multi-granularity knowledge Transfer for Continual reinforcement learning). The key characteristic of MT-Core is multi-granularity policy learning: 1) a coarse-grained policy that leverages the reasoning ability of a large language model (LLM) to set goals, and 2) a fine-grained policy learned through RL and guided by those goals. We also construct a new policy library (knowledge base) that stores policies which can be retrieved for multi-granularity knowledge transfer. Experimental results demonstrate the superiority of the proposed MT-Core over popular baselines on diverse CRL tasks.
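To make the described loop concrete, below is a minimal, self-contained Python sketch of the architecture the abstract outlines: an LLM-driven coarse-grained policy sets goals, a goal-conditioned fine-grained policy acts and learns under those goals, and a policy library stores and retrieves policies across tasks. Every name in it (ToyEnv, GoalConditionedPolicy, PolicyLibrary, llm_set_goal, run_task) is hypothetical, and the LLM call and RL update are stubbed; it is an illustration of the idea, not the paper's implementation.

```python
# Minimal sketch (not the authors' code) of a multi-granularity CRL loop:
# coarse-grained goal setting via an LLM, fine-grained goal-conditioned RL,
# and a policy library for knowledge transfer across tasks.
import random
from dataclasses import dataclass, field


class ToyEnv:
    """Stand-in environment: integer states, two actions, random rewards."""

    def reset(self) -> int:
        self.state = 0
        return self.state

    def step(self, action: int):
        self.state += 1
        reward = random.random()
        done = self.state >= 10
        return self.state, reward, done


class GoalConditionedPolicy:
    """Fine-grained policy stub; a real one might be goal-conditioned PPO."""

    def act(self, obs: int, goal: str) -> int:
        return random.choice([0, 1])

    def update(self, obs, goal, action, reward, next_obs) -> None:
        pass  # an RL update oriented by the goal would go here


@dataclass
class PolicyLibrary:
    """Knowledge base mapping task descriptions to learned policies."""

    entries: dict = field(default_factory=dict)

    def store(self, task: str, policy) -> None:
        self.entries[task] = policy

    def retrieve(self, task: str):
        # A real implementation would use embedding similarity search
        # (e.g., FAISS) instead of exact-match lookup.
        return self.entries.get(task)


def llm_set_goal(task: str, obs: int) -> str:
    """Coarse-grained policy: ask an LLM to propose a subgoal (stubbed)."""
    return f"reach a state useful for '{task}' from state {obs}"


def run_task(env: ToyEnv, task: str, library: PolicyLibrary, steps: int = 50):
    # Multi-granularity transfer: warm-start from a retrieved policy if a
    # similar task was solved before; otherwise start fresh.
    policy = library.retrieve(task) or GoalConditionedPolicy()
    obs = env.reset()
    for _ in range(steps):
        goal = llm_set_goal(task, obs)        # coarse-grained: set a goal
        action = policy.act(obs, goal)        # fine-grained: act toward it
        next_obs, reward, done = env.step(action)
        policy.update(obs, goal, action, reward, next_obs)
        obs = env.reset() if done else next_obs
    library.store(task, policy)               # accumulate knowledge
    return policy


if __name__ == "__main__":
    library = PolicyLibrary()
    for task in ["open the door", "open the red door"]:
        run_task(ToyEnv(), task, library)
```

In the paper's setting, the retrieval step would presumably match tasks by similarity (e.g., over task-description embeddings) rather than exact strings, so that knowledge transfers across related but non-identical tasks.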
