
Recurrent Hypernetworks are Surprisingly Strong in Meta-RL

Published 26 Sep 2023 in cs.LG, cs.AI, and cs.RO (arXiv:2309.14970v4)

Abstract: Deep reinforcement learning (RL) is notoriously impractical to deploy due to sample inefficiency. Meta-RL directly addresses this sample inefficiency by learning to perform few-shot learning when a distribution of related tasks is available for meta-training. While many specialized meta-RL methods have been proposed, recent work suggests that end-to-end learning in conjunction with an off-the-shelf sequential model, such as a recurrent network, is a surprisingly strong baseline. However, such claims have been controversial due to limited supporting evidence, particularly in the face of prior work establishing precisely the opposite. In this paper, we conduct an empirical investigation. While we likewise find that a recurrent network can achieve strong performance, we demonstrate that the use of hypernetworks is crucial to maximizing its potential. Surprisingly, when combined with hypernetworks, recurrent baselines that are far simpler than existing specialized methods actually achieve the strongest performance of all methods evaluated. We provide code at https://github.com/jacooba/hyper.
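The architecture the abstract describes pairs a recurrent network, which summarizes the agent's experience on the current task, with a hypernetwork that maps that summary to the weights of the policy itself. The following is a minimal sketch of that idea, not the authors' implementation: all layer sizes, names, and the single-linear-layer hypernetwork are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not from the paper).
hidden_dim, obs_dim, act_dim, policy_hidden = 8, 4, 2, 16

# Hypernetwork: one linear layer that emits every policy parameter at once,
# conditioned on the recurrent network's hidden state.
n_params = (obs_dim * policy_hidden + policy_hidden    # policy layer 1 (W1, b1)
            + policy_hidden * act_dim + act_dim)       # policy layer 2 (W2, b2)
hyper_W = rng.normal(0.0, 0.1, size=(n_params, hidden_dim))
hyper_b = np.zeros(n_params)

def policy_from_hidden(h, obs):
    """Generate policy weights from the RNN hidden state h, then act on obs."""
    theta = hyper_W @ h + hyper_b          # all policy parameters, flattened
    i = 0
    W1 = theta[i:i + obs_dim * policy_hidden].reshape(obs_dim, policy_hidden)
    i += obs_dim * policy_hidden
    b1 = theta[i:i + policy_hidden]
    i += policy_hidden
    W2 = theta[i:i + policy_hidden * act_dim].reshape(policy_hidden, act_dim)
    i += policy_hidden * act_dim
    b2 = theta[i:i + act_dim]
    x = np.tanh(obs @ W1 + b1)             # task-conditioned policy forward pass
    return x @ W2 + b2                     # action logits / means

h = rng.normal(size=hidden_dim)            # stand-in for an RNN hidden state
obs = rng.normal(size=obs_dim)
action = policy_from_hidden(h, obs)
print(action.shape)                        # one action vector of size act_dim
```

The key design choice this sketch captures is that task adaptation happens through the generated weights rather than through extra policy inputs: as the hidden state changes with new experience, the entire policy changes with it.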
