
Language Guided Exploration for RL Agents in Text Environments (2403.03141v1)

Published 5 Mar 2024 in cs.CL

Abstract: Real-world sequential decision making is characterized by sparse rewards and large decision spaces, posing significant difficulty for experiential learning systems like $\textit{tabula rasa}$ reinforcement learning (RL) agents. LLMs, with a wealth of world knowledge, can help RL agents learn quickly and adapt to distribution shifts. In this work, we introduce the Language Guided Exploration (LGE) framework, which uses a pre-trained LLM (called GUIDE) to provide decision-level guidance to an RL agent (called EXPLORER). We observe that on ScienceWorld (Wang et al., 2022), a challenging text environment, LGE significantly outperforms vanilla RL agents and also outperforms other sophisticated methods like Behaviour Cloning and Text Decision Transformer.
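The abstract describes decision-level guidance: a pre-trained LLM (GUIDE) narrows the RL agent's (EXPLORER's) large action space before the agent chooses. A minimal sketch of that idea is shown below; the function names (`guide_score`, `guided_action_set`), the keyword-overlap scorer standing in for the LLM, and the top-k cutoff are all illustrative assumptions, not the paper's actual architecture.

```python
def guide_score(observation: str, action: str) -> float:
    """Stand-in for a pre-trained LLM (GUIDE) scoring how relevant an
    action is to the current observation. Here: trivial word overlap."""
    obs_words = set(observation.lower().split())
    act_words = set(action.lower().split())
    return len(obs_words & act_words) / max(len(act_words), 1)

def guided_action_set(observation: str, candidate_actions: list[str],
                      top_k: int = 3) -> list[str]:
    """Restrict EXPLORER's decision space to GUIDE's top-k actions."""
    ranked = sorted(candidate_actions,
                    key=lambda a: guide_score(observation, a),
                    reverse=True)
    return ranked[:top_k]

obs = "You are in the kitchen. A stove and a pot of water are here."
actions = ["activate stove", "read book", "pour water",
           "go to hallway", "open door"]
allowed = guided_action_set(obs, actions, top_k=2)
# EXPLORER would then sample or learn over `allowed` rather than the
# full combinatorial action space of the text environment.
```

In this toy run, "activate stove" and "pour water" survive the filter because they overlap with the observation, so the RL agent explores a far smaller, more plausible set of actions.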

References (23)
  1. Language reward modulation for pretraining reinforcement learning. ArXiv, abs/2308.12270.
  2. Prithviraj Ammanabrolu and Matthew Hausknecht. 2020. Graph constrained reinforcement learning for natural language action spaces. In International Conference on Learning Representations.
  3. Case-based reasoning for better generalization in textual reinforcement learning. In International Conference on Learning Representations.
  4. Procedure planning in instructional videos. In European Conference on Computer Vision, pages 334–350. Springer.
  5. Decision transformer: Reinforcement learning via sequence modeling. arXiv preprint arXiv:2106.01345.
  6. TextWorld: A learning environment for text-based games. Computer Games, page 41–75.
  7. BERT: Pre-training of deep bidirectional transformers for language understanding. ArXiv, abs/1810.04805.
  8. Guiding pretraining in reinforcement learning with large language models. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 8657–8677. PMLR.
  9. SimCSE: Simple contrastive learning of sentence embeddings. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6894–6910, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
  10. Interactive fiction games: A colossal adventure. In AAAI Conference on Artificial Intelligence.
  11. Deep reinforcement learning with a combinatorial action space for predicting popular Reddit threads. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1838–1848, Austin, Texas. Association for Computational Linguistics.
  12. Language models as zero-shot planners: Extracting actionable knowledge for embodied agents. In International Conference on Machine Learning, pages 9118–9147. PMLR.
  13. Special feature zork: A computerized fantasy simulation game. Computer, 12(4):51–59.
  14. Pre-trained language models for interactive decision-making. Advances in Neural Information Processing Systems, 35:31199–31212.
  15. Text-based rl agents with commonsense knowledge: New challenges, environments and baselines.
  16. Pretrained language models as visual planners for human assistance. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 15302–15314.
  17. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.
  18. Oyvind Tafjord and Peter Clark. 2021. General-purpose question-answering with Macaw. ArXiv, abs/2109.02593.
  19. Behavior cloned transformers are neurosymbolic reasoners. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 2777–2788, Dubrovnik, Croatia. Association for Computational Linguistics.
  20. ScienceWorld: Is your agent smarter than a 5th grader? In Conference on Empirical Methods in Natural Language Processing.
  21. Keep CALM and explore: Language models for action generation in text-based games. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8736–8754, Online. Association for Computational Linguistics.
  22. Xusen Yin and Jonathan May. 2019. Learn how to cook a new recipe in a new house: Using map familiarization, curriculum learning, and bandit feedback to learn families of text-based adventure games.
  23. Learn what not to learn: Action elimination with deep reinforcement learning. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS’18, page 3566–3577, Red Hook, NY, USA. Curran Associates Inc.
Authors (5)
  1. Hitesh Golchha
  2. Sahil Yerawar
  3. Dhruvesh Patel
  4. Soham Dan
  5. Keerthiram Murugesan

