Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models (2407.03964v1)

Published 4 Jul 2024 in cs.CL and cs.LG

Abstract: Low sample efficiency is an enduring challenge of reinforcement learning (RL). With the advent of versatile LLMs, recent works use their common-sense knowledge to accelerate policy learning in RL. However, such guidance is often tailored to one specific task and loses generalizability. In this paper, we introduce a framework that harnesses LLMs to extract background knowledge of an environment, which captures a general understanding of the entire environment so that various downstream RL tasks benefit from a one-time knowledge representation. We ground LLMs by feeding them a few pre-collected experiences and asking them to delineate background knowledge of the environment. Afterward, we represent the output knowledge as potential functions for potential-based reward shaping, which preserves policy optimality with respect to the task rewards. We instantiate three variants that prompt LLMs for background knowledge: writing code, annotating preferences, and assigning goals. Our experiments show that these methods achieve significant sample-efficiency improvements on a spectrum of downstream tasks from the Minigrid and Crafter domains.
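
The mechanism the abstract relies on, potential-based reward shaping, augments the task reward with the discounted difference of a potential function over successive states, which is known to leave the optimal policy unchanged. Below is a minimal sketch of how an LLM-derived potential could plug into that scheme; `llm_potential` and its signature are illustrative assumptions for this page, not the authors' implementation.

```python
# Minimal sketch of potential-based reward shaping with an LLM-derived potential.
# NOTE: `llm_potential` is a hypothetical stand-in for the paper's three variants
# (writing code, annotating preferences, assigning goals); it is not the authors' API.

def llm_potential(state) -> float:
    """Hypothetical potential Phi(s) distilled from LLM background knowledge,
    e.g. a scalar preference score or negative distance-to-goal."""
    raise NotImplementedError  # would be produced offline from LLM outputs


def shaped_reward(reward: float, state, next_state, done: bool,
                  gamma: float = 0.99) -> float:
    """Potential-based shaping: r' = r + gamma * Phi(s') - Phi(s).

    Because the shaping term telescopes along any trajectory, the optimal
    policy of the original task is preserved.
    """
    phi_s = llm_potential(state)
    phi_next = 0.0 if done else llm_potential(next_state)
    return reward + gamma * phi_next - phi_s
```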

Authors (6)
  1. Fuxiang Zhang (9 papers)
  2. Junyou Li (13 papers)
  3. Yi-Chen Li (10 papers)
  4. Zongzhang Zhang (33 papers)
  5. Yang Yu (385 papers)
  6. Deheng Ye (50 papers)
