Sample-Efficient Behavior Cloning Using General Domain Knowledge (2501.16546v1)

Published 27 Jan 2025 in cs.AI

Abstract: Behavior cloning has shown success in many sequential decision-making tasks by learning from expert demonstrations, yet it can be very sample inefficient and fail to generalize to unseen scenarios. One approach to these problems is to introduce general domain knowledge, so that the policy can focus on the essential features and generalize to unseen states by applying that knowledge. Although this knowledge is easy to acquire from experts, it is hard to combine with learning from individual examples because neural networks lack semantic structure and feature engineering is time-consuming. To enable learning from both general knowledge and specific demonstration trajectories, we use an LLM's coding capability to instantiate a policy structure based on expert domain knowledge expressed in natural language, and we tune the parameters of that policy with demonstrations. We name this approach the Knowledge Informed Model (KIM), as the structure reflects the semantics of expert knowledge. In our experiments with lunar lander and car racing tasks, our approach learns to solve the tasks with as few as 5 demonstrations and is robust to action noise, outperforming the baseline model without domain knowledge. This indicates that, with the help of LLMs, we can incorporate domain knowledge into the structure of the policy, increasing sample efficiency for behavior cloning.
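
The idea of a knowledge-shaped policy whose parameters are tuned from a handful of demonstrations can be illustrated with a toy sketch. This is not the authors' implementation: the policy structure below is written by hand as a stand-in for code an LLM might produce from a rule such as "fire the thruster when the lander is about to drop too low", the expert and the five states are hypothetical, and a plain grid search is swapped in for whatever parameter-tuning procedure the paper uses.

```python
from itertools import product


def policy(state, gain, target):
    """Knowledge-shaped structure (hypothetical): fire when predicted height is too low.

    `gain` and `target` are the only tunable parameters.
    """
    height, velocity = state
    predicted_height = height + gain * velocity
    return 1 if predicted_height < target else 0


def expert(state):
    # Hypothetical expert, used only to generate toy demonstrations.
    return policy(state, gain=0.5, target=1.0)


def fit(demos, gains, targets):
    """Behavior cloning by grid search: minimise action disagreement on the demos."""
    def errors(params):
        g, t = params
        return sum(policy(s, g, t) != a for s, a in demos)
    return min(product(gains, targets), key=errors)


# Five toy demonstration states as (height, velocity) pairs.
states = [(2.0, -1.0), (0.5, 0.0), (1.5, -2.0), (1.2, 0.5), (0.9, 1.0)]
demos = [(s, expert(s)) for s in states]

# Candidate parameter grids (assumed ranges for this toy problem).
gains = [i / 10 for i in range(0, 11)]
targets = [i / 10 for i in range(0, 21)]

gain, target = fit(demos, gains, targets)
train_errors = sum(policy(s, gain, target) != a for s, a in demos)
print(gain, target, train_errors)
```

Because the structure already encodes which quantity matters, only two numbers have to be estimated, which is why a few demonstrations can be enough in a setting like this one.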

Authors (3)

Feiyu Zhu
Jean Oh
Reid Simmons
