E2CL: Exploration-based Error Correction Learning for Embodied Agents

Published 5 Sep 2024 in cs.CL and cs.AI (arXiv:2409.03256v2)

Abstract: LLMs are exhibiting increasing capability in knowledge utilization and reasoning. However, when applied as agents in embodied environments, they often suffer from misalignment between their intrinsic knowledge and environmental knowledge, leading to infeasible actions. Traditional environment alignment methods, such as supervised learning on expert trajectories and reinforcement learning, encounter limitations in covering environmental knowledge and achieving efficient convergence, respectively. Inspired by human learning, we propose Exploration-based Error Correction Learning (E2CL), a novel framework that leverages exploration-induced errors and environmental feedback to enhance environment alignment for embodied agents. E2CL incorporates teacher-guided and teacher-free explorations to gather environmental feedback and correct erroneous actions. The agent learns to provide feedback and self-correct, thereby enhancing its adaptability to target environments. Extensive experiments in the VirtualHome environment demonstrate that E2CL-trained agents outperform those trained by baseline methods and exhibit superior self-correction capabilities.
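The abstract describes a loop in which the agent explores, the environment flags infeasible actions, and corrections (teacher-guided or self-generated) are collected for training. The following toy sketch illustrates that data-collection loop under stated assumptions; all names (`env_feedback`, `teacher_correct`, the action set) are illustrative stand-ins, not the paper's actual implementation.

```python
import random

# Toy environment knowledge: the set of actions that are actually feasible.
FEASIBLE = {"walk to kitchen", "grab cup", "open fridge"}

def env_feedback(action):
    """Environment checks feasibility and returns error feedback, or None."""
    if action in FEASIBLE:
        return None
    return f"infeasible: '{action}' cannot be executed here"

def agent_policy(state):
    """An unaligned agent may propose actions outside the environment's affordances."""
    return random.choice(["grab cup", "fly to moon", "open fridge", "teleport"])

def teacher_correct(action):
    """Teacher-guided exploration: an oracle supplies a feasible replacement."""
    return "walk to kitchen" if action not in FEASIBLE else action

def collect_e2cl_data(num_steps, seed=0):
    """Gather (state, action, feedback, correction) tuples from exploration.

    Erroneous actions are paired with environmental feedback and a corrected
    action, so the agent can later be trained to give feedback and self-correct.
    """
    random.seed(seed)
    dataset = []
    for step in range(num_steps):
        state = f"state_{step}"
        action = agent_policy(state)
        feedback = env_feedback(action)
        if feedback is not None:                 # erroneous action detected
            corrected = teacher_correct(action)  # teacher-guided correction
            dataset.append((state, action, feedback, corrected))
        else:                                    # feasible action kept as-is
            dataset.append((state, action, "ok", action))
    return dataset
```

In this sketch every collected tuple ends in a feasible action, so fine-tuning on the dataset would expose the agent to both its own errors and their corrections, mirroring the teacher-guided half of E2CL; the teacher-free half would replace the oracle with the agent's own learned corrector.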

