E2CL: Exploration-based Error Correction Learning for Embodied Agents
Abstract: Large language models (LLMs) exhibit increasingly strong capabilities in knowledge utilization and reasoning. However, when deployed as agents in embodied environments, they often suffer from a misalignment between their intrinsic knowledge and the environment's knowledge, which leads to infeasible actions. Traditional environment-alignment methods, such as supervised learning on expert trajectories and reinforcement learning, face limitations in covering environmental knowledge and in achieving efficient convergence, respectively. Inspired by how humans learn, we propose Exploration-based Error Correction Learning (E2CL), a novel framework that leverages exploration-induced errors and environmental feedback to improve environment alignment for embodied agents. E2CL combines teacher-guided and teacher-free exploration to gather environmental feedback and correct erroneous actions. The agent learns both to provide feedback and to self-correct, thereby improving its adaptability to target environments. Extensive experiments in the VirtualHome environment show that E2CL-trained agents outperform those trained with baseline methods and exhibit superior self-correction ability.
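The two-phase exploration described above can be sketched in miniature. This is an illustrative toy only, assuming a made-up `ToyEnv` with precondition checks and a dictionary as a stand-in "memory"; the actual E2CL framework fine-tunes an LLM on the collected feedback and correction data rather than using a lookup table.

```python
CONTAINER = {"milk": "fridge"}  # each graspable object lives in a container

class ToyEnv:
    """Tiny household environment that rejects precondition-violating actions."""
    def __init__(self):
        self.opened = set()  # containers currently open

    def feedback(self, action):
        """Return None if the action is feasible, else an error message."""
        verb, obj = action.split()
        if verb == "grab" and CONTAINER.get(obj) not in self.opened:
            return f"cannot grab {obj}: {CONTAINER[obj]} is closed"
        return None

    def execute(self, action):
        verb, obj = action.split()
        if verb == "open":
            self.opened.add(obj)

def teacher_guided(env, expert_plan, propose):
    """Phase 1: roll out the expert plan; whenever the agent's own proposal
    is infeasible, record (wrong action, feedback, expert correction)."""
    corrections = []
    for step, expert_action in enumerate(expert_plan):
        proposal = propose(step)
        error = env.feedback(proposal)
        if error is not None:
            corrections.append((proposal, error, expert_action))
        env.execute(expert_action)  # the teacher keeps the rollout on track
    return corrections

def teacher_free(env, plan, memory):
    """Phase 2: the agent acts alone and self-corrects infeasible actions
    using what it learned from phase-1 feedback."""
    trace = []
    for action in plan:
        if env.feedback(action) is not None and action in memory:
            fix = memory[action]  # apply the learned correction first
            env.execute(fix)
            trace.append(fix)
        if env.feedback(action) is None:
            env.execute(action)
            trace.append(action)
    return trace

# A naive agent that always tries to grab the milk without opening anything.
corrections = teacher_guided(ToyEnv(), ["open fridge", "grab milk"],
                             lambda step: "grab milk")
memory = {bad: fix for bad, _, fix in corrections}
trace = teacher_free(ToyEnv(), ["grab milk"], memory)
print(memory)  # {'grab milk': 'open fridge'}
print(trace)   # ['open fridge', 'grab milk']
```

The dictionary lookup here compresses what the paper trains into model weights: the environmental feedback string is the supervision signal, and the expert action paired with it is the correction target.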