Do We Really Need a Complex Agent System? Distill Embodied Agent into a Single Model (2404.04619v1)

Published 6 Apr 2024 in cs.AI and cs.CV

Abstract: With the power of LLMs, open-ended embodied agents can flexibly understand human instructions, generate interpretable guidance strategies, and output executable actions. Multi-modal LLMs (MLMs) integrate multi-modal signals into LLMs, bringing richer perception to embodied agents and allowing them to perceive and understand the world more finely. However, existing works 1) run the perception-to-action pipeline as independent agents, each containing multiple LLMs, leaving gaps between complex tasks and their execution; 2) train MLMs on static data, so they struggle with the dynamics of open-ended scenarios; and 3) feed prior knowledge in directly as prompts, limiting application flexibility. We propose STEVE-2, a hierarchical knowledge distillation framework for open-ended embodied tasks, characterized by 1) a hierarchical system for multi-granular task division, 2) a mirrored distillation method for parallel simulation data, and 3) an extra expert model that brings additional knowledge into the parallel simulation. After distillation, embodied agents can complete complex, open-ended tasks without additional expert guidance, drawing on the performance and knowledge of a versatile MLM. Extensive evaluations on navigation and creation tasks highlight the superior performance of STEVE-2, with $1.4\times$ - $7.3\times$ improvements on open-ended tasks.
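
The paper's mirrored distillation objective is specified in the full text; as a point of reference, below is a minimal sketch of the standard soft-label knowledge distillation loss (Hinton et al., 2015) that single-model distillation setups of this kind typically build on. The function name, tensor shapes, and temperature value are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Soft-label knowledge distillation loss (Hinton et al., 2015).

    Both logit tensors have shape (batch, num_actions). The teacher stands
    in for a frozen hierarchical agent system; the student is the single
    model being distilled. This is a generic sketch, not STEVE-2's exact
    mirrored-distillation objective.
    """
    # Soften both output distributions with the same temperature.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student); the T^2 factor keeps gradient magnitudes
    # comparable across temperature settings.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy check: random "teacher" and "student" outputs over 32 actions.
if __name__ == "__main__":
    teacher = torch.randn(4, 32)                        # frozen teacher outputs
    student = torch.randn(4, 32, requires_grad=True)    # trainable student
    loss = distillation_loss(student, teacher)
    loss.backward()    # gradients flow to the student only
    print(f"distillation loss: {loss.item():.4f}")
```

In practice, the teacher logits would come from rolling out the full hierarchical system on parallel simulation data, and the loss would be minimized over the student's parameters so that a single model reproduces the system's behavior without expert guidance at inference time.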

Authors (9)
  1. Zhonghan Zhao (11 papers)
  2. Ke Ma (75 papers)
  3. Wenhao Chai (50 papers)
  4. Xuan Wang (205 papers)
  5. Kewei Chen (13 papers)
  6. Dongxu Guo (5 papers)
  7. Yanting Zhang (26 papers)
  8. Hongwei Wang (150 papers)
  9. Gaoang Wang (68 papers)
Citations (9)