- The paper introduces CaPo, a framework that uses meta-plan generation and adaptive execution to improve multi-agent cooperation efficiency.
- It leverages LLM prompting and multi-turn discussion to strategically decompose tasks and continuously refine the plan during execution.
- Experimental results on TDW-MAT and C-WAH tasks demonstrate significant improvements in task completion rates and collaboration efficiency.
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
Introduction
The paper introduces Cooperative Plan Optimization (CaPo), a framework designed to enhance the cooperation efficiency of LLM-based embodied agents by addressing the shortcomings of the extemporaneous action execution used in previous methods. CaPo focuses on creating strategic, coherent cooperative plans among agents, reducing redundant steps and improving collaboration efficiency on complex tasks.
Figure 1: Example task-accomplishment procedures of CoELA and the CaPo framework.
Framework Overview
CaPo consists of two main phases: meta-plan generation and progress-adaptive meta-plan and execution. In the meta-plan generation phase, agents collaboratively formulate a meta-plan before taking any action; this initial plan decomposes the task into subtasks with detailed steps. The phase employs a meta-plan designer, responsible for drafting the meta-plan, and meta-plan evaluators, who provide feedback. The progress-adaptive meta-plan and execution phase keeps the plan effective as agents make new progress, adapting it to align with the latest task achievements.
Figure 2: Overview of the CaPo framework for embodied multi-agent cooperation.
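A minimal sketch of this two-phase loop follows, assuming a generic `llm(prompt) -> str` chat interface and toy agent/environment stubs; all names here are illustrative, not the paper's actual API:

```python
from typing import Callable, List

class Env:
    """Toy environment stub; a real embodied simulator replaces this."""
    def __init__(self) -> None:
        self.steps = 0
    def observe(self, agent: str) -> str:
        return f"{agent} sees nothing new"
    def step(self, agent: str, action: str) -> None:
        self.steps += 1
    def task_done(self) -> bool:
        return self.steps >= 4

def capo_episode(agents: List[str], env: Env,
                 llm: Callable[[str], str], max_steps: int = 100) -> bool:
    # Phase 1: draft a meta-plan before any action is taken
    # (the multi-turn discussion that refines it is sketched further below).
    meta_plan = llm("Decompose the task into ordered subtasks and "
                    "assign them to agents: " + ", ".join(agents))

    # Phase 2: progress-adaptive execution.
    for _ in range(max_steps):
        for agent in agents:
            obs = env.observe(agent)
            # Replan only when the observation reports meaningful progress,
            # e.g. a newly discovered object or a completed subtask.
            if "new progress" in obs:
                meta_plan = llm(f"Update the meta-plan given: {obs}\n"
                                f"Current plan:\n{meta_plan}")
            action = llm(f"Plan:\n{meta_plan}\nObservation: {obs}\n"
                         f"Next action for {agent}:")
            env.step(agent, action)
        if env.task_done():
            return True
    return False
```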
The meta-plan generation phase initializes a meta-plan via LLM prompting, structuring the task decomposition into subtasks and enabling strategic, long-term cooperation among agents. A multi-turn discussion among agents follows, in which each agent uses its partial observations to evaluate and refine the meta-plan until a consensus cooperation strategy is reached.
Figure 3: Examples of the evaluation and optimization process via multi-turn discussion among agents.
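The discussion itself can be sketched as a critique-and-revise loop. The prompts, the AGREE convention, and the `Agent` stub below are assumptions for illustration, not the paper's exact protocol:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Agent:
    name: str
    observation: str  # the agent's partial view of the scene

def refine_via_discussion(meta_plan: str, agents: List[Agent],
                          llm: Callable[[str], str],
                          max_rounds: int = 3) -> str:
    """Evaluator agents critique the designer's draft until consensus."""
    for _ in range(max_rounds):
        feedback = []
        for agent in agents:
            critique = llm(
                f"You are {agent.name} and observe: {agent.observation}\n"
                f"Meta-plan:\n{meta_plan}\n"
                "Reply AGREE if the plan is feasible from your view; "
                "otherwise list concrete revisions."
            )
            feedback.append(critique)
        if all("AGREE" in f for f in feedback):
            return meta_plan  # consensus reached; stop discussing
        # The designer revises the plan against the collected feedback.
        meta_plan = llm(
            f"Revise this meta-plan:\n{meta_plan}\n"
            f"Feedback from evaluators: {feedback}"
        )
    return meta_plan
```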
Progress-Adaptive Planning and Execution
This phase dynamically adapts the meta-plan based on the latest task progress, such as discovering new objects or completing subtasks. Agents again use multi-turn discussion to refine the meta-plan, keeping it aligned with the current task status and maintaining effective cooperation throughout execution. One way to realize the replanning trigger is sketched below.
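This is again a hedged sketch, with hypothetical event kinds rather than the paper's implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ProgressEvent:
    kind: str         # e.g. "object_discovered", "subtask_completed"
    description: str  # natural-language summary of what changed

def maybe_replan(meta_plan: str, event: ProgressEvent,
                 llm: Callable[[str], str]) -> str:
    """Replan only when progress actually invalidates the current plan."""
    if event.kind not in {"object_discovered", "subtask_completed"}:
        return meta_plan  # no meaningful progress; keep executing as-is
    return llm(
        f"Current meta-plan:\n{meta_plan}\n"
        f"New progress: {event.description}\n"
        "Update the remaining subtasks and their assignment so that "
        "no agent performs redundant steps."
    )
```

Gating the call this way keeps LLM queries infrequent: routine observations execute against the existing plan, and only genuine progress pays the cost of a discussion round.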

Figure 4: Comparison of Transport Rate (%) between CoELA and CaPo using GPT-3.5 under different time steps.
Experimental Results
The paper presents experimental comparisons on the ThreeDWorld Multi-Agent Transport (TDW-MAT) and Communicative Watch-And-Help (C-WAH) tasks. CaPo achieves higher task completion rates and better cooperative efficiency than prior methods such as CoELA. Results show notable gains from the optimized meta-plan approach across LLM backbones including GPT-3.5, LLaMA-2, and GPT-4.
Figure 5: Examples of cooperative behaviors guided by the meta-plan, improving task allocation and efficiency.
Conclusion
CaPo effectively addresses the challenges of embodied multi-agent cooperation by utilizing LLMs for strategic and adaptive planning, leading to enhanced cooperation efficiency, particularly in complex task environments. While CaPo demonstrates significant improvements, its performance depends on the capabilities of the underlying LLM and may vary with model strength. Future work could explore reducing this dependency.
In summary, CaPo paves the way for more efficient and strategic cooperation among embodied agents in complex task environments, leveraging LLM strengths to optimize collaborative strategies.