MARCO: Multi-Agent Real-time Chat Orchestration (2410.21784v1)
Abstract: LLM advancements have enabled the development of multi-agent frameworks that tackle complex, real-world problems such as automating tasks requiring interaction with diverse tools, reasoning, and human collaboration. We present MARCO, a Multi-Agent Real-time Chat Orchestration framework for automating tasks using LLMs. MARCO addresses key challenges in utilizing LLMs for complex, multi-step task execution. It incorporates robust guardrails to steer LLM behavior, validate outputs, and recover from errors stemming from inconsistent output formatting, function and parameter hallucination, and lack of domain knowledge. Through extensive experiments, we demonstrate MARCO's superior performance, with 94.48% and 92.74% task-execution accuracy on the Digital Restaurant Service Platform and Retail conversation datasets respectively, along with a 44.91% latency improvement and a 33.71% cost reduction. We also report the effect of guardrails on performance, along with comparisons of various LLM models, both open-source and proprietary. MARCO's modular, generic design allows it to be adapted to automate tasks across domains and to execute complex use cases through multi-turn interactions.
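The guardrails described above validate LLM outputs before execution, catching function and parameter hallucination. A minimal sketch of such a validation step is shown below; all names here (`TOOL_REGISTRY`, `validate_call`, the example tools) are illustrative assumptions, not MARCO's actual implementation.

```python
# Illustrative guardrail: check an LLM's proposed tool call against a
# registry of known functions and their expected parameters, returning
# recoverable errors instead of executing a hallucinated call.

TOOL_REGISTRY = {
    "get_order_status": {"order_id"},
    "cancel_order": {"order_id", "reason"},
}

def validate_call(name, params):
    """Return a list of guardrail errors for a proposed tool call."""
    errors = []
    if name not in TOOL_REGISTRY:        # function hallucination
        errors.append(f"unknown function: {name}")
        return errors
    expected = TOOL_REGISTRY[name]
    missing = expected - params.keys()   # required params the LLM omitted
    extra = params.keys() - expected     # parameter hallucination
    if missing:
        errors.append(f"missing params: {sorted(missing)}")
    if extra:
        errors.append(f"hallucinated params: {sorted(extra)}")
    return errors
```

In a framework like MARCO, a non-empty error list would typically be fed back to the LLM as a correction prompt so it can retry, rather than failing the whole task.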