
A Dynamic LLM-Powered Agent Network for Task-Oriented Agent Collaboration

Published 3 Oct 2023 in cs.CL, cs.AI, and cs.MA | arXiv:2310.02170v2

Abstract: Recent studies show that collaborating multiple LLM powered agents is a promising way for task solving. However, current approaches are constrained by using a fixed number of agents and static communication structures. In this work, we propose automatically selecting a team of agents from candidates to collaborate in a dynamic communication structure toward different tasks and domains. Specifically, we build a framework named Dynamic LLM-Powered Agent Network ($\textbf{DyLAN}$) for LLM-powered agent collaboration, operating a two-stage paradigm: (1) Team Optimization and (2) Task Solving. During the first stage, we utilize an $\textit{agent selection}$ algorithm, based on an unsupervised metric called $\textit{Agent Importance Score}$, enabling the selection of best agents according to their contributions in a preliminary trial, oriented to the given task. Then, in the second stage, the selected agents collaborate dynamically according to the query. Empirically, we demonstrate that DyLAN outperforms strong baselines in code generation, decision-making, general reasoning, and arithmetic reasoning tasks with moderate computational cost. On specific subjects in MMLU, selecting a team of agents in the team optimization stage improves accuracy by up to 25.0% in DyLAN.


Summary

  • The paper introduces DyLAN, which leverages adaptive multi-round interactions and dynamic agent selection to enhance task-oriented collaboration.
  • It implements inference-time agent selection and automatic team optimization using an unsupervised Agent Importance Score to efficiently allocate computational resources.
  • The approach improves performance in reasoning and code generation tasks, achieving up to 35.7% accuracy on MATH and 82.9% Pass@1 on code generation benchmarks.

Introduction

LLMs have demonstrated efficacy across a range of tasks, including reasoning and code generation. Conventional approaches to agent collaboration use static architectures with fixed roles, which limits adaptability and requires extensive human oversight. The paper introduces the Dynamic LLM-Powered Agent Network (DyLAN), which constructs adaptive interaction architectures that improve both performance and efficiency. DyLAN incorporates inference-time agent selection and an early-stopping mechanism, allowing agents to interact over multiple rounds. It further employs an automatic agent team optimization algorithm based on the Agent Importance Score, an unsupervised metric, to select the most contributory agents.

Figure 1: Overview of DyLAN showing its feed-forward answer output mechanism.

Dynamic Interaction Architecture

DyLAN adopts a feed-forward network structure: agents at different time steps serve as nodes, and the messages they exchange form directed edges. Multi-round interactions are thus organized into a multilayered architecture that can be dynamically reconfigured according to query-specific requirements. This yields task-agnostic, efficient interaction patterns and improves generalization and adaptability without relying on task-specific designs.
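The layered, feed-forward view described above can be sketched in a few lines. This is an illustrative data-structure sketch only; the names (`AgentNode`, `build_network`) and the fully connected layer-to-layer wiring are our assumptions, not the paper's implementation.

```python
# Sketch of DyLAN's feed-forward view: one network layer per interaction
# round, one node per (agent, round), directed edges between adjacent layers.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentNode:
    agent_id: str   # which agent (role/persona)
    time_step: int  # which interaction round (network layer)

def build_network(agent_ids, num_rounds):
    """Arrange agents as a layered graph: layer t feeds layer t+1."""
    layers = [[AgentNode(a, t) for a in agent_ids] for t in range(num_rounds)]
    edges = []  # directed message edges between consecutive rounds
    for t in range(num_rounds - 1):
        for src in layers[t]:
            for dst in layers[t + 1]:
                edges.append((src, dst))
    return layers, edges

layers, edges = build_network(["coder", "tester", "reviewer"], num_rounds=3)
# 3 agents x 3 rounds = 9 nodes; 2 layer transitions x 3x3 = 18 edges
```

Because the structure is just nodes and edges, layers can be rewired or pruned per query, which is what makes the architecture dynamic rather than fixed.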

Inference-Time Agent Selection

DyLAN performs inference-time agent selection by employing an LLM-powered ranker to retain the top-performing agents based on their contributions during a designated interaction round. Low-contribution agents are deactivated, focusing computation on promising responses and improving efficiency without sacrificing performance. This mechanism mitigates the sensitivity of static agent setups and promotes consensus-reaching among agents, guided by Byzantine Consensus principles.
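The selection and early-stopping logic can be sketched as follows. The ranker here is a stand-in for the paper's LLM-powered ranker (we just sort by a precomputed score), and the 2/3 supermajority threshold is an illustrative reading of the Byzantine-consensus framing, not the paper's exact rule.

```python
# Hedged sketch: after a designated round, rank responses and keep only the
# top-k agents; stop early once a supermajority of answers agree.
from collections import Counter

def rank_responses(responses):
    # Placeholder for an LLM-powered ranking call.
    return sorted(responses, key=lambda r: r["score"], reverse=True)

def select_active_agents(responses, k):
    ranked = rank_responses(responses)
    return {r["agent"] for r in ranked[:k]}  # remaining agents are deactivated

def reached_consensus(answers, threshold=2 / 3):
    """Early stopping once a supermajority of agents agree on one answer."""
    _, count = Counter(answers).most_common(1)[0]
    return count / len(answers) >= threshold

responses = [
    {"agent": "A", "score": 0.9, "answer": "42"},
    {"agent": "B", "score": 0.4, "answer": "17"},
    {"agent": "C", "score": 0.8, "answer": "42"},
]
active = select_active_agents(responses, k=2)                # keeps A and C
stop = reached_consensus([r["answer"] for r in responses])   # 2 of 3 agree
```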

Agent Team Optimization

Agent team optimization in DyLAN follows a three-step procedure: propagation, aggregation, and selection. Agents rate their predecessors' solutions, and these peer ratings are aggregated to quantify each agent's contribution over time as an Agent Importance Score. The score is then used to select an optimal subset of agents, yielding team compositions tailored to specific tasks and domains. This circumvents manual role designation, allowing automatic adaptation and fostering robust team configuration.

Figure 2: Impact of optimized agent team size on MMLU dataset performance and efficiency.
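The propagation, aggregation, and selection steps can be sketched as below. The backward credit propagation and the simple multiplicative weighting are simplified illustrations, not the paper's exact formulation, and the helper names are ours.

```python
# Simplified sketch of the Agent Importance Score: peer ratings are
# propagated backward from the final round and summed per agent.

def agent_importance(ratings, agents):
    """ratings[t] maps (rater, rated) -> peer score given in round t."""
    importance = {a: 0.0 for a in agents}
    credit = {a: 1.0 for a in agents}  # credit flowing back from the last layer
    for round_ratings in reversed(ratings):
        new_credit = {a: 0.0 for a in agents}
        for (rater, rated), score in round_ratings.items():
            share = credit[rater] * score  # propagate the rater's credit
            importance[rated] += share     # aggregate per agent
            new_credit[rated] += share
        credit = new_credit
    return importance

def select_team(importance, k):
    """Selection: keep the k agents with the highest aggregated score."""
    return sorted(importance, key=importance.get, reverse=True)[:k]

agents = ["A", "B", "C"]
ratings = [{("B", "A"): 0.8, ("C", "A"): 0.6, ("A", "B"): 0.3,
            ("A", "C"): 0.2, ("B", "C"): 0.1, ("C", "B"): 0.4}]
scores = agent_importance(ratings, agents)
team = select_team(scores, k=2)  # highest-contributing agents survive
```

The key property is that the metric is unsupervised: it is computed from the agents' own peer ratings in a preliminary trial, with no labeled data.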

Experimental Results

DyLAN achieves superior accuracy and efficiency in reasoning and code generation tasks compared to baseline methods. On the MATH dataset, it reaches up to 35.7% overall accuracy, a significant improvement over single-execution baselines. On general reasoning with MMLU, DyLAN outperforms other approaches by margins of up to 4.1%. In code generation, DyLAN attains 82.9% Pass@1, underscoring its proficiency on complex generative tasks.
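For context on the code-generation number, Pass@1 is commonly computed with the unbiased pass@k estimator from the HumanEval evaluation: given n samples per problem of which c pass the unit tests, pass@k = 1 - C(n-c, k)/C(n, k). A minimal sketch, assuming this standard estimator is the one used:

```python
# Unbiased pass@k estimator (Chen et al., HumanEval): probability that at
# least one of k drawn samples, out of n with c correct, passes the tests.
from math import comb

def pass_at_k(n, c, k):
    if n - c < k:
        return 1.0  # too few failures to fill k draws: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Per-problem pass@1 for 10 samples with 10, 5, and 0 correct completions;
# the benchmark score is the mean of these values over all problems.
per_problem = [pass_at_k(n=10, c=c, k=1) for c in (10, 5, 0)]
```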

Conclusion

DyLAN offers a framework for collaborative agent networks powered by LLMs, resolving inefficiencies of static architectures through dynamic interaction paradigms and optimized agent selection. It improves adaptability across diverse tasks, backed by empirical results in varied applications. Future work may integrate DyLAN with open-source models and extend it to domains such as software development and virtual interaction, further broadening its applicability.

Figure 3: DyLAN's robust performance across varying temperatures on MATH and HumanEval datasets.
