
Experiential Co-Learning of Software-Developing Agents (2312.17025v3)

Published 28 Dec 2023 in cs.CL, cs.AI, cs.LG, and cs.SE

Abstract: Recent advancements in LLMs have brought significant changes to various domains, especially through LLM-driven autonomous agents. A representative scenario is in software development, where LLM agents demonstrate efficient collaboration, task division, and assurance of software quality, markedly reducing the need for manual involvement. However, these agents frequently perform a variety of tasks independently, without benefiting from past experiences, which leads to repeated mistakes and inefficient attempts in multi-step task execution. To this end, we introduce Experiential Co-Learning, a novel LLM-agent learning framework in which instructor and assistant agents gather shortcut-oriented experiences from their historical trajectories and use these past experiences for future task execution. The extensive experiments demonstrate that the framework enables agents to tackle unseen software-developing tasks more effectively. We anticipate that our insights will guide LLM agents towards enhanced autonomy and contribute to their evolutionary growth in cooperative learning. The code and data are available at https://github.com/OpenBMB/ChatDev.

Overview of "Experiential Co-Learning of Software-Developing Agents"

The paper "Experiential Co-Learning of Software-Developing Agents" explores an innovative framework designed to overcome one of the persistent limitations of LLMs in the domain of autonomous agents: the lack of integration of past experiences into current task-solving processes. The research introduces Experiential Co-Learning (ECL), a multi-agent paradigm that enhances collaborative task solving by leveraging experiential knowledge. This approach involves instructor and assistant agents that engage in mutual reasoning through accumulated experiences, thereby enhancing the efficiency and effectiveness of tackling complex tasks, particularly in software development.

Core Components of Experiential Co-Learning

The ECL framework is structured around three integral modules: co-tracking, co-memorizing, and co-reasoning. Each of these modules plays a crucial role in refining the task-solving capabilities of language agents by building upon previous experiences.

  1. Co-Tracking Module: This module focuses on creating procedural trajectories for various training tasks. By engaging in interactive rehearsals, instructor and assistant agents collaboratively explore and document their interactions, which form the backbone of their historical knowledge base.
  2. Co-Memorizing Module: This component is dedicated to extracting "shortcuts" from historical trajectories. These shortcuts represent efficient solutions derived from external feedback, which are then stored in the agents' experience pools. By interleaving these experiential insights, the module facilitates the development of improved reasoning strategies.
  3. Co-Reasoning Module: Utilizing the collective experience pools, this module enhances the collaborative interaction between agents when confronted with new tasks. By revisiting past experiences, the agents can provide more refined instructions and responses, thereby improving their overall problem-solving effectiveness.
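The interplay of the three modules can be sketched in a few lines of code. This is a conceptual illustration only: the class names, the reward threshold, and the toy word-overlap retriever are all assumptions for exposition, not the paper's actual implementation (which uses LLM agents and learned retrieval).

```python
# Conceptual sketch of the ECL pipeline: co-tracking produces trajectories,
# co-memorizing distills high-reward steps into "shortcuts", and co-reasoning
# retrieves them for new tasks. All names here are illustrative, not the
# paper's actual API.
from dataclasses import dataclass, field

@dataclass
class Step:
    instruction: str   # instructor's message
    response: str      # assistant's message (e.g., a code draft)
    reward: float      # external feedback score for this step

@dataclass
class ExperiencePool:
    shortcuts: list = field(default_factory=list)

    def memorize(self, trajectory, threshold=0.8):
        """Co-memorizing: keep only high-reward (instruction, response)
        pairs from a trajectory as reusable shortcuts."""
        for step in trajectory:
            if step.reward >= threshold:
                self.shortcuts.append((step.instruction, step.response))

    def retrieve(self, query):
        """Co-reasoning: return the shortcut whose instruction overlaps most
        with the new task (toy word-overlap similarity standing in for a
        real retriever)."""
        def overlap(text):
            return len(set(query.lower().split()) & set(text.lower().split()))
        if not self.shortcuts:
            return None
        return max(self.shortcuts, key=lambda s: overlap(s[0]))

# Co-tracking: a rehearsed trajectory recorded from a past training task.
trajectory = [
    Step("write a CLI todo app in Python", "import argparse ...", reward=0.9),
    Step("add unit tests", "def test_add(): ...", reward=0.4),
]

pool = ExperiencePool()
pool.memorize(trajectory)                              # keeps only the 0.9 step
hint = pool.retrieve("build a Python CLI calculator")  # reused on a new task
```

In this toy form, only the high-reward step survives into the pool, and the retrieved shortcut would then be injected into the agents' prompts when they face a related unseen task.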

Methodological Insights

The methodology of the paper is grounded in the application of LLMs within autonomous agents, where these agents operate in specific roles during task execution. The ECL framework capitalizes on the inherent strengths of LLMs, such as their contextual understanding and pattern recognition capabilities, while addressing their shortcomings in experience retention and application. Through the structured decomposition of tasks and interactive instruction-response dynamics, the ECL framework enhances the agents' autonomy, thereby minimizing reliance on human intervention.
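The instruction-response dynamic described above can be pictured as a simple loop in which each turn is conditioned on a retrieved past experience. The loop structure, stub agents, and `retrieve` callable below are hypothetical stand-ins for the paper's LLM-driven instructor and assistant, not its actual interface.

```python
# A minimal sketch of the co-reasoning interaction loop: instructor and
# assistant alternate turns, each conditioned on a retrieved shortcut when
# one is available. The agent and retriever interfaces are assumptions for
# illustration, not the paper's API.
def co_reason(task, instructor, assistant, retrieve, max_turns=5):
    context = []
    for _ in range(max_turns):
        hint = retrieve(task)                         # past shortcut, may be None
        instruction = instructor(task, context, hint)
        if instruction is None:                       # instructor ends the task
            break
        response = assistant(instruction, context, hint)
        context.append((instruction, response))
    return context

# Stub agents standing in for LLM calls.
def toy_instructor(task, context, hint):
    steps = ["outline the program", "write the code", "review the code"]
    return steps[len(context)] if len(context) < len(steps) else None

def toy_assistant(instruction, context, hint):
    suffix = f" (guided by: {hint})" if hint else ""
    return f"done: {instruction}{suffix}"

dialogue = co_reason("build a calculator", toy_instructor, toy_assistant,
                     retrieve=lambda task: "reuse argparse skeleton")
```

The key point the sketch conveys is that retrieved experience enters on every turn, shaping both the instructions issued and the responses produced, rather than being consulted once at the start.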

Evaluation and Results

The paper presents an empirical evaluation on the NLDD dataset, a curated collection of natural-language-to-software generation challenges. The experimental results underscore the substantial improvements in task-solving autonomy achieved by the ECL framework, which outperforms existing systems such as GPT-Engineer, MetaGPT, and ChatDev. Specifically, the ECL framework demonstrates notable gains on key dimensions including task completeness, executability, and consistency with the natural language requirements. The autonomy metric, a holistic measure combining these factors, highlights the framework's capacity to handle complex software development tasks with reduced manual oversight.

Implications and Future Prospects

The implications of the ECL framework are multifaceted, extending beyond its immediate application in software development. The integration of experiential co-learning offers a promising avenue for enhancing the generalization capabilities of autonomous agents across various domains, facilitating the development of more adaptive and resilient AI systems. From a theoretical standpoint, the paper contributes to the burgeoning discourse on the role of experiential learning in AI, presenting a paradigm that aligns with human-like learning processes.

Looking forward, the research points to several potential avenues for further development, including the refinement of heuristic reward design in co-memorizing modules, the exploration of more nuanced consistency metrics, and the expansion of experiential learning to more diverse and complex task environments. As LLMs and AI systems continue to evolve, the concepts and methodologies elucidated in this paper may serve as foundational building blocks for future advancements in the field of autonomous agents and beyond.

Authors (12)
  1. Chen Qian
  2. Yufan Dang
  3. Jiahao Li
  4. Wei Liu
  5. Weize Chen
  6. Cheng Yang
  7. Zhiyuan Liu
  8. Maosong Sun
  9. Zihao Xie
  10. Yifei Wang
  11. Xin Cong
  12. Xiaoyin Che