
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning (2401.05268v4)

Published 10 Jan 2024 in cs.CL, cs.AI, cs.HC, cs.LG, and cs.MA

Abstract: Language agents have achieved considerable performance on various complex question-answering tasks by planning with external tools. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model for multiple functions. To this end, we introduce AutoAct, an automatic agent learning framework for QA that does not rely on large-scale annotated data and synthetic planning trajectories from closed-source models (e.g., GPT-4). Given limited data with a tool library, AutoAct first automatically synthesizes planning trajectories without any assistance from humans or strong closed-source models. Then, AutoAct leverages a division-of-labor strategy to automatically differentiate based on the target task information and synthesized trajectories, producing a sub-agent group to complete the task. We conduct comprehensive experiments with different LLMs, which demonstrates that AutoAct yields better or parallel performance compared to various strong baselines. Further analysis demonstrates the effectiveness of the division-of-labor strategy, with the trajectory quality generated by AutoAct generally outperforming that of others. Code will be available at https://github.com/zjunlp/AutoAct.

Introduction

Language agents are AI systems built on LLMs that perform a variety of complex tasks by interpreting and interacting with external information. These agents have advanced significantly through the ability to understand tasks, generate plans, use external tools, and learn from past experiences. However, many existing agent learning systems depend on extensive annotated datasets or on synthetic data generated by proprietary models such as GPT-4. In addition, agent frameworks often place excessive pressure on a single model to master multiple functions, in contrast to the division-of-labor principle rooted in Herbert Simon's theory of bounded rationality (Mintrom, 2015).

AutoAct Framework

To address these issues, the authors developed AutoAct, an agent learning framework that autonomously learns to plan and complete tasks without relying on large annotated datasets or proprietary models; it requires only a limited amount of initial data provided by users. At the core of AutoAct is its Meta-Agent, which can differentiate into a group of sub-agents, each specializing in a specific function: task decomposition, tool invocation, or self-reflection.
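The interaction between these specialized sub-agents can be pictured as a simple group-planning loop. The sketch below is a toy illustration, not the paper's implementation: `plan_agent`, `tool_agent`, and `reflect_agent` are stand-ins for the fine-tuned sub-agents, and the one-entry tool library is invented for the example.

```python
# Hypothetical sketch of AutoAct-style group planning: a plan sub-agent
# proposes the next action, a tool sub-agent executes tool calls, and a
# reflect sub-agent verifies the final answer. All names are illustrative.

def plan_agent(question, history):
    # Stand-in for the task-decomposition sub-agent: decide the next action.
    if not history:
        return ("Retrieve", question)       # first step: look something up
    return ("Finish", history[-1][1])       # otherwise answer with last observation

def tool_agent(action, arg, tools):
    # Stand-in for the tool-invocation sub-agent: fill parameters, call the tool.
    return tools[action](arg)

def reflect_agent(question, answer):
    # Stand-in for the self-reflection sub-agent: accept or reject the answer.
    return answer is not None and len(answer) > 0

def group_planning(question, tools, max_steps=5):
    history = []
    for _ in range(max_steps):
        action, arg = plan_agent(question, history)
        if action == "Finish":
            if reflect_agent(question, arg):
                return arg
            history.append(("Reflect", "answer rejected, retry"))
            continue
        observation = tool_agent(action, arg, tools)
        history.append((action, observation))
    return None

# Toy tool library standing in for, e.g., a retrieval tool.
tools = {"Retrieve": lambda q: "Paris" if "France" in q else "unknown"}
print(group_planning("What is the capital of France?", tools))  # -> Paris
```

The point of the structure is that each decision (what to do, how to call the tool, whether the answer holds up) is made by a different specialist rather than one overloaded model.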

AutoAct begins with a self-instruction stage, in which the Meta-Agent expands a database of task examples from a handful of seed examples. Then, equipped with a tool library, it autonomously synthesizes planning trajectories. Finally, it differentiates into sub-agents optimized for specific parts of the planning process, a procedure that is both resource-efficient and adaptable to varied task scenarios. This division-of-labor strategy enhances the overall capability of the agent system to address complex tasks.
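The first two stages can be sketched as two small loops. This is a minimal illustration under stated assumptions: the `generate_question` and `run_agent` callables stand in for Meta-Agent LLM calls, and the filtering rule (keep only trajectories whose final answer matches a reference) is one plausible quality check, not necessarily the paper's exact criterion.

```python
# Illustrative sketch of the self-instruction and trajectory-synthesis stages.
# The generate/run callables are stand-ins for Meta-Agent LLM calls.
import random

def self_instruct(seed_examples, target_size, generate_question):
    """Expand a handful of seed questions into a larger task database."""
    database = list(seed_examples)
    while len(database) < target_size:
        demos = random.sample(database, min(2, len(database)))
        candidate = generate_question(demos)
        if candidate not in database:        # simple de-duplication
            database.append(candidate)
    return database

def synthesize_trajectories(database, run_agent, reference_answers):
    """Run the agent on each question; keep only correct trajectories."""
    kept = []
    for question in database:
        trajectory, answer = run_agent(question)
        if reference_answers.get(question) == answer:
            kept.append((question, trajectory))
    return kept

# Toy stand-ins for the Meta-Agent calls.
seeds = ["Q1", "Q2"]
gen = lambda demos: "Q" + str(random.randint(3, 20))
db = self_instruct(seeds, 6, gen)
refs = {q: "A" for q in db}
run = lambda q: (["Thought", "Action", "Observation"], "A")
print(len(synthesize_trajectories(db, run, refs)))  # -> 6
```

Only the trajectories that survive the filter are used downstream, which is how the framework keeps training data quality high without human annotation.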

Comparative Performance

Experimental assessments show that AutoAct performs comparably to, or even surpasses, several strong baselines across different LLMs. One notable result is that the framework, when paired with the Llama-2-13b model, achieved performance comparable to that of a GPT-3.5-Turbo agent. This demonstrates the efficiency and effectiveness of AutoAct in elevating open-source models toward the performance of their closed-source counterparts.

The evaluation also covered multiple agent-learning methodologies and compared them with various prompt-based agents. AutoAct displayed impressive results, often outperforming agent-learning frameworks that emphasize iterative planning or chain-of-thought reasoning.

Findings and Contributions

The success of AutoAct can be attributed to several critical features. First, it removes the need for heavily annotated datasets and closed-source model trajectories by automating the generation of its own training data. Second, by dividing labor among a group of specialized agents, it overcomes the limitations of overburdening a single agent with every planning function. The empirical analysis highlights that AutoAct generates high-quality planning trajectories and displays robust performance across various tasks.
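The division-of-labor idea amounts to routing each step of a synthesized trajectory to the sub-agent responsible for it, so that each specialist is fine-tuned only on its own slice of the data. The sketch below is a hedged illustration: the step types and field names are invented for the example, not taken from the paper's data format.

```python
# Hypothetical sketch of splitting one ReAct-style trajectory into
# role-specific training examples for the three sub-agents.

def split_trajectory(question, steps):
    """Route each trajectory step to the sub-agent that should learn from it."""
    per_agent = {"plan": [], "tool": [], "reflect": []}
    for step in steps:
        if step["type"] in ("thought", "action"):
            per_agent["plan"].append((question, step))      # what to do next
        elif step["type"] == "tool_call":
            per_agent["tool"].append((question, step))      # how to invoke tools
        elif step["type"] == "reflection":
            per_agent["reflect"].append((question, step))   # check the result
    return per_agent

steps = [
    {"type": "thought", "text": "I should search for X."},
    {"type": "tool_call", "text": "Retrieve[X]"},
    {"type": "reflection", "text": "The answer looks correct."},
]
data = split_trajectory("Who wrote X?", steps)
print({k: len(v) for k, v in data.items()})  # -> {'plan': 1, 'tool': 1, 'reflect': 1}
```

Each slice would then feed a separate parameter-efficient fine-tuning run (the paper's reference list includes LoRA), so no single model has to absorb all three skills at once.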

The paper's main contributions lie in proposing an automatic agent learning framework that adheres to the principle of bounded rationality, showing the capability of using different LLMs to achieve outstanding performance, and revealing the effectiveness of a division-of-labor strategy within the agent learning domain.

Conclusion

AutoAct is a notable development in the field of agent learning frameworks. Its design respects the limitations of individual agents and harnesses their combined strengths to handle complex tasks more efficiently. By enabling agents to learn effectively without vast amounts of training data or closed-source models, AutoAct represents a significant step forward in the ongoing development of smarter, more autonomous AI agents.

References (59)
  1. Rest meets react: Self-improvement for multi-step reasoning llm agent.
  2. Fireact: Toward language agent fine-tuning. CoRR, abs/2310.05915.
  3. Reconcile: Round-table conference improves reasoning via consensus among diverse llms. CoRR, abs/2309.13007.
  4. Agentverse: Facilitating multi-agent collaboration and exploring emergent behaviors in agents. CoRR, abs/2308.10848.
  5. Alan Colman. 2008. Human embryonic stem cells and clinical applications. Cell Research, 18(1):S171–S171.
  6. Specializing smaller language models towards multi-step reasoning. In International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pages 10421–10430. PMLR.
  7. C. A. E. Goodhart. 1984. Problems of Monetary Management: The UK Experience, pages 91–121. Macmillan Education UK, London.
  8. Reinforced self-training (rest) for language modeling. CoRR, abs/2308.08998.
  9. Lora: Low-rank adaptation of large language models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net.
  10. Large language models can self-improve. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, pages 1051–1068. Association for Computational Linguistics.
  11. Do as I can, not as I say: Grounding language in robotic affordances. In Conference on Robot Learning, CoRL 2022, 14-18 December 2022, Auckland, New Zealand, volume 205 of Proceedings of Machine Learning Research, pages 287–318. PMLR.
  12. Self-alignment with instruction backtranslation. CoRR, abs/2308.06259.
  13. Encouraging divergent thinking in large language models through multi-agent debate. CoRR, abs/2305.19118.
  14. Generated knowledge prompting for commonsense reasoning. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 3154–3169. Association for Computational Linguistics.
  15. BOLAA: benchmarking and orchestrating llm-augmented autonomous agents. CoRR, abs/2308.05960.
  16. Ilya Loshchilov and Frank Hutter. 2019. Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net.
  17. Learn to explain: Multimodal reasoning via thought chains for science question answering. In NeurIPS.
  18. Chameleon: Plug-and-play compositional reasoning with large language models. CoRR, abs/2304.09842.
  19. Self-refine: Iterative refinement with self-feedback. CoRR, abs/2303.17651.
  20. Editing personality for llms. CoRR, abs/2310.02168.
  21. Michael Mintrom. 2015. Herbert A. Simon, Administrative Behavior: A Study of Decision-Making Processes in Administrative Organization. In The Oxford Handbook of Classics in Public Policy and Administration. Oxford University Press.
  22. Yohei Nakajima. 2023. Babyagi. https://github.com/yoheinakajima/babyagi.
  23. OpenAI. 2022. Chatgpt: Optimizing language models for dialogue. https://openai.com/blog/chatgpt/.
  24. OpenAI. 2023. GPT-4 technical report. CoRR, abs/2303.08774.
  25. Anton Osika. 2023. Gpt-engineer. https://github.com/AntonOsika/gpt-engineer.
  26. Generative agents: Interactive simulacra of human behavior. In Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, UIST 2023, San Francisco, CA, USA, 29 October 2023- 1 November 2023, pages 2:1–2:22. ACM.
  27. Gorilla: Large language model connected with massive apis. CoRR, abs/2305.15334.
  28. Virtualhome: Simulating household activities via programs. In 2018 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2018, Salt Lake City, UT, USA, June 18-22, 2018, pages 8494–8502. Computer Vision Foundation / IEEE Computer Society.
  29. Making language models better tool learners with execution feedback. CoRR, abs/2305.13068.
  30. Reasoning with language model prompting: A survey. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 5368–5393. Association for Computational Linguistics.
  31. Toolllm: Facilitating large language models to master 16000+ real-world apis. CoRR, abs/2307.16789.
  32. Deepspeed: System optimizations enable training deep learning models with over 100 billion parameters. In KDD ’20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, August 23-27, 2020, pages 3505–3506. ACM.
  33. Hugginggpt: Solving AI tasks with chatgpt and its friends in huggingface. CoRR, abs/2303.17580.
  34. Reflexion: language agents with verbal reinforcement learning. CoRR, abs/2303.11366.
  35. Alfworld: Aligning text and embodied environments for interactive learning. In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net.
  36. Llm-planner: Few-shot grounded planning for embodied agents with large language models. CoRR, abs/2212.04088.
  37. Medagents: Large language models as collaborators for zero-shot medical reasoning. CoRR, abs/2311.10537.
  38. Stanford alpaca: An instruction-following llama model. https://github.com/tatsu-lab/stanford_alpaca.
  39. Torantulino. 2023. Autogpt: build & use ai agents. https://github.com/Significant-Gravitas.
  40. Llama 2: Open foundation and fine-tuned chat models. CoRR, abs/2307.09288.
  41. Voyager: An open-ended embodied agent with large language models. CoRR, abs/2305.16291.
  42. A survey on large language model based autonomous agents. CoRR, abs/2308.11432.
  43. Self-instruct: Aligning language models with self-generated instructions. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 13484–13508. Association for Computational Linguistics.
  44. Chain-of-thought prompting elicits reasoning in large language models. In NeurIPS.
  45. The rise and potential of large language model based agents: A survey. CoRR, abs/2309.07864.
  46. Wizardlm: Empowering large language models to follow complex instructions. CoRR, abs/2304.12244.
  47. Hotpotqa: A dataset for diverse, explainable multi-hop question answering. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31 - November 4, 2018, pages 2369–2380. Association for Computational Linguistics.
  48. Webshop: Towards scalable real-world web interaction with grounded language agents. In NeurIPS.
  49. React: Synergizing reasoning and acting in language models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net.
  50. Lumos: Learning agents with unified data, modular design, and open-source llms. CoRR, abs/2311.05657.
  51. Large language model as attributed training data generator: A tale of diversity and bias. CoRR, abs/2306.15895.
  52. Star: Bootstrapping reasoning with reasoning. In NeurIPS.
  53. Agenttuning: Enabling generalized agent abilities for llms. CoRR, abs/2310.12823.
  54. Exploring collaboration mechanisms for LLM agents: A social psychology view. CoRR, abs/2310.02124.
  55. Igniting language intelligence: The hitchhiker’s guide from chain-of-thought reasoning to language agents. CoRR, abs/2311.11797.
  56. Judging llm-as-a-judge with mt-bench and chatbot arena.
  57. Least-to-most prompting enables complex reasoning in large language models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net.
  58. Webarena: A realistic web environment for building autonomous agents. CoRR, abs/2307.13854.
  59. Agents: An open-source framework for autonomous language agents. CoRR, abs/2309.07870.
Authors (8)
  1. Shuofei Qiao (19 papers)
  2. Ningyu Zhang (148 papers)
  3. Runnan Fang (8 papers)
  4. Yujie Luo (28 papers)
  5. Wangchunshu Zhou (73 papers)
  6. Yuchen Eleanor Jiang (19 papers)
  7. Chengfei Lv (22 papers)
  8. Huajun Chen (198 papers)
Citations (18)