Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning (2403.19962v1)

Published 29 Mar 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Open-source pre-trained LLMs exhibit strong language understanding and generation capabilities, making them highly successful on a variety of tasks. However, when used as agents to tackle complex real-world problems, their performance is far inferior to that of large commercial models such as ChatGPT and GPT-4. To perform well as intelligent agents, LLMs need task-planning and long-term-memory capabilities, as well as the ability to leverage external tools. Various methods have been proposed to enhance the agent capabilities of LLMs: some construct agent-specific data and fine-tune the models, while others design prompts that effectively activate the models' reasoning abilities. We explore both strategies on 7B and 13B models. We propose a comprehensive method for constructing agent-specific data using GPT-4. Through supervised fine-tuning on the constructed data, we find that for these relatively low-parameter models, supervised fine-tuning significantly reduces hallucinated outputs and formatting errors in agent tasks. Furthermore, techniques such as multi-path reasoning and task decomposition effectively reduce problem complexity and enhance the performance of LLMs as agents. We evaluate our method on five agent tasks from AgentBench and achieve satisfactory results.
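
The multi-path reasoning mentioned in the abstract can be illustrated with a small sketch: sample several reasoning branches from the model at non-zero temperature, parse the action each branch proposes, and keep the action the majority of branches agree on, in the spirit of self-consistency voting. This is an assumption-laden illustration, not the paper's exact algorithm; call_llm, extract_action, and the "Action:"-prefixed output format are hypothetical placeholders.

import random
from collections import Counter

def call_llm(prompt: str, temperature: float = 0.8) -> str:
    # Hypothetical stand-in for a real model call: a real implementation would
    # query a 7B/13B model. Canned outputs keep the sketch runnable end to end.
    return random.choice([
        "Thought: I should look for laptops first.\nAction: search[laptop]",
        "Thought: Searching is the right first step.\nAction: search[laptop]",
        "Thought: Maybe buy immediately.\nAction: click[buy now]",
    ])

def extract_action(branch_output: str) -> str:
    # Pull the proposed agent action out of one reasoning branch.
    for line in branch_output.splitlines():
        if line.startswith("Action:"):
            return line[len("Action:"):].strip()
    return ""

def multi_branch_action(prompt: str, n_branches: int = 5) -> str:
    # Sample several branches and keep the most frequently proposed action.
    actions = [extract_action(call_llm(prompt)) for _ in range(n_branches)]
    actions = [a for a in actions if a]  # drop branches with no parseable action
    return Counter(actions).most_common(1)[0][0] if actions else ""

if __name__ == "__main__":
    print(multi_branch_action("You are shopping for a laptop. What should you do next?"))

Task decomposition would typically wrap a routine like this, running the vote once per sub-task rather than once over the whole problem.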

References (36)
  1. Language models are few-shot learners. Advances in neural information processing systems, 33:1877–1901.
  2. Reconcile: Round-table conference improves reasoning via consensus among diverse llms. arXiv preprint arXiv:2309.13007.
  3. A survey for in-context learning. arXiv preprint arXiv:2301.00234.
  4. Significant Gravitas. 2023. Auto-gpt: An autonomous gpt-4 experiment.
  5. Reasoning with language model is planning with world model. arXiv preprint arXiv:2305.14992.
  6. Lora: Low-rank adaptation of large language models. arXiv preprint arXiv:2106.09685.
  7. Language models can solve computer tasks. arXiv preprint arXiv:2303.17491.
  8. Agentbench: Evaluating llms as agents. arXiv preprint arXiv:2308.03688.
  9. Self-refine: Iterative refinement with self-feedback. arXiv preprint arXiv:2303.17651.
  10. OpenAI. 2022. Introducing chatgpt.
  11. OpenAI. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774.
  12. Anton Osika et al. 2023. Gpt engineer.
  13. Instruction tuning with gpt-4. arXiv preprint arXiv:2304.03277.
  14. Knowledge enhanced contextual word representations. arXiv preprint arXiv:1909.04164.
  15. Autoact: Automatic agent learning from scratch via self-planning. arXiv preprint arXiv:2401.05268.
  16. Toolllm: Facilitating large language models to master 16000+ real-world apis. arXiv preprint arXiv:2307.16789.
  17. Multitask prompted training enables zero-shot task generalization. arXiv preprint arXiv:2110.08207.
  18. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems.
  19. Alfworld: Aligning text and embodied environments for interactive learning. arXiv preprint arXiv:2010.03768.
  20. A survey on large language model based autonomous agents. arXiv preprint arXiv:2308.11432.
  21. Self-consistency improves chain of thought reasoning in language models. arXiv preprint arXiv:2203.11171.
  22. Finetuned language models are zero-shot learners. arXiv preprint arXiv:2109.01652.
  23. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 35:24824–24837.
  24. Plan, eliminate, and track–language models are good teachers for embodied agents. arXiv preprint arXiv:2305.02412.
  25. The rise and potential of large language model based agents: A survey. arXiv preprint arXiv:2309.07864.
  26. Self-polish: Enhance reasoning in large language models via problem refinement. arXiv preprint arXiv:2305.14497.
  27. Rewoo: Decoupling reasoning from observations for efficient augmented language models. arXiv preprint arXiv:2305.18323.
  28. Webshop: Towards scalable real-world web interaction with grounded language agents. Advances in Neural Information Processing Systems, 35:20744–20757.
  29. Tree of thoughts: Deliberate problem solving with large language models. arXiv preprint arXiv:2305.10601.
  30. React: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629.
  31. Agenttuning: Enabling generalized agent abilities for llms. arXiv preprint arXiv:2310.12823.
  32. Instruction tuning for large language models: A survey. arXiv preprint arXiv:2308.10792.
  33. Siren’s song in the ai ocean: A survey on hallucination in large language models. arXiv preprint arXiv:2309.01219.
  34. Multimodal chain-of-thought reasoning in language models. arXiv preprint arXiv:2302.00923.
  35. Language agent tree search unifies reasoning acting and planning in language models. arXiv preprint arXiv:2310.04406.
  36. Ghost in the minecraft: Generally capable agents for open-world environments via large language models with text-based knowledge and memory. arXiv preprint arXiv:2305.17144.
Authors (6)
  1. Qinhao Zhou (4 papers)
  2. Zihan Zhang (120 papers)
  3. Xiang Xiang (22 papers)
  4. Ke Wang (529 papers)
  5. Yuchuan Wu (33 papers)
  6. Yongbin Li (128 papers)
Citations (5)