
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models (2309.00986v1)

Published 2 Sep 2023 in cs.CL

Abstract: LLMs have recently demonstrated remarkable capabilities to comprehend human intentions, engage in reasoning, and design planning-like behavior. To further unleash the power of LLMs to accomplish complex tasks, there is a growing trend to build agent framework that equips LLMs, such as ChatGPT, with tool-use abilities to connect with massive external APIs. In this work, we introduce ModelScope-Agent, a general and customizable agent framework for real-world applications, based on open-source LLMs as controllers. It provides a user-friendly system library, with customizable engine design to support model training on multiple open-source LLMs, while also enabling seamless integration with both model APIs and common APIs in a unified way. To equip the LLMs with tool-use abilities, a comprehensive framework has been proposed spanning over tool-use data collection, tool retrieval, tool registration, memory control, customized model training, and evaluation for practical real-world applications. Finally, we showcase ModelScopeGPT, a real-world intelligent assistant of ModelScope Community based on the ModelScope-Agent framework, which is able to connect open-source LLMs with more than 1000 public AI models and localized community knowledge in ModelScope. The ModelScope-Agent library (https://github.com/modelscope/modelscope-agent) and online demo (https://modelscope.cn/studios/damo/ModelScopeGPT/summary) are now publicly available.

Overview of ModelScope-Agent Framework

The paper introduces ModelScope-Agent, an adaptable agent framework leveraging open-source LLMs as controllers to build real-world applications. It addresses the need to enhance LLMs like ChatGPT with tool-use abilities, facilitating interaction with numerous external APIs. ModelScope-Agent supports model training on diverse open-source LLMs and integrates these models with APIs, enabling them to perform complex tasks by using tools.

Technical Contributions

ModelScope-Agent provides a user-friendly system library with a customizable engine design. It integrates with both model APIs and common APIs seamlessly. The framework features several essential components:

  • Tool-Use Framework: Consists of data collection, retrieval, registration, memory control, customized model training, and evaluation processes.
  • Comprehensive API Integration: Includes more than 1000 public AI models and localized community knowledge.
  • Open-Source Accessibility: The library and a demonstrative online platform, ModelScopeGPT, are publicly available for community use.
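To make the tool retrieval step in the list above concrete, here is a minimal sketch that ranks tools by how well their descriptions match a user query. It is not the framework's implementation: the function names, the example tool descriptions, and the bag-of-words cosine similarity are all assumptions chosen to keep the example self-contained. A real system would more plausibly use a learned text-embedding model.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    # Lowercase and split on non-alphanumeric characters.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_tools(query: str, tool_descriptions: Dict[str, str], top_k: int = 2) -> List[str]:
    # Score every tool description against the query and keep the best matches.
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(desc))), name)
        for name, desc in tool_descriptions.items()
    ]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_k] if score > 0]


# Hypothetical tool catalogue, for illustration only.
tools = {
    "text-to-speech": "convert text into spoken audio",
    "image-caption": "describe the content of an image",
    "translation": "translate text between languages",
}
selected = retrieve_tools("describe this image", tools, top_k=1)
```

Only the retrieved tools would then be placed in the prompt, which keeps the context short when the catalogue contains many APIs.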

Model Architecture

The architecture revolves around open-source LLMs such as LLaMA and ChatGLM, which act as central controllers. ModelScope-Agent includes:

  • Tool-Use Module: Configures and manages API collections, supporting diverse model and common APIs across NLP, CV, and Audio domains.
  • Memory Module: Manages system messages, user history, and contextual information.
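The division of labor between the two modules can be sketched as follows. This is an illustrative sketch only: `Tool`, `ToolRegistry`, `Memory`, and their methods are hypothetical names invented for this example and do not reflect the actual ModelScope-Agent library API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Tool:
    name: str
    description: str
    func: Callable[..., str]


@dataclass
class ToolRegistry:
    # Hypothetical stand-in for the tool-use module: stores and dispatches tools.
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def call(self, name: str, **kwargs) -> str:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name].func(**kwargs)


@dataclass
class Memory:
    # Hypothetical stand-in for the memory module: system message plus recent history.
    system_message: str
    max_turns: int = 4
    history: List[dict] = field(default_factory=list)

    def add(self, role: str, content: str) -> None:
        self.history.append({"role": role, "content": content})
        # Keep only the most recent turns so the prompt stays bounded.
        self.history = self.history[-self.max_turns:]

    def build_prompt(self, registry: ToolRegistry) -> str:
        tool_lines = [f"- {t.name}: {t.description}" for t in registry.tools.values()]
        turns = [f"{m['role']}: {m['content']}" for m in self.history]
        return "\n".join([self.system_message, "Tools:", *tool_lines, *turns])


registry = ToolRegistry()
registry.register(Tool("echo", "Repeat the given text.", lambda text: text))

memory = Memory("You are a helpful agent.")
memory.add("user", "Say hi")
prompt = memory.build_prompt(registry)
result = registry.call("echo", text="hi")
```

In this sketch the controller LLM would receive `prompt`, decide which tool to call, and the result would be appended to memory as a new turn.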

Training and Dataset

The paper introduces MSAgent-Bench, a tool-enhanced dataset containing 598k dialogues in English and Chinese. It covers a wide array of scenarios to train the models for accurate API usage and response generation. The Weighted LM training strategy focuses on improving API call precision.
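One plausible reading of a weighted language-modeling objective is a per-token negative log-likelihood in which tokens belonging to the API call count more than ordinary text. The sketch below illustrates that idea only. The token categories, the weight values, and the function name are assumptions made for this example, not the paper's exact formulation.

```python
import math
from typing import Dict, Sequence


def weighted_lm_loss(
    token_log_probs: Sequence[float],
    token_kinds: Sequence[str],
    weights: Dict[str, float],
) -> float:
    # Weighted mean negative log-likelihood over the target tokens.
    total = 0.0
    norm = 0.0
    for log_prob, kind in zip(token_log_probs, token_kinds):
        w = weights.get(kind, 1.0)  # unknown kinds default to weight 1
        total += -log_prob * w
        norm += w
    return total / norm if norm > 0 else 0.0


# Assumed weighting: ignore prompt tokens, upweight API-call tokens.
weights = {"prompt": 0.0, "api": 2.0, "text": 1.0}
log_probs = [math.log(0.5), math.log(0.5), math.log(0.25)]
kinds = ["prompt", "api", "text"]
loss = weighted_lm_loss(log_probs, kinds, weights)
```

With these assumed weights, an error on an API token costs twice as much as an error on a plain response token, which pushes training toward precise API names and arguments.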

Evaluation

ModelScope-Agent includes both automated and human evaluation frameworks:

  • Automated Evaluation: Measures the accuracy of API requests with metrics like Action EM and Argument F1.
  • Human Evaluation: Through Agent Arena, users can compare agents' performances in handling API-based tasks.
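The two automated metrics named above can be sketched as follows, assuming Action EM is an exact match on the chosen API name and Argument F1 is computed over argument name/value pairs. The paper may normalize or tokenize arguments differently, so treat the function names and the matching rules here as assumptions.

```python
from typing import Dict, List


def action_em(predicted: List[str], reference: List[str]) -> float:
    # Fraction of steps where the predicted API name equals the reference exactly.
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


def argument_f1(predicted: Dict[str, str], reference: Dict[str, str]) -> float:
    # F1 over (argument name, value) pairs of a single API request.
    pred_items = set(predicted.items())
    ref_items = set(reference.items())
    if not pred_items and not ref_items:
        return 1.0  # nothing expected, nothing produced
    overlap = len(pred_items & ref_items)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_items)
    recall = overlap / len(ref_items)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predictions and references, for illustration only.
em = action_em(["weather", "search"], ["weather", "translate"])
f1 = argument_f1({"city": "Paris", "unit": "C"}, {"city": "Paris"})
```

Here one of two API choices matches, and the predicted request contains one correct argument plus one spurious argument, so precision is penalized while recall is perfect.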

Numerical Results

In the automated evaluation, the trained models score well on the API-call metrics. Qwen performed best among the evaluated models, which suggests that fine-tuned open-source models can execute API calls accurately.

Implications and Future Directions

The ModelScope-Agent framework lets open-source LLMs act as customizable, tool-using agents in practical applications. Its modular design and the MSAgent-Bench dataset support the integration of diverse tools.

The future trajectory of this research could involve refining tool-use strategies and enhancing multilingual capabilities. Open-source collaboration may lead to more versatile and robust agent systems, addressing complex real-world challenges through efficient AI-API integration.

Authors (14)
  1. Chenliang Li (92 papers)
  2. Hehong Chen (10 papers)
  3. Ming Yan (190 papers)
  4. Weizhou Shen (18 papers)
  5. Haiyang Xu (67 papers)
  6. Zhikai Wu (3 papers)
  7. Zhicheng Zhang (76 papers)
  8. Wenmeng Zhou (14 papers)
  9. Yingda Chen (13 papers)
  10. Chen Cheng (91 papers)
  11. Hongzhu Shi (2 papers)
  12. Ji Zhang (176 papers)
  13. Fei Huang (408 papers)
  14. Jingren Zhou (198 papers)