TaskWeaver: A Code-First Agent Framework (2311.17541v3)

Published 29 Nov 2023 in cs.AI

Abstract: LLMs have shown impressive abilities in natural language understanding and generation, leading to their widespread use in applications such as chatbots and virtual assistants. However, existing LLM frameworks face limitations in handling domain-specific data analytics tasks with rich data structures. Moreover, they struggle with flexibility to meet diverse user requirements. To address these issues, TaskWeaver is proposed as a code-first framework for building LLM-powered autonomous agents. It converts user requests into executable code and treats user-defined plugins as callable functions. TaskWeaver provides support for rich data structures, flexible plugin usage, and dynamic plugin selection, and leverages LLM coding capabilities for complex logic. It also incorporates domain-specific knowledge through examples and ensures the secure execution of generated code. TaskWeaver offers a powerful and flexible framework for creating intelligent conversational agents that can handle complex tasks and adapt to domain-specific scenarios. The code is open sourced at https://github.com/microsoft/TaskWeaver/.

Summary

  • The paper introduces TaskWeaver, a framework that converts user requests into executable code by dynamically generating Python snippets for LLM-powered agents.
  • The paper outlines a three-part architecture—Planner, Code Generator, and Code Executor—that manages task planning, domain knowledge integration, and stateful execution.
  • The framework showcases its adaptability through case studies in anomaly detection and stock forecasting while ensuring secure, restricted code generation.

TaskWeaver: A Code-First Agent Framework

The paper introduces TaskWeaver, a code-first framework for creating LLM-powered autonomous agents, addressing notable limitations in existing frameworks. The key innovation lies in converting user requests into executable code and treating user-defined plugins as callable functions. This approach supports rich data structures, dynamic plugin selection and usage, and systematic incorporation of domain-specific knowledge, extending the capabilities of LLMs beyond conventional frameworks such as LangChain, Semantic Kernel, and AutoGen.

Motivation and Requirements

The motivation for TaskWeaver stems from significant gaps in pre-existing frameworks, particularly their inability to handle complex data structures, incorporate specific domain knowledge systematically, and offer sufficient flexibility for diverse user requirements. TaskWeaver aims to mitigate these challenges by:

  • Supporting rich data structures such as pandas DataFrame for advanced data processing.
  • Providing a systematic approach to embed domain-specific knowledge.
  • Enabling both plugins and code generation to meet varying user needs seamlessly (a minimal plugin sketch follows this list).
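
To make these requirements concrete, the sketch below shows how a user-defined plugin could be exposed as an ordinary Python function that consumes and returns a pandas DataFrame, so that generated code can call it directly. The registry decorator and the plugin itself are illustrative assumptions, not TaskWeaver's actual plugin API.

```python
import pandas as pd

# Hypothetical plugin registry; TaskWeaver's real plugin mechanism may differ.
PLUGINS = {}

def register_plugin(fn):
    """Make a user-defined function callable by name from generated code."""
    PLUGINS[fn.__name__] = fn
    return fn

@register_plugin
def anomaly_detection(df: pd.DataFrame, column: str, k: float = 3.0) -> pd.DataFrame:
    """Flag rows whose value deviates more than k standard deviations from the mean."""
    mean, std = df[column].mean(), df[column].std()
    out = df.copy()
    out["is_anomaly"] = (out[column] - mean).abs() > k * std
    return out
```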

Architecture and Design

Core Components

TaskWeaver's architecture comprises three primary components: the Planner, Code Generator (CG), and Code Executor (CE), orchestrated to create an effective loop of planning, code generation, and execution.

  1. Planner: Acts as the system's controller, breaking down user requests into manageable subtasks, managing their execution, and responding in natural language.
  2. Code Generator (CG): Generates Python code snippets tailored to user queries and integrates domain-specific plugins where necessary.
  3. Code Executor (CE): Executes the generated code, returning results and preserving execution context across conversation rounds.
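
The following sketch shows one way these three components could be wired into a plan-generate-execute loop. The class and method names are illustrative placeholders rather than TaskWeaver's internal interfaces; in the real system both the Planner and the Code Generator are driven by an LLM.

```python
# Illustrative orchestration loop; names and interfaces are assumptions,
# not TaskWeaver's actual implementation.
class Planner:
    def plan(self, user_request: str) -> list[str]:
        # In TaskWeaver, an LLM decomposes the request into subtasks here.
        return [f"handle: {user_request}"]

class CodeGenerator:
    def generate(self, subtask: str) -> str:
        # In TaskWeaver, an LLM writes a Python snippet for the subtask here.
        return f"result = {subtask!r}"

class CodeExecutor:
    def __init__(self):
        self.namespace = {}  # shared across rounds, enabling stateful execution

    def execute(self, code: str):
        exec(code, self.namespace)
        return self.namespace.get("result")

def handle_request(planner, code_generator, executor, user_request: str) -> list:
    results = []
    for subtask in planner.plan(user_request):
        code = code_generator.generate(subtask)
        results.append(executor.execute(code))
    return results
```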

Task Execution Workflow

TaskWeaver's workflow illustrates its efficiency and adaptability through iterative interactions. For instance, in an anomaly detection task, TaskWeaver starts by pulling data from a database, inspects its schema, and then executes an anomaly detection algorithm whose code is generated on the fly by the CG.
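
For such a request, the Planner's decomposition might resemble the plan below; the step wording is illustrative rather than quoted from the paper.

```python
# Hypothetical plan produced by the Planner for an anomaly detection request.
plan = [
    "Pull the requested time series from the database",
    "Inspect the schema and confirm the timestamp and value columns",
    "Generate and run an anomaly detection snippet over the value column",
    "Summarize and report the detected anomalies to the user",
]
```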

Features and Capabilities

Code-First Analysis and Restricted Code Generation

TaskWeaver's code-first approach leverages Python for data analysis tasks, providing a native experience with familiar data structures, and introduces security through restricted code generation. This ensures that generated code adheres to predefined safety rules, while plugins supply the necessary domain-specific functions.
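
One way such restrictions can be enforced, sketched below, is to scan the generated snippet's abstract syntax tree for disallowed imports and calls before it reaches the executor. The block list here is a made-up example, not the rule set TaskWeaver ships with.

```python
import ast

# Example block list; TaskWeaver's actual restriction rules may differ.
FORBIDDEN_MODULES = {"os", "subprocess", "socket"}
FORBIDDEN_CALLS = {"eval", "exec", "open"}

def find_violations(code: str) -> list[str]:
    """Return a list of safety-rule violations found in the generated code."""
    violations = []
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [node.module or ""]
        else:
            modules = []
        for name in modules:
            if name.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"import of '{name}' is not allowed")
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"call to '{node.func.id}()' is not allowed")
    return violations

# Example: find_violations("import os\nos.system('ls')") flags the os import.
```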

Self-Reflection and Stateful Execution

TaskWeaver incorporates a self-reflection mechanism, allowing the Planner to reassess and adjust plans based on real-time execution results. The CE maintains a stateful execution environment, ensuring context preservation across multiple user interactions, which is critical for iterative tasks like anomaly detection.
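
A minimal sketch of this loop, assuming the Planner exposes some way to revise its remaining subtasks (called revise() below, a hypothetical name), is:

```python
# Illustrative self-reflection loop; revise() and the component interfaces are
# assumptions, not TaskWeaver's actual API. Reusing one executor across rounds
# is what keeps execution stateful (earlier variables stay available).
def run_with_reflection(planner, code_generator, executor, user_request, max_rounds=10):
    plan = planner.plan(user_request)  # initial list of subtasks
    for _ in range(max_rounds):
        if not plan:
            break
        subtask = plan.pop(0)
        code = code_generator.generate(subtask)
        try:
            result = executor.execute(code)
            plan = planner.revise(plan, subtask, result=result)
        except Exception as err:
            # Feed the failure back so the Planner can adjust the remaining steps.
            plan = planner.revise(plan, subtask, error=str(err))
    return executor
```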

Examples for Domain Knowledge Incorporation

TaskWeaver allows customization through examples that guide the LLM in handling domain-specific scenarios. These examples span various tasks, including planning and code generation, improving the LLM's accuracy and reliability when producing domain-specific code.
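
Such an example might pair a representative user request with the code the agent is expected to produce, along the lines of the hypothetical structure below; the field names and the sql_pull_data plugin are invented for illustration, and TaskWeaver's actual example file format may differ.

```python
# Hypothetical domain-knowledge example; field names and plugin names are
# placeholders, not TaskWeaver's actual example schema.
code_generation_example = {
    "user_query": "find anomalies in the sales table for the last 30 days",
    "expected_code": (
        "df = sql_pull_data(\"SELECT ts, amount FROM sales WHERE ts >= :cutoff\")\n"
        "result = anomaly_detection(df, column='amount')"
    ),
}
```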

Case Studies and Applications

TaskWeaver's capability is demonstrated through practical applications such as anomaly detection and stock price forecasting. By dynamically integrating plugins and generating code, TaskWeaver effectively handles complex queries, executes sophisticated algorithms, and provides user-friendly results.

Anomaly Detection

For anomaly detection in time series data:

  • TaskWeaver dynamically pulls data from an SQL database, applies a custom anomaly detection plugin, and visualizes the results, as sketched after this list.
  • The workflow includes multiple decision points, ensuring high accuracy in plugin usage and code generation.
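
A condensed stand-in for the code the agent might generate in this case is sketched below; the connection string, table, column names, and the 3-sigma rule used in place of the custom plugin are all placeholders, not the paper's exact artifacts.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sqlalchemy import create_engine

# Placeholder database and schema; the paper's setup will differ.
engine = create_engine("sqlite:///timeseries.db")
df = pd.read_sql("SELECT ts, value FROM sensor_readings", engine, parse_dates=["ts"])

# Simple 3-sigma rule standing in for the custom anomaly detection plugin.
mean, std = df["value"].mean(), df["value"].std()
df["is_anomaly"] = (df["value"] - mean).abs() > 3 * std

ax = df.plot(x="ts", y="value", legend=False, title="Detected anomalies")
ax.scatter(df.loc[df["is_anomaly"], "ts"], df.loc[df["is_anomaly"], "value"], color="red")
plt.show()
```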

Stock Price Forecasting

In stock price forecasting:

  • TaskWeaver retrieves historical data, preprocesses it, trains an ARIMA model, and predicts future stock prices.
  • The case highlights how TaskWeaver adapts to data retrieval issues and maintains robustness through code regeneration; a minimal forecasting sketch follows this list.
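
A minimal stand-in for the forecasting step, assuming the closing prices are already available as a date-indexed pandas Series, could look like the following; the data file, column names, and ARIMA order are illustrative only.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Illustrative forecasting step; file name, columns, and ARIMA order are placeholders.
prices = pd.read_csv("historical_prices.csv", index_col="date", parse_dates=True)["close"]

model = ARIMA(prices, order=(5, 1, 0))  # (p, d, q) chosen for illustration only
fitted = model.fit()
print(fitted.forecast(steps=5))         # predict the next five trading days
```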

Future Implications and Security

TaskWeaver’s design includes potential future enhancements such as further isolating execution environments with sandboxing techniques, supporting multi-agent configurations for complex tasks, and continuously adapting LLM prompts and responses for optimized performance. Crucially, TaskWeaver’s commitment to security through restricted operation lists and isolated processes underscores its readiness for deployment in sensitive or high-stakes environments.
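
As a rough illustration of what isolating execution in a separate process can look like, the standard-library sketch below runs a snippet in a child process and terminates it on timeout. This is an assumption about one possible isolation strategy, not TaskWeaver's implementation, and a production setup would layer stricter sandboxing on top.

```python
import multiprocessing

def _worker(code: str, queue: multiprocessing.Queue) -> None:
    namespace = {}
    try:
        exec(code, namespace)
        queue.put(("ok", repr(namespace.get("result"))))
    except Exception as err:
        queue.put(("error", str(err)))

def run_isolated(code: str, timeout: float = 10.0):
    """Run generated code in a child process; kill it if it exceeds the timeout."""
    queue = multiprocessing.Queue()
    proc = multiprocessing.Process(target=_worker, args=(code, queue))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        proc.terminate()
        return ("error", "execution timed out")
    return queue.get() if not queue.empty() else ("error", "no output produced")
```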

Conclusion

TaskWeaver marks a significant step forward in the development of LLM-powered agents, offering a flexible, secure, and efficient framework capable of handling complex, domain-specific tasks through intelligent planning and robust code generation. Its approach can significantly enhance the usability and applicability of LLMs in various practical settings, paving the way for more sophisticated and reliable AI-driven applications.
