- The paper presents ToolFactory, a novel pipeline that transforms unstructured API documentation into structured, AI-usable tools.
- The methodology integrates APILlama with prompt tuning and a parameter database, achieving a 97% valid JSON generation rate and accurate parameter inference.
- The case study in glycomaterials research demonstrates the pipeline’s practical impact by enabling seamless integration of scientific APIs into AI workflows.
Introduction
The paper presents ToolFactory, a pipeline that automates the generation of AI-usable tools from REST API documentation by using LLMs to interpret API details written in natural language. This matters because APIs often lack standardized schemas, which makes developing tool-using agents difficult across scientific research domains. ToolFactory addresses this by transforming unstructured API documentation into AI-compatible tools, reducing the effort needed to integrate APIs into AI workflows.
The API Extraction Benchmark developed in the study serves as the foundation for ToolFactory. It comprises 167 API documents and 744 endpoints and covers a wide range of document structures. This diversity calls for a general tool generation pipeline capable of processing varied document formats (Figure 1). The benchmark is used to train and validate the automation pipeline, and it focuses on APIs with less structured documentation so that the proposed method applies broadly.
Figure 1: The API Extraction Benchmark includes API documents with varying levels of structure, emphasizing less structured cases to prioritize API variety and the need for a robust tool generation pipeline.
APILlama
At the core of the pipeline is APILlama, a model fine-tuned on the benchmark dataset with prompt tuning to extract structured information from API documents. Prompt tuning keeps the number of trainable parameters small, since only 20 trainable virtual tokens are learned, and it lets the model translate unstructured API documentation into a predefined JSON schema, which is the input required for tool generation. The results show that APILlama generates correctly structured JSON files with a valid ratio of 97%, a substantial improvement over baseline models in retrieving and interpreting API endpoints.
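To make the "valid ratio" concrete, the following is a minimal sketch of how model outputs could be checked against a predefined schema. The field names in `REQUIRED_FIELDS` and the helper names are assumptions made for illustration; the paper's actual schema and scoring code are not reproduced here.

```python
import json

# Hypothetical required fields and types; the paper's real schema may differ.
REQUIRED_FIELDS = {
    "url": str,
    "method": str,
    "description": str,
    "parameters": list,
}


def is_valid_extraction(raw: str) -> bool:
    """Return True if raw parses as a JSON object with the assumed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(
        name in data and isinstance(data[name], expected)
        for name, expected in REQUIRED_FIELDS.items()
    )


def valid_ratio(outputs: list[str]) -> float:
    """Fraction of model outputs that pass the structural check."""
    if not outputs:
        return 0.0
    return sum(is_valid_extraction(o) for o in outputs) / len(outputs)
```

Under this reading, a valid ratio of 97% would mean that 97 out of every 100 generated outputs parse and match the expected structure; whether the extracted values are also correct is a separate question.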
Once the API information has been extracted into structured form, ToolFactory converts it into Python functions compatible with popular frameworks like LangChain. A critical phase is the tool validation process, where tools are tested using example parameter values, and only those passing the validation criteria are accepted for AI agents. The evaluation results highlight the importance of accurate parameter value inference, as many tools failed validation due to incorrect parameter values.
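The conversion and validation steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the spec format, the names `make_tool` and `validate_tool`, and the acceptance criterion (a 2xx status code) are all hypothetical, and the HTTP call is injected as a `send` function so the sketch runs without network access.

```python
from typing import Any, Callable
from urllib.parse import urlencode

# Performs the HTTP call: (method, url) -> (status_code, body).
Sender = Callable[[str, str], tuple[int, str]]


def make_tool(spec: dict[str, Any], send: Sender) -> Callable[..., str]:
    """Build a Python callable from an extracted endpoint spec (assumed format)."""
    allowed = {p["name"] for p in spec["parameters"]}
    required = {p["name"] for p in spec["parameters"] if p.get("required")}

    def tool(**kwargs: Any) -> str:
        unknown = set(kwargs) - allowed
        missing = required - set(kwargs)
        if unknown or missing:
            raise ValueError(f"unknown={sorted(unknown)} missing={sorted(missing)}")
        url = spec["url"]
        if kwargs:
            url += "?" + urlencode(kwargs)
        status, body = send(spec["method"], url)
        if not 200 <= status < 300:
            raise RuntimeError(f"{spec['method']} {url} returned {status}")
        return body

    # Name and docstring are what an agent framework would typically surface.
    tool.__name__ = spec["name"]
    tool.__doc__ = spec["description"]
    return tool


def validate_tool(tool: Callable[..., str], spec: dict[str, Any]) -> bool:
    """Call the tool with the spec's example values; accept only if it succeeds."""
    examples = {
        p["name"]: p["example"] for p in spec["parameters"] if "example" in p
    }
    try:
        tool(**examples)
    except (ValueError, RuntimeError):
        return False
    return True
```

In this sketch a tool fails validation either because a required parameter has no example value or because the request made with the example values is rejected, which mirrors the observation that incorrect parameter values are a common cause of failure.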
Parameter Value Inference
To enhance parameter value quality, the authors introduced a parameter database constructed from validated tools. This database enables inference of missing parameter values based on semantic similarity, leveraging domain-specific knowledge to refine the process (Figure 2). This solution addresses the frequent issue of insufficient documentation in APIs by utilizing real-world data as opposed to relying solely on LLM-generated pseudo-values.
Figure 2: A parameter database constructed using validated tools, inferring parameter values based on semantic similarity of parameter keys and descriptions.
Case Study: Glycomaterial Research
The ToolFactory pipeline was applied to generate tools for glycomaterials research, resulting in an AI agent capable of handling glycan-related tasks such as searching, drawing, and format conversion. This case study supports the pipeline's applicability to scientific domains, showing that it can integrate scientific APIs without extensive programming effort (Figure 3).
Figure 3: AI Agent for Glycomaterial Research with Automated Tool Generation showcases ToolFactory's capability in simplifying database access and supporting glycan-related tasks.
Conclusion
ToolFactory marks a significant step forward in automating the generation of AI-compatible tools from REST API documentation. By translating unstructured information into structured, usable formats, ToolFactory reduces the development and learning costs associated with using APIs in AI systems. The case study in glycomaterials research underscores the pipeline’s practical benefits, enhancing scientists' ability to perform complex data integration efficiently. The work lays the groundwork for broader applications across domains, with future improvements focusing on refining parameter inference and expanding the pipeline’s adaptability to more diverse and complex API ecosystems.