
ComfyUI-Copilot: An Intelligent Assistant for Automated Workflow Development (2506.05010v1)

Published 5 Jun 2025 in cs.CL and cs.CV

Abstract: We introduce ComfyUI-Copilot, an LLM-powered plugin designed to enhance the usability and efficiency of ComfyUI, an open-source platform for AI-driven art creation. Despite its flexibility and user-friendly interface, ComfyUI can present challenges to newcomers, including limited documentation, model misconfigurations, and the complexity of workflow design. ComfyUI-Copilot addresses these challenges by offering intelligent node and model recommendations, along with automated one-click workflow construction. At its core, the system employs a hierarchical multi-agent framework comprising a central assistant agent for task delegation and specialized worker agents for different usages, supported by our curated ComfyUI knowledge bases to streamline debugging and deployment. We validate the effectiveness of ComfyUI-Copilot through both offline quantitative evaluations and online user feedback, showing that it accurately recommends nodes and accelerates workflow development. Additionally, use cases illustrate that ComfyUI-Copilot lowers entry barriers for beginners and enhances workflow efficiency for experienced users. The ComfyUI-Copilot installation package and a demo video are available at https://github.com/AIDC-AI/ComfyUI-Copilot.

Summary

  • The paper introduces ComfyUI-Copilot, an intelligent plugin that automates workflow generation and node recommendations on the ComfyUI platform using a hierarchical multi-agent framework.
  • It leverages extensive curated knowledge bases—with over 7K nodes, 62K models, and 9K workflows—to generate documentation and guide installation and configuration.
  • Offline and online evaluations reveal high recall rates (over 88.5%) and strong user acceptance (65.4% for nodes, 85.9% for workflows), highlighting its practical impact.

This paper introduces ComfyUI-Copilot (2506.05010), an intelligent assistant designed as a plugin for the ComfyUI platform to address challenges users face, such as complex workflow design, limited documentation, and model configuration issues. ComfyUI-Copilot leverages LLMs within a hierarchical multi-agent framework to offer features like automated workflow generation, intelligent node and model recommendations, and ComfyUI-related question answering.

The core of the system relies on a central LLM-based assistant agent that delegates tasks to specialized worker agents supported by curated knowledge bases. The system's key functionalities include:

  1. Automated Workflow Generation: ComfyUI-Copilot interprets user instructions, retrieves or synthesizes relevant workflows, and loads them into the ComfyUI canvas with a single click. If required nodes are missing, it provides installation guidance. Workflows can be represented as graphs, JSON, or code, and converted between these formats. The authors also explore generating workflows from scratch with code LLMs, including fine-tuning models such as Qwen2.5-Coder-7B [hui2024qwen25codertechnicalreport] on collected workflows to improve performance.
  2. Node and Model Recommendation: The assistant recommends suitable nodes based on user instructions and suggests compatible checkpoint and LoRA models, considering dependencies between components (e.g., recommending LoRAs compatible with the user's diffusion model). Recommendations follow a three-stage pipeline: LLM expansion of user intent, semantic and lexical similarity-based retrieval from knowledge bases, and re-ranking based on re-ranker scores and popularity factors (upvotes, downloads, stars).
  3. ComfyUI-related Question Answering: The copilot provides detailed information on nodes and models, including usage, installation steps, parameter explanations, and suggestions for downstream subgraphs (e.g., recommending face swapping or image upscaling subgraphs after a KSampler node). It supports multilingual queries and responses.
  4. Enhanced Features: For experienced users, ComfyUI-Copilot offers prompt writing assistance to refine text prompts for image generation and a parameter search functionality that enables parallel experiments with varying parameters to find optimal settings for workflows.
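
The three-stage recommendation pipeline described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Candidate` fields, the synonym table standing in for LLM intent expansion, the Jaccard token overlap standing in for the combined semantic and lexical retrieval, and the `popularity_weight` value are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # hypothetical node name
    description: str   # text used for matching
    stars: int         # hypothetical popularity signal


def expand_query(query: str) -> str:
    """Stage 1 stand-in: the paper uses an LLM to expand user intent.
    A tiny hand-written synonym table plays that role here."""
    synonyms = {"upscale": "upscale super resolution enlarge"}
    return " ".join(synonyms.get(w, w) for w in query.lower().split())


def lexical_score(query: str, text: str) -> float:
    """Jaccard overlap between token sets (lexical similarity only;
    an embedding-based semantic score is omitted for brevity)."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0


def retrieve(query: str, candidates: list, top_k: int = 3) -> list:
    """Stage 2: keep the top_k candidates by similarity to the query."""
    ranked = sorted(candidates,
                    key=lambda c: lexical_score(query, c.description),
                    reverse=True)
    return ranked[:top_k]


def rerank(query: str, candidates: list, popularity_weight: float = 0.1) -> list:
    """Stage 3: combine relevance with a damped popularity term."""
    def score(c: Candidate) -> float:
        return (lexical_score(query, c.description)
                + popularity_weight * math.log1p(c.stars))
    return sorted(candidates, key=score, reverse=True)


def recommend(query: str, candidates: list, top_k: int = 3) -> list:
    expanded = expand_query(query)
    return rerank(expanded, retrieve(expanded, candidates, top_k))


# Toy knowledge base with made-up entries.
TOY_NODES = [
    Candidate("ToyUpscale", "upscale image with super resolution model", 50),
    Candidate("ToyFaceSwap", "swap face in portrait image", 500),
    Candidate("ToyLoadImage", "load image from disk", 1000),
]
```

Calling `recommend("upscale image", TOY_NODES)` returns the candidates ordered by the combined score. Taking the logarithm of the popularity count is one way to keep a very popular but irrelevant entry from outranking a relevant one; the paper does not specify how its popularity factors are weighted.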

A crucial component is the construction of extensive knowledge bases covering 7K nodes, 62K models, and 9K workflows. Data is sourced from platforms, GitHub repositories, and the ComfyUI website, with automated filtering for NSFW content. For nodes that lack documentation, the system generates it automatically: it sets up a sandbox environment, clones the repository, analyzes the code to extract metadata (class type, parameters), chunks and embeds the code, retrieves relevant snippets, and uses an LLM (such as GPT-4o) to write the documentation. For workflows and models, LLMs such as GPT-4o enhance documentation by applying multimodal understanding to community-sourced texts, images, and workflow JSONs. The knowledge bases are updated weekly to incorporate new modules and tasks.
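
The metadata-extraction step can be illustrated with a small static-analysis sketch. It assumes node classes follow the common ComfyUI convention of declaring inputs in an `INPUT_TYPES` classmethod that returns a literal dictionary with a `"required"` key; the function names and the sample source below are hypothetical, and the paper does not describe its extraction code at this level of detail.

```python
import ast


def _literal_return(func: ast.FunctionDef) -> dict:
    """Evaluate the first `return` in func if it is a plain literal dict."""
    for stmt in ast.walk(func):
        if isinstance(stmt, ast.Return) and stmt.value is not None:
            try:
                value = ast.literal_eval(stmt.value)
            except (ValueError, TypeError):
                return {}  # return value is computed, not a literal
            return value if isinstance(value, dict) else {}
    return {}


def extract_node_metadata(source: str) -> list:
    """Return class type and required parameter names for each class
    that defines INPUT_TYPES, without executing the source."""
    nodes = []
    for cls in ast.parse(source).body:
        if not isinstance(cls, ast.ClassDef):
            continue
        for item in cls.body:
            if isinstance(item, ast.FunctionDef) and item.name == "INPUT_TYPES":
                required = _literal_return(item).get("required", {})
                params = list(required) if isinstance(required, dict) else []
                nodes.append({"class_type": cls.name, "parameters": params})
    return nodes


# Made-up node source used only for illustration.
SAMPLE_SOURCE = '''
class ToyUpscale:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"image": ("IMAGE",),
                             "scale": ("FLOAT", {"default": 2.0})}}
'''
```

Parsing with `ast` rather than importing the module avoids running untrusted repository code, which fits the sandboxed setting the summary describes. Classes whose `INPUT_TYPES` builds its dictionary dynamically yield an empty parameter list here and would need a different approach.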

Implemented using LangChain, the framework allows the assistant agent to autonomously select worker agents based on user input and conversation history. Responses are synthesized by integrating outputs from the relevant agents.
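
The delegation pattern can be sketched in plain Python. This example deliberately does not use LangChain, and it replaces the LLM's routing decision with simple keyword matching; the worker names, keyword lists, and the two-turn history window are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]


def workflow_worker(query: str) -> str:
    return f"[workflow] candidate workflows for: {query}"


def node_worker(query: str) -> str:
    return f"[node] recommended nodes and models for: {query}"


def qa_worker(query: str) -> str:
    return f"[qa] answer for: {query}"


# Hypothetical routing table: worker name -> (trigger keywords, worker).
WORKERS: Dict[str, Tuple[Tuple[str, ...], Worker]] = {
    "workflow": (("workflow",), workflow_worker),
    "node": (("node", "model", "lora"), node_worker),
}


def select_workers(query: str, history: List[str]) -> List[str]:
    """Stand-in for the assistant agent's decision: look at the query
    plus the last two turns and pick every worker whose keywords match."""
    context = " ".join(history[-2:] + [query]).lower()
    chosen = [name for name, (keywords, _) in WORKERS.items()
              if any(k in context for k in keywords)]
    return chosen or ["qa"]  # fall back to question answering


def assistant(query: str, history: List[str]) -> str:
    """Delegate to the selected workers and merge their outputs."""
    outputs = []
    for name in select_workers(query, history):
        worker = qa_worker if name == "qa" else WORKERS[name][1]
        outputs.append(worker(query))
    return "\n".join(outputs)
```

For example, `assistant("recommend a node for upscaling", [])` routes only to the node worker, while a query with no matching keyword falls through to question answering. In the described system the selection is made by the assistant LLM rather than by keywords, so this shows only the control flow.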

The effectiveness of ComfyUI-Copilot is evaluated both offline and online. Offline quantitative evaluations on a constructed test set show recall above 88.5% for both nodes and workflows, using either GPT-4o or DeepSeek-V3. Workflow generation is evaluated by pass rate (executability), average number of nodes, and node selection metrics (precision, recall, F1). Fine-tuned open-source models perform comparably to closed-source models such as Claude-3.7-Sonnet, reaching a node selection F1 score of 0.95, although overall workflow generation accuracy has room for improvement. Online user feedback collected since the GitHub release shows a 65.4% acceptance rate for recommended nodes and 85.9% for proposed workflows. The project has garnered significant community interest, indicated by GitHub stars, query volume, and user base size across many countries.
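
The node selection metrics and pass rate can be computed as in the sketch below. It assumes a set-based comparison between the node types in a generated workflow and those in a reference workflow; the summary does not state the paper's exact definitions, and the function names are illustrative.

```python
from typing import Iterable, List, Tuple


def node_selection_scores(predicted: Iterable[str],
                          reference: Iterable[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 over sets of node types."""
    pred, ref = set(predicted), set(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def pass_rate(executed_ok: List[bool]) -> float:
    """Fraction of generated workflows that executed successfully."""
    return sum(executed_ok) / len(executed_ok) if executed_ok else 0.0
```

For instance, a generated workflow using nodes `{"a", "b", "c", "d"}` against a reference of `{"a", "b", "c"}` has precision 0.75 and recall 1.0, since every reference node was selected but one extra node was added.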

In conclusion, ComfyUI-Copilot is presented as the first open-source ComfyUI plugin designed to automate workflow creation and provide intelligent assistance. Its multi-agent framework and curated knowledge bases effectively lower the entry barrier for new users and improve efficiency for experienced ones in AIGC workflow development. Future work aims to integrate community feedback and enhance features like automatic workflow and parameter optimization.
