From Words to Structured Visuals: A Benchmark and Framework for Text-to-Diagram Generation and Editing (2411.11916v1)

Published 18 Nov 2024 in cs.DB

Abstract: We introduce the task of text-to-diagram generation, which focuses on creating structured visual representations directly from textual descriptions. Existing approaches in text-to-image and text-to-code generation lack the logical organization and flexibility needed to produce accurate, editable diagrams, often resulting in outputs that are either unstructured or difficult to modify. To address this gap, we introduce DiagramGenBenchmark, a comprehensive evaluation framework encompassing eight distinct diagram categories, including flowcharts, model architecture diagrams, and mind maps. Additionally, we present DiagramAgent, an innovative framework with four core modules (Plan Agent, Code Agent, Check Agent, and Diagram-to-Code Agent) designed to facilitate both the generation and refinement of complex diagrams. Our extensive experiments, which combine objective metrics with human evaluations, demonstrate that DiagramAgent significantly outperforms existing baseline models in terms of accuracy, structural coherence, and modifiability. This work not only establishes a foundational benchmark for the text-to-diagram generation task but also introduces a powerful toolset to advance research and applications in this emerging area.

Authors (9)
  1. Jingxuan Wei (21 papers)
  2. Cheng Tan (140 papers)
  3. Qi Chen (194 papers)
  4. Gaowei Wu (4 papers)
  5. Siyuan Li (140 papers)
  6. Zhangyang Gao (58 papers)
  7. Linzhuang Sun (18 papers)
  8. Bihui Yu (16 papers)
  9. Ruifeng Guo (10 papers)