
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation (2506.03139v1)

Published 3 Jun 2025 in cs.CV and cs.AI

Abstract: LLMs and Multimodal LLMs have shown promising capabilities for SVG processing, yet existing benchmarks suffer from limited real-world coverage, lack of complexity stratification, and fragmented evaluation paradigms. We introduce SVGenius, a comprehensive benchmark comprising 2,377 queries across three progressive dimensions: understanding, editing, and generation. Built on real-world data from 24 application domains with systematic complexity stratification, SVGenius evaluates models through 8 task categories and 18 metrics. We assess 22 mainstream models spanning different scales, architectures, training paradigms, and accessibility levels. Our analysis reveals that while proprietary models significantly outperform open-source counterparts, all models exhibit systematic performance degradation with increasing complexity, indicating fundamental limitations in current approaches; however, reasoning-enhanced training proves more effective than pure scaling for overcoming these limitations, though style transfer remains the most challenging capability across all model types. SVGenius establishes the first systematic evaluation framework for SVG processing, providing crucial insights for developing more capable vector graphics models and advancing automated graphic design applications. Appendix and supplementary materials (including all data and code) are available at https://zju-real.github.io/SVGenius.

Summary

  • The paper introduces SVGenius, a benchmark that systematically evaluates LLMs and MLLMs across 8 SVG task categories using 18 metrics.
  • It evaluates 22 mainstream models, showing that proprietary systems excel in understanding and generation while reasoning-enhanced open-source models improve editing tasks.
  • The study reveals persistent challenges in SVG style transfer and underscores the need for advanced training strategies to enhance vector graphic processing.

SVGenius: Establishing a Comprehensive Benchmark for LLMs in SVG Processing

The paper "SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation" by Siqi Chen et al. presents a meticulously designed benchmark for evaluating the SVG processing capabilities of LLMs and multimodal LLMs (MLLMs). The paper introduces SVGenius as a novel framework that systematically addresses existing shortcomings in SVG benchmarks, such as limited real-world data coverage, lack of complexity stratification, and fragmented evaluations. It aims to provide a comprehensive assessment of SVG processing across understanding, editing, and generation dimensions.

Key Contributions

  1. Benchmark Design and Scope: SVGenius encompasses 2,377 queries constructed from real-world data spanning 24 domains, organized through a structured complexity stratification. This benchmark evaluates models across eight task categories and 18 metrics, focusing on SVG understanding (semantic and perceptual), editing (bug fixing, code optimization, style editing), and generation capabilities (text-to-SVG, image-to-SVG, style transfer).
  2. Model Evaluation: The paper assesses 22 mainstream models, both proprietary and open-source. Proprietary models outperform their open-source counterparts, yet all models exhibit performance degradation as SVG complexity increases. Reasoning-enhanced training in open-source models shows potential for closing this gap, albeit with varying degrees of success across tasks.
  3. Comprehensive Capability Insights: The findings emphasize that fundamental limitations persist in current LLM approaches to handling SVG complexity. Specifically, style transfer remains notably challenging, highlighting a significant gap in both proprietary and open-source models.
  4. Experimental Results: The empirical evaluations show the superior performance of proprietary models such as Claude-3.7-Sonnet in understanding and generation tasks. However, reasoning-enhanced models such as DeepSeek-R1 show promising results in editing tasks, suggesting that training strategies beyond pure scaling can contribute meaningfully to model performance.
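The complexity stratification described in the benchmark design can be illustrated with a small sketch. The features counted here (drawable elements and path commands) and the tier thresholds are assumptions for illustration; the paper's actual stratification criteria are defined in its appendix and code release.

```python
import re
import xml.etree.ElementTree as ET

# Tags that draw geometry; other elements (defs, groups, etc.) are ignored.
DRAWABLE_TAGS = {"path", "rect", "circle", "ellipse", "line", "polyline", "polygon"}

def svg_complexity_tier(svg_text: str) -> str:
    """Assign a rough complexity tier to an SVG string.

    Hypothetical scoring: one point per drawable element, plus one point
    per path command letter (M, L, C, Q, A, Z, ...). Thresholds are
    illustrative, not the benchmark's real cutoffs.
    """
    root = ET.fromstring(svg_text)
    score = 0
    for el in root.iter():
        tag = el.tag.split("}")[-1]  # strip any XML namespace prefix
        if tag in DRAWABLE_TAGS:
            score += 1
            if tag == "path":
                # Count command letters in the path data attribute.
                score += len(re.findall(r"[MLHVCSQTAZ]", el.get("d", ""), re.I))
    if score <= 10:
        return "easy"
    if score <= 40:
        return "medium"
    return "hard"
```

A single-rectangle icon would land in the "easy" tier, while an illustration with dozens of multi-command paths would be "hard"; stratifying queries this way lets per-tier scores expose the complexity-driven degradation the paper reports.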

Implications and Future Directions

The systematic nature of SVGenius facilitates in-depth analysis and comparison of models' SVG processing capabilities, offering insights into their strengths and weaknesses. This benchmark paves the way for advances in developing more capable techniques for automated vector graphic design. The findings encourage further exploration into specialized training strategies, reasoning-enhanced techniques, and structural understanding methods to overcome inherent challenges in SVG processing.

Future AI developments could benefit from integrating sophisticated SVG capabilities, offering enhanced tools for designers and industries reliant on vector graphics. The spotlight on style transfer difficulties also serves as a focal area for future research endeavors. The SVGenius benchmark not only establishes a robust foundation for SVG processing evaluation but also contributes substantial progress toward realizing efficient, scalable, and design-oriented vector graphic solutions within AI systems.
