- The paper presents a novel Transformer-based sequence-to-sequence framework that converts SVG vector drawings into parametric CAD command sequences.
- It achieves high sequence accuracy (up to 92.3% on synthetic benchmarks) and improves geometric fidelity via a combination of cross-entropy and geometric-consistency losses.
- The method enables rapid design automation and reverse engineering by generating editable CAD models directly from legacy vectorized drawings.
Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vectorized Drawings
Introduction
The paper presents Drawing2CAD, a sequence-to-sequence learning framework for generating parametric CAD construction sequences directly from vectorized drawings, specifically SVG representations. The approach leverages the structural similarity between SVG drawing commands and CAD construction commands, enabling a unified modeling paradigm. This work addresses the challenge of translating 2D vector graphics into editable, parametric CAD models, which is critical for downstream engineering tasks such as design automation, reverse engineering, and digital manufacturing.
Methodology
Drawing2CAD formalizes both SVG and CAD construction processes as parametric command sequences. The core architecture is a Transformer-based encoder-decoder model, where the encoder ingests the SVG command sequence and the decoder outputs the corresponding CAD command sequence. The SVG input is tokenized into a sequence of parametric commands (e.g., `M`, `L`, `C` for move, line, and curve), each with associated geometric parameters. The output sequence consists of CAD construction commands (e.g., `Line`, `Circle`, `Extrude`) with their respective parameters.
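The shared command-plus-parameters structure can be illustrated with a minimal tokenizer sketch. The vocabulary, fixed parameter width, and padding scheme below are illustrative assumptions, not the paper's exact tokenization:

```python
# Minimal sketch: tokenize an SVG path string into (command_id, params) tuples.
# SVG_VOCAB and the fixed 6-slot parameter layout are assumptions for
# illustration; the paper's actual scheme may differ.
import re

SVG_VOCAB = {"M": 0, "L": 1, "C": 2}   # move, line, curve
N_PARAMS = 6                           # pad every command to a fixed width

def tokenize_svg_path(d: str):
    """Turn a path like 'M 0 0 L 10 0' into fixed-width token tuples."""
    tokens = []
    for cmd, args in re.findall(r"([MLC])([^MLC]*)", d):
        params = [float(x) for x in args.split()]
        params += [0.0] * (N_PARAMS - len(params))   # pad unused slots
        tokens.append((SVG_VOCAB[cmd], params))
    return tokens

print(tokenize_svg_path("M 0 0 L 10 0 C 10 5 5 10 0 10"))
```

Padding every command to a fixed parameter width is a common trick for feeding heterogeneous commands to a single Transformer embedding layer.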
The model is trained on paired SVG-CAD datasets, where each SVG drawing is annotated with its corresponding CAD construction sequence. The loss function is a combination of sequence cross-entropy and geometric consistency losses, ensuring both syntactic and semantic fidelity in the generated CAD sequence. The authors employ data augmentation strategies to improve generalization, including random perturbations of SVG parameters and command orderings.
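The training objective can be sketched as a cross-entropy term on command types plus a consistency term on continuous parameters. The MSE form of the consistency term and the `lam` weighting below are assumptions; the paper's exact geometric-consistency loss may be formulated differently:

```python
# Sketch of a combined loss: cross-entropy on command types plus an L2
# penalty on predicted parameters as a geometric-consistency proxy.
# The MSE form and the lambda weighting are illustrative assumptions.
import torch
import torch.nn.functional as F

def combined_loss(cmd_logits, cmd_target, param_pred, param_target, lam=1.0):
    """cmd_logits: (B, T, V) scores; cmd_target: (B, T) class ids;
    param_pred / param_target: (B, T, P) continuous parameters."""
    ce = F.cross_entropy(cmd_logits.flatten(0, 1), cmd_target.flatten())
    geo = F.mse_loss(param_pred, param_target)   # consistency proxy
    return ce + lam * geo

B, T, V, P = 2, 5, 8, 6
loss = combined_loss(torch.randn(B, T, V), torch.randint(0, V, (B, T)),
                     torch.randn(B, T, P), torch.randn(B, T, P))
print(float(loss))
```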
Experimental Results
Drawing2CAD is evaluated on multiple benchmarks, including synthetic datasets and real-world engineering drawings. The model achieves high sequence accuracy (up to 92.3% on synthetic benchmarks) and strong geometric reconstruction fidelity, outperforming prior methods such as Free2CAD and DeepSVG in both sequence prediction and downstream CAD model editability. Ablation studies show that the geometric consistency loss significantly improves the alignment between generated CAD models and ground-truth engineering intent.
The inference speed is competitive, with average sequence generation times under 0.5 seconds per drawing on a single NVIDIA A100 GPU. The model scales linearly with input sequence length, and memory consumption remains tractable for typical engineering drawing sizes (up to 500 SVG commands).
Implementation Considerations
For practical deployment, the authors provide a modular pipeline:
- SVG Preprocessing: Vectorized drawings are parsed and normalized using CairoSVG and custom Python scripts.
- Tokenization: SVG commands are mapped to a fixed vocabulary; geometric parameters are discretized or normalized.
- Model Training: The Transformer model is implemented in PyTorch, with support for distributed training and mixed precision.
- CAD Sequence Postprocessing: Generated command sequences are validated for syntactic correctness and fed into PythonOCC or FreeCAD for 3D model instantiation.
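The syntactic-validation step in the pipeline above can be sketched as an arity check over the generated command sequence. The command set and parameter counts below are hypothetical placeholders; a real pipeline would additionally instantiate the validated sequence in PythonOCC or FreeCAD:

```python
# Minimal syntactic validator for a generated CAD command sequence.
# CAD_ARITY is a hypothetical arity table for illustration only; a real
# pipeline would validate against the actual CAD kernel's command schema.
CAD_ARITY = {"Line": 4, "Circle": 3, "Extrude": 1}   # assumed param counts

def validate_sequence(seq):
    """seq: list of (command_name, params). Returns (ok, first_error)."""
    for i, (cmd, params) in enumerate(seq):
        if cmd not in CAD_ARITY:
            return False, f"step {i}: unknown command {cmd!r}"
        if len(params) != CAD_ARITY[cmd]:
            return False, f"step {i}: {cmd} expects {CAD_ARITY[cmd]} params"
    return True, None

ok, err = validate_sequence([("Line", [0, 0, 1, 0]), ("Extrude", [5.0])])
print(ok, err)   # → True None
```

Rejecting malformed sequences before handing them to a CAD kernel avoids crashes on out-of-vocabulary commands or wrong parameter counts, which generative decoders can occasionally emit.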
The framework supports batch inference and can be integrated into CAD automation workflows. Limitations include sensitivity to ambiguous or poorly vectorized input drawings and the need for high-quality paired SVG-CAD datasets for optimal performance.
Theoretical and Practical Implications
Drawing2CAD advances the state-of-the-art in CAD generative modeling by bridging the gap between vector graphics and parametric CAD construction. The sequence-to-sequence paradigm enables direct learning of design intent, facilitating editable and interpretable CAD model generation. This has significant implications for design automation, enabling rapid prototyping and reverse engineering from legacy drawings.
Theoretically, the work demonstrates that command sequence modeling is effective for multi-modal translation tasks in engineering design. The unified representation of SVG and CAD commands opens avenues for cross-domain transfer learning and multi-modal generative modeling.
Future Directions
Potential future developments include:
- Multi-modal Fusion: Integrating raster images, textual descriptions, and vector graphics for richer CAD model generation.
- 3D CAD Sequence Generation: Extending the framework to handle 3D construction sequences from multi-view or single-view vector inputs.
- Active Learning: Leveraging user feedback to refine sequence generation in interactive CAD environments.
- Domain Adaptation: Adapting the model to diverse engineering domains (e.g., architecture, mechanical, electrical) with minimal retraining.
Conclusion
Drawing2CAD provides a robust, scalable solution for generating parametric CAD models from vectorized drawings via sequence-to-sequence learning. The approach achieves strong numerical results in sequence accuracy and geometric fidelity, with practical utility for engineering design automation. The unified command sequence representation and Transformer-based architecture set a foundation for future research in multi-modal CAD generative modeling and cross-domain design translation.