SketchDNN: Diffusion for CAD Sketches
- SketchDNN is a generative model that unifies continuous parameter and discrete label modeling to synthesize accurate CAD sketches.
- It introduces a Gaussian-Softmax diffusion process that blends noisy continuous data with class probabilities for improved sketch generation.
- The architecture achieves state-of-the-art performance on the SketchGraphs benchmark, outperforming previous methods in FID and NLL evaluations.
SketchDNN is a generative deep learning model designed for the synthesis of computer-aided design (CAD) sketches, particularly addressing the challenges of joint modeling of continuous primitive parameters and discrete class labels encountered in CAD vector graphics. The architecture introduces a novel Gaussian-Softmax diffusion process to enable high-fidelity generation, blending discrete and continuous variable modeling in a unified diffusion framework. SketchDNN achieves new state-of-the-art performance on the SketchGraphs benchmark, delivering substantial improvements in the quantitative metrics used for generative sketch quality evaluation (Chereddy et al., 15 Jul 2025).
1. Overview and Motivation
The principal motivation for SketchDNN arises from automating the initial phase of CAD workflows—the creation of 2D sketches from geometric primitives such as lines, arcs, circles, and points. Traditional CAD design requires significant manual input for constructing such sketches, which serve as the foundation for subsequent 3D modeling and shape parameterization. By automating this stage, SketchDNN aims to accelerate design productivity and facilitate the exploration of diverse design alternatives. The model specifically targets problems faced by prior autoregressive generative approaches: the heterogeneity of parameterizations across different primitives and the permutation invariance inherent to the unordered set of primitives in a sketch. SketchDNN’s generative approach is shown to surpass existing methods in handling these structural intricacies.
2. Gaussian-Softmax Diffusion Process
A key innovation in SketchDNN is the Gaussian-Softmax diffusion, which acts as the backbone of its generative mechanism for discrete variables:
- Forward Diffusion: Starting from a one-hot representation in logit space, class logits are perturbed with Gaussian noise. The perturbed values are then projected onto the probability simplex with a softmax operation:

  x_t = softmax(√(ᾱ_t) · x_0 + √(1 − ᾱ_t) · ε),

  where ε ~ N(0, I) is Gaussian noise, and ᾱ_t is the stepwise noise-schedule parameter.
- Reverse Process: The denoising network is trained to reconstruct the clean logits from the noisy softmax vector, interpolating between the noisy observation and the predicted logits before applying softmax again. This procedure applies to the discrete class representation of each primitive, and continuous parameters are diffused in their native space.
- This process enables a "superposition" of class labels at every step: rather than limiting transitions to a discrete class-to-class mapping, the probability vector can encode a blend, retaining information about all possible classes and improving the learning of class assignments as noise is removed.
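The forward step above can be sketched in a few lines of NumPy. The √(ᾱ_t)/√(1 − ᾱ_t) scaling follows the standard DDPM-style convention implied by the formula, and the logit scale of the one-hot input is an illustrative choice rather than the paper's exact setting:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def gaussian_softmax_forward(logits0, alpha_bar_t, rng):
    """One forward Gaussian-Softmax step: noise the clean class logits,
    then project back onto the probability simplex.

    logits0     : clean class logits (e.g. a scaled one-hot vector)
    alpha_bar_t : cumulative noise-schedule value in (0, 1]
    """
    eps = rng.standard_normal(logits0.shape)  # Gaussian noise in logit space
    noisy = np.sqrt(alpha_bar_t) * logits0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return softmax(noisy)  # a "superposition" over all classes

rng = np.random.default_rng(0)
one_hot_logits = 10.0 * np.eye(4)[2]  # class 2 of 4, scaled one-hot logits
p_t = gaussian_softmax_forward(one_hot_logits, alpha_bar_t=0.5, rng=rng)
# p_t is a valid distribution over all 4 classes, not a hard label
```

Note that the output always lies on the probability simplex, so every noisy state remains an interpretable blend of class probabilities, which is the property the reverse process exploits.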
3. Addressing Core CAD Sketch Generation Challenges
SketchDNN directly addresses two challenges central to CAD sketch generation:
a. Heterogeneity of Primitive Parameterization
Distinct CAD primitives require different parameterizations; for example, a line is defined by two endpoints, while a circle needs a center and a radius. SketchDNN employs a composite representation for each primitive: a one-hot flag for construction aids, a one-hot class vector (including a "None" type), and a concatenated set of all possible primitive parameters. This "superposition" approach enables the network to uniformly process primitives with different underlying structures. At inference, the primitive type is determined by the maximal probability in the class vector, and only the corresponding parameter subset is interpreted.
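The composite representation can be sketched as follows; the class list, per-class parameter counts, and flag layout here are hypothetical stand-ins for illustration, not the paper's exact token format:

```python
import numpy as np

# Hypothetical layout: every primitive carries slots for ALL parameter sets;
# only the slice matching its class is meaningful.
CLASSES = ["line", "arc", "circle", "point", "none"]
PARAM_SLOTS = {"line": 4, "arc": 6, "circle": 3, "point": 2}  # params per class
OFFSETS, _off = {}, 0
for _c, _n in PARAM_SLOTS.items():
    OFFSETS[_c] = _off
    _off += _n
TOTAL_PARAMS = _off

def encode_primitive(cls, params, is_construction=False):
    """Composite token: [construction flag | one-hot class | all parameter slots]."""
    flag = np.array([float(is_construction)])
    class_vec = np.eye(len(CLASSES))[CLASSES.index(cls)]
    slots = np.zeros(TOTAL_PARAMS)
    if cls in PARAM_SLOTS:  # the "none" class carries no parameters
        slots[OFFSETS[cls]:OFFSETS[cls] + PARAM_SLOTS[cls]] = params
    return np.concatenate([flag, class_vec, slots])

def decode_primitive(token):
    """Inference-time read-out: argmax the class, then slice only its params."""
    class_probs = token[1:1 + len(CLASSES)]
    cls = CLASSES[int(np.argmax(class_probs))]
    if cls not in PARAM_SLOTS:
        return cls, np.array([])
    start = 1 + len(CLASSES) + OFFSETS[cls]
    return cls, token[start:start + PARAM_SLOTS[cls]]

tok = encode_primitive("circle", [0.5, 0.5, 0.25])  # center (x, y) and radius
cls, params = decode_primitive(tok)
```

Because every primitive occupies a fixed-width token regardless of its class, the denoising network can process a heterogeneous set of primitives with a single uniform architecture.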
b. Permutation Invariance of Primitives
In CAD sketches, the final geometry is invariant to the ordering of primitives. Most prior generative models (e.g., autoregressive transformers) produce sequences of primitives, unintentionally introducing artificial dependencies. SketchDNN performs diffusion and denoising independently for each primitive, and crucially, its transformer-based denoising architecture omits positional encodings, rendering it permutation equivariant by construction. This ensures generation is order-agnostic and structurally faithful to the real-world design process.
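The equivariance property can be verified directly: self-attention without positional encodings commutes with any permutation of its input tokens. The toy single-head attention below is an illustrative stand-in for the full transformer denoiser:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention with NO positional encoding."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))  # token-token attention weights
    return A @ V

rng = np.random.default_rng(1)
d = 8
X = rng.standard_normal((5, d))  # 5 primitive tokens, no order information
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
perm = rng.permutation(5)

out_then_perm = self_attention(X, Wq, Wk, Wv)[perm]  # permute the outputs
perm_then_out = self_attention(X[perm], Wq, Wk, Wv)  # permute the inputs
# The two agree: permuting inputs just permutes outputs (equivariance)
```

Adding positional encodings would break this identity, which is precisely why the denoiser omits them for the unordered primitive set.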
4. Quantitative Performance Evaluation
Empirical evaluation on the SketchGraphs benchmark demonstrates substantial improvements:
| Metric | Vitruvion (prev. SOTA) | SketchDNN |
|---|---|---|
| FID | 16.04 | 7.80 |
| NLL (bits/sketch) | 84.8 | 81.33 |
- Fréchet Inception Distance (FID): Measures similarity between distributions of generated and real (rendered) sketches; lower is better. SketchDNN’s FID of 7.80 demonstrates advances in producing plausible, high-quality sketches compared to the prior leading method Vitruvion.
- Negative Log-likelihood (NLL): Measured in bits per sketch, indicating generative model likelihood; again, lower is better. The model’s NLL outperforms previous models, with ablation studies showing that reverting to standard categorical diffusion results in drastic deterioration (NLL 106.10; FID > 148), emphasizing the efficacy of Gaussian-Softmax diffusion.
5. Comparison with Previous Generative Approaches
SketchDNN distinguishes itself from previous models on several fronts:
- Joint Continuous-Discrete Modeling: Unlike pure autoregressive or categorical diffusion approaches, SketchDNN models continuous primitive parameters and discrete class labels within a single diffusion pipeline.
- Superposition Capability: The Gaussian-Softmax mechanism supports blended representations rather than only hard class assignments, which standard categorical mechanisms cannot provide.
- Permutation Equivariance: By acting independently on primitives and utilizing a transformer without positional encoding, SketchDNN avoids introducing artificial sequence biases.
These methodological advantages yield empirical improvements: ablations show that the Gaussian-Softmax diffusion and the permutation-equivariant denoising network each contribute measurably to overall performance.
6. Applications and Broader Implications
The architecture is immediately applicable to several CAD and design automation tasks:
- Automated Preliminary Sketch Generation: Rapidly proposes structural sketches for 3D modeling, accelerating iterative design workflows.
- Design Variant Exploration: By sampling diverse sketch generations, designers can efficiently explore alternatives and optimize designs.
- Integrating with Constraint-Based Solvers: While current work centers on pure geometry, extending SketchDNN to condition on explicit constraints could tightly couple generative design and engineering requirements.
- Adaptability to Other Domains: The unified continuous-discrete diffusion process extends naturally to generative tasks in circuit design, UI layout, and other domains where elements combine continuous and categorical parameters.
Future work is likely to focus on improving the reverse diffusion process, incorporating conditional generation, and tailoring the framework to further downstream design and engineering tasks, such as direct integration with 3D model generation pipelines or engineering analysis.
7. Summary
SketchDNN introduces a principled generative framework for CAD sketch synthesis that advances both conceptual modeling and empirical performance. Its Gaussian-Softmax diffusion mechanism, composite superposition encoding, and permutation-invariant generation reconcile long-standing challenges in discrete-continuous generative modeling for structured vector data. These advances are quantitatively validated on standard benchmarks, and the approach positions itself as a foundation for further innovation in data-driven CAD, design automation, and mixed-data generative modeling (Chereddy et al., 15 Jul 2025).