
DeepCAD-240: CAD Benchmark for Long Sequences

Updated 29 March 2026
  • DeepCAD-240 is a large-scale benchmark dataset for long-form parametric CAD sequence generation, featuring up to 240 sketch–extrusion operations.
  • It employs a methodology that mines the ABC dataset with custom FeatureScript routines and advanced tokenization to ensure semantic and syntactic consistency.
  • Comparative analysis reveals that models like GeoFusion-CAD, using geometric state space diffusion, deliver superior accuracy and efficiency on complex CAD tasks.

DeepCAD-240 is a large-scale benchmark dataset and evaluation protocol for long-form parametric Computer-Aided Design (CAD) sequence generation, introduced to stress-test generative models on command sequences ranging from moderate to extreme length (up to 240 sketch–extrusion operations). It builds upon and extends the DeepCAD line of benchmarks to provide a quantifiable, rigorous testbed for scalable, structure-aware CAD generation, particularly under hierarchical and long-range dependencies prevalent in industrial 3D modeling tasks (Zhou et al., 23 Mar 2026).

1. Construction and Structure of DeepCAD-240

DeepCAD-240 is constructed by mining the ABC dataset, which contains over one million CAD models, via the Onshape API with custom FeatureScript routines. Each Constructive Solid Geometry (CSG) primitive or Boolean tree is algorithmically transformed into an explicit stepwise “sketch–extrusion” history. For each CAD program in DeepCAD-240:

  • Sketch Extraction: 2D profiles are tokenized as ordered sequences of primitive types (lines, arcs, circles), with each curve/loop/face/sketch termination marked using explicit tokens (e.g., $e_c$, $e_l$, $e_f$, $e_s$).
  • Extrusion Parameterization: Each extrusion step is parameterized using Euler angles $(\theta, \phi, \gamma)$, translations $(\tau_x, \tau_y, \tau_z)$, scale $\sigma$, distances $(d_+, d_-)$, and Boolean operation type $\beta \in \{\mathrm{new}, \mathrm{cut}, \mathrm{join}, \mathrm{intersect}\}$, finalized with $e_e$.
  • Filtering: Programs are retained if sketches form at least one closed loop, reconstruct as watertight solids, and contain no invalid/redundant operations. This rigorous filtration enforces semantic and syntactic consistency.
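
The retention criteria above can be pictured as a simple predicate over a program record. The sketch below is illustrative only: the class names, the `closed_loops` and `is_watertight` fields, and the zero-thickness test for a "redundant" operation are assumptions made for this example, not the FeatureScript routines used to build the dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Boolean operation types listed for beta in the text.
BOOLEAN_OPS = {"new", "cut", "join", "intersect"}


@dataclass
class Extrusion:
    euler: Tuple[float, float, float]        # (theta, phi, gamma)
    translation: Tuple[float, float, float]  # (tau_x, tau_y, tau_z)
    scale: float                             # sigma
    distances: Tuple[float, float]           # (d_plus, d_minus)
    boolean_op: str                          # beta


@dataclass
class Step:
    closed_loops: int     # hypothetical field: closed loops found in the sketch
    extrusion: Extrusion


@dataclass
class Program:
    steps: List[Step]
    is_watertight: bool   # assumed to come from an external B-rep check


def keep_program(program: Program, max_steps: int = 240) -> bool:
    """Return True if the program passes the (assumed) retention checks."""
    if not program.steps or len(program.steps) > max_steps:
        return False
    if not program.is_watertight:
        return False
    for step in program.steps:
        if step.closed_loops < 1:
            return False  # sketch must contain at least one closed loop
        ext = step.extrusion
        if ext.boolean_op not in BOOLEAN_OPS:
            return False  # invalid operation type
        d_plus, d_minus = ext.distances
        if d_plus + d_minus <= 0:
            return False  # zero thickness, treated here as redundant (assumption)
    return True


box = Program(
    steps=[Step(1, Extrusion((0, 0, 0), (0, 0, 0), 1.0, (0.5, 0.0), "new"))],
    is_watertight=True,
)
print(keep_program(box))
```

In practice the watertightness flag would have to be produced by a geometry kernel; the predicate only combines the results.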

A summary of overall dataset statistics:

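| Dataset | Total Sequences | Avg. Length | Max. Length | % ≤ 40 | % 40–60 | % 60–80 | % 80–160 | % 160–240 |
|---|---|---|---|---|---|---|---|---|
| DeepCAD | 178,238 | 15 | 60 | 44.6 | 55.4 | – | – | – |
| DeepCAD-240 | 215,914 | 36.2 | 240 | 76.6 | 12.0 | 5.9 | 5.2 | 0.21 |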

Table 1. Sequence length statistics in DeepCAD-240 (Zhou et al., 23 Mar 2026).
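
To give a feel for the absolute size of each length bucket, the percentages in Table 1 can be converted into approximate sequence counts. This is plain arithmetic on the reported figures; the bucket labels and function name are chosen for this example, and the counts are approximations because the published shares are rounded (they sum to 99.91%).

```python
# Figures for DeepCAD-240 as reported in Table 1.
TOTAL_SEQUENCES = 215_914
SHARES_PERCENT = {
    "<=40": 76.6,
    "40-60": 12.0,
    "60-80": 5.9,
    "80-160": 5.2,
    "160-240": 0.21,
}


def approx_counts(total: int, shares: dict) -> dict:
    """Convert percentage shares into rounded sequence counts."""
    return {bucket: round(total * pct / 100) for bucket, pct in shares.items()}


counts = approx_counts(TOTAL_SEQUENCES, SHARES_PERCENT)
for bucket, n in counts.items():
    print(f"{bucket:>8}: ~{n:,} sequences")
```

Under these figures the longest bucket (160–240) holds only about 450 programs, which is why it is described as a long tail.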

Programs span a wide range of categories (mechanical, free-form, assembly). Each design step consists of a (potentially multi-token) sketch block followed by an extrusion block, yielding fine-grained structural and parametric expressivity.

2. Tokenization, Vocabulary, and Command Semantics

The DeepCAD-240 token vocabulary unifies structural, sketch, and extrusion abstractions:

  • Structural tokens: {pad, cls, $e_s$, $e_f$, $e_l$, $e_c$, $e_e$}
  • Sketch parameters: $(p_x, p_y)$ (coordinates), $\alpha$ (arc curvature), $f$ (flip), $r$ (circle radius)
  • Extrusion parameters: $d_+$, $d_-$ (distances), $\tau_x, \tau_y, \tau_z$ (translations), $\theta, \phi, \gamma$ (Euler angles), $\sigma$ (scale), $\beta$ (Boolean op type)

Each sketch–extrusion step typically yields 15–40 tokens. On average, approximately half are sketch-primitives and half are extrusion parameterizations. The explicit separation of planarity, curvature, topological, and Boolean semantics supports compositional hierarchies and long-term dependencies, modelling the latent logic of human CAD design.
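
A minimal sketch of how one sketch–extrusion step might be flattened into such a token stream is given below. The string token names, the one-token-per-parameter convention, and the ordering of extrusion parameters are assumptions for illustration; the source specifies only the terminator tokens and the parameter sets.

```python
# Hypothetical serialization of one step. Nesting: sketch -> faces -> loops -> curves.
EXTRUSION_ORDER = [
    "d_plus", "d_minus", "tau_x", "tau_y", "tau_z",
    "theta", "phi", "gamma", "sigma", "beta",
]


def tokenize_sketch(faces):
    """faces: list of faces; face: list of loops; loop: list of (kind, params)."""
    tokens = []
    for face in faces:
        for loop in face:
            for kind, params in loop:
                tokens.append(kind)                      # primitive type
                tokens.extend(str(v) for v in params)    # one token per parameter
                tokens.append("e_c")                     # end of curve
            tokens.append("e_l")                         # end of loop
        tokens.append("e_f")                             # end of face
    tokens.append("e_s")                                 # end of sketch
    return tokens


def tokenize_extrusion(ext):
    """ext: dict holding every key in EXTRUSION_ORDER."""
    return [f"{key}={ext[key]}" for key in EXTRUSION_ORDER] + ["e_e"]


# A unit square (four lines, each given by a point p_x, p_y) extruded as a new body.
square = [[[("line", (0, 0)), ("line", (1, 0)), ("line", (1, 1)), ("line", (0, 1))]]]
extrusion = {
    "d_plus": 1, "d_minus": 0, "tau_x": 0, "tau_y": 0, "tau_z": 0,
    "theta": 0, "phi": 0, "gamma": 0, "sigma": 1, "beta": "new",
}
step_tokens = tokenize_sketch(square) + tokenize_extrusion(extrusion)
print(len(step_tokens), step_tokens[-1])
```

With this convention the square-plus-extrusion step produces 30 tokens (19 for the sketch, 11 for the extrusion), inside the 15–40 range quoted above.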

3. Evaluation Protocols and Metrics

DeepCAD-240 assesses candidate generative models through both procedural and geometric distribution metrics:

  • Procedural accuracy:
    • Command Type Accuracy:

      $$\mathrm{ACC}_{\mathrm{cmd}} = \frac{1}{N} \sum_{i=1}^{N} \mathbf{1} [\hat{c}_i = c_i]$$

    • Parameter Accuracy:

      $$\mathrm{ACC}_{\mathrm{param}} = \frac{1}{\sum_i M_i} \sum_{i=1}^{N} \sum_{j=1}^{M_i} \mathbf{1} [\hat{a}_{i,j} = a_{i,j}]$$

    • Primitive-type Accuracies (lines, arcs, circles, extrusions), each computed analogously.

  • Geometric/distribution metrics:
    • Chamfer Distance $d_C(G, T)$
    • Minimum Matching Distance (MMD)
    • Coverage (COV)
    • Jensen–Shannon Divergence (JSD) between voxelized distributions
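
The two procedural accuracies defined above translate directly into code. The sketch below assumes that predicted and ground-truth sequences are already aligned command by command and that parameters are quantized so exact equality is meaningful; the function names are chosen for this example.

```python
def command_accuracy(pred_cmds, true_cmds):
    """ACC_cmd: fraction of positions whose predicted command type matches."""
    if len(pred_cmds) != len(true_cmds) or not true_cmds:
        raise ValueError("sequences must be non-empty and of equal length")
    hits = sum(p == t for p, t in zip(pred_cmds, true_cmds))
    return hits / len(true_cmds)


def parameter_accuracy(pred_params, true_params):
    """ACC_param: matched parameters over all M_i parameters of all N commands."""
    if len(pred_params) != len(true_params):
        raise ValueError("need one parameter list per command")
    total = sum(len(t) for t in true_params)
    if total == 0:
        raise ValueError("no parameters to score")
    hits = sum(
        p == t
        for ps, ts in zip(pred_params, true_params)
        for p, t in zip(ps, ts)
    )
    return hits / total


true_cmds = ["line", "line", "arc", "extrude"]
pred_cmds = ["line", "arc", "arc", "extrude"]
true_params = [[1, 2], [3, 4], [5, 6, 7, 0], [8]]
pred_params = [[1, 2], [3, 9], [5, 6, 7, 1], [8]]

print(command_accuracy(pred_cmds, true_cmds))      # 3 of 4 commands match
print(parameter_accuracy(pred_params, true_params))  # 7 of 9 parameters match
```

A primitive-type accuracy follows the same pattern after restricting both sequences to positions whose ground-truth command is of the given type.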

Hausdorff distance and IoU are not reported explicitly; topological validity is gauged via procedural accuracies and watertightness of B-reps (Zhou et al., 23 Mar 2026).
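
For the geometric metrics, the following pure-Python sketch shows one common convention for Chamfer distance, Coverage, and Minimum Matching Distance on sampled point clouds. The exact definitions used by the benchmark (squared versus unsquared distances, sampling density, normalization) are not stated in the text, so the choices below are assumptions; JSD is omitted because it depends on a voxelization scheme that is likewise unspecified.

```python
def chamfer(a, b):
    """Symmetric Chamfer distance using mean squared nearest-neighbour distances."""
    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    a_to_b = sum(min(sq_dist(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(sq_dist(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


def coverage_and_mmd(generated, reference, dist=chamfer):
    """COV: share of reference shapes that are the nearest neighbour of some
    generated shape. MMD: mean over reference shapes of the distance to the
    closest generated shape."""
    matched = set()
    for g in generated:
        nearest = min(range(len(reference)), key=lambda j: dist(g, reference[j]))
        matched.add(nearest)
    cov = len(matched) / len(reference)
    mmd = sum(min(dist(g, r) for g in generated) for r in reference) / len(reference)
    return cov, mmd


# Two tiny point clouds: B is A shifted by one unit along z.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]

cov, mmd = coverage_and_mmd(generated=[cloud_a], reference=[cloud_a, cloud_b])
print(cov, mmd)
```

The brute-force nearest-neighbour search is quadratic in the number of points and is meant only to make the definitions concrete.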

4. Comparative Analysis and Model Baselines

DeepCAD-240’s expanded length and complexity directly expose the scalability limitations of Transformer-based and recurrent generative CAD models developed for prior benchmarks:

| Model | ACC_cmd ↑ | ACC_param ↑ | COV ↑ | MMD ↓ | JSD ↓ | Memory (GB) | FLOPs (G) |
|---|---|---|---|---|---|---|---|
| DeepCAD | 75.2 | 72.5 | 64.5 | 1.85 | 4.09 | 8.20 | 52.8 |
| SkexGen | 81.4 | 78.3 | 68.9 | 1.78 | 3.97 | 11.2 | 91.2 |
| HNC-CAD | 82.8 | 78.5 | 71.2 | 1.71 | 3.81 | 10.3 | 87.3 |
| GeoFusion-CAD | 91.2 | 89.3 | 73.9 | 1.12 | 2.97 | 5.20 | 34.6 |

Table 2. Model comparison under DeepCAD-240 test range (40–240 commands) (Zhou et al., 23 Mar 2026).

  • Transformer models (DeepCAD, SkexGen, HNC-CAD) show rapid degradation as sequence length increases (15–20 point drop in command accuracy, lower coverage).
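  • GeoFusion-CAD employs a geometric state space diffusion framework with linear-time C-Mamba blocks. On the 40–240 command test range it reaches the highest command accuracy (91.2%) and coverage (73.9) in Table 2 while using less memory (5.20 GB versus 8.20–11.2 GB) and fewer FLOPs (34.6G versus 52.8–91.2G) than the Transformer baselines.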
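
The size of these gaps can be read off Table 2 with a few lines of arithmetic. The snippet below only restates the reported numbers; the dictionary layout and function name are chosen for this example.

```python
# (ACC_cmd in %, memory in GB, FLOPs in G) as reported in Table 2.
TABLE_2 = {
    "DeepCAD": (75.2, 8.20, 52.8),
    "SkexGen": (81.4, 11.2, 91.2),
    "HNC-CAD": (82.8, 10.3, 87.3),
    "GeoFusion-CAD": (91.2, 5.20, 34.6),
}


def relative_reduction(baseline: float, value: float) -> float:
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (1.0 - value / baseline)


geo_acc, geo_mem, geo_flops = TABLE_2["GeoFusion-CAD"]
gaps = {}
for name, (acc, mem, flops) in TABLE_2.items():
    if name == "GeoFusion-CAD":
        continue
    gaps[name] = (
        geo_acc - acc,                          # accuracy gain in points
        relative_reduction(mem, geo_mem),       # memory saving in %
        relative_reduction(flops, geo_flops),   # FLOPs saving in %
    )
    print(f"{name}: +{gaps[name][0]:.1f} pts, "
          f"{gaps[name][1]:.1f}% less memory, {gaps[name][2]:.1f}% fewer FLOPs")
```

On these figures GeoFusion-CAD leads the baselines by roughly 8 to 16 points in command accuracy, with memory savings of about 37% to 54% and FLOPs savings of about 34% to 62%.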

5. Historical Context and Relation to Prior Benchmarks

The original DeepCAD benchmark introduced a fixed-length (N=60) tokenization and transformer-based generative architecture for 3D sketch–extrude CAD programs (Wu et al., 2021). Each model in DeepCAD represents a sketch–extrusion history tokenized into commands (types: SOL, L, A, R, E, EOS), with 16-parameter vectors, and is normalized, quantized, and padded for transformer processing.
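
The fixed-length layout described above can be sketched as follows. This is a simplified illustration, not the original preprocessing code: the sentinel value for unused parameter slots and the left-aligned placement of parameters are assumptions (the original assigns each parameter a fixed slot by command type), and normalization and quantization are omitted.

```python
N_MAX = 60        # fixed sequence length of the original DeepCAD benchmark
N_PARAMS = 16     # parameter vector width per command
UNUSED = -1       # assumed sentinel for parameter slots a command does not use
COMMAND_TYPES = {"SOL", "L", "A", "R", "E", "EOS"}


def pad_program(commands):
    """commands: list of (type, params). Returns exactly N_MAX padded commands."""
    if len(commands) > N_MAX:
        raise ValueError(f"program has more than {N_MAX} commands")
    padded = []
    for ctype, params in commands:
        if ctype not in COMMAND_TYPES:
            raise ValueError(f"unknown command type: {ctype}")
        if len(params) > N_PARAMS:
            raise ValueError("too many parameters for one command")
        vec = list(params) + [UNUSED] * (N_PARAMS - len(params))
        padded.append((ctype, vec))
    # Fill the remainder of the sequence with end-of-sequence commands.
    while len(padded) < N_MAX:
        padded.append(("EOS", [UNUSED] * N_PARAMS))
    return padded


program = [("SOL", []), ("L", [128, 64]), ("L", [128, 128]), ("E", [0] * 11)]
fixed = pad_program(program)
print(len(fixed), len(fixed[0][1]), fixed[-1][0])
```

A program longer than 60 commands cannot be represented in this layout at all, which is the constraint that a 240-length benchmark lifts.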

DeepCAD-240 extends this paradigm to:

  • Maximum program length of 240 (4× extension).
  • Higher average and median command sequence lengths (mean 36.2, median ~25).
  • Wider support for hierarchical, nested, and structurally long CAD programs—capturing intricate dependencies across broader semantic and geometric contexts.

A direct implication is that long-range consistency, hierarchical structural modeling, and efficient memory- and compute-scalable architectures are essential for performance on DeepCAD-240 and related real-world industrial use-cases.

6. Impact on Model Development and Benchmarks

By providing a benchmark with true long-tail sequence distribution and explicit sketch–extrude semantic structure, DeepCAD-240 enables:

  • Precise disambiguation of generative failures arising from lost context, memory saturation, or architectural inefficiency.
  • Quantitative comparison of next-generation models that move beyond transformer architectures, e.g., diffusion in geometric state space (GeoFusion-CAD), Mamba blocks, etc.
  • Rigorous evaluation of command, parameter, and geometric output fidelity in a realistic industrial setting.

Empirical results demonstrate that advances at the architectural level (e.g., GeoFusion-CAD) yield marked gains in both performance and efficiency over transformers, especially as sequence length, hierarchy, and semantic complexity scale (Zhou et al., 23 Mar 2026).

7. Significance and Future Directions

DeepCAD-240 formalizes the challenge of long-sequence, structure-aware, and topologically complex CAD program generation. It highlights the necessity for:

  • Scalable architectures capable of maintaining long-range procedural and geometric consistency.
  • Richer tokenization and semantic abstractions aligned with real-world CAD modeling practices.
  • Comprehensive evaluation protocols capturing both procedural and geometric fidelity.

A plausible implication is that benchmarks such as DeepCAD-240 will become standard in the evaluation pipeline for neural program synthesis, reverse engineering, and foundation models targeting AI-aided design, 3D manufacturing, and engineering automation domains. Researchers are expected to leverage datasets of this scope to iterate on model architectures and training paradigms that generalize to even longer, more complex CAD programs.
