GeoFusion-CAD: Hierarchical Generative CAD Modeling

Updated 29 March 2026
  • GeoFusion-CAD is a structure-aware framework for generative modeling of parametric CAD programs using hierarchical encoding and geometry-conditioned state-space diffusion.
  • It employs a recursive geometry-and-topology dual representation to accurately generate long sketch–extrusion sequences with high fidelity.
  • Benchmarking on DeepCAD-240 shows state-of-the-art performance with superior command accuracy and reduced resource usage compared to transformer-based methods.

GeoFusion-CAD is a structure-aware computational framework for hierarchical, geometry-conditioned representation and generative modeling of parametric Computer-Aided Design (CAD) programs. It is designed to address scalability, geometric fidelity, and topological consistency in the generation of long and complex sketch–extrusion command sequences, notably surpassing transformer-based approaches in both accuracy and resource efficiency for next-generation parametric 3D design (Zhou et al., 23 Mar 2026).

1. Hierarchical Encoding of Parametric CAD Programs

GeoFusion-CAD formalizes CAD programs as explicit hierarchical trees capturing both geometric and topological information. The canonical focus is on Sketch–Extrusion pipelines, a dominant class in parametric design:

  • The encoding is rooted at a “solid” node, with child sketch nodes corresponding to individual 2D profiles. Each sketch is recursively decomposed into face-level loops, each loop into edge-level curves (line, arc, circle), and each curve into vertex tokens.
  • Extrusion nodes interleave with sketches, binding 2D geometry to 3D volumetric operations.
  • The hierarchy is serialized in depth-first order. Special end-of-unit tokens ({e_sketch, e_face, e_loop, e_curve, e_extrude}) explicitly mark the closure of semantic units; this mitigates ambiguity in sequence generation and preserves program completeness.
  • Every node carries geometric parameters (e.g., planar coordinates $p_x, p_y$, curvature $\alpha$, extrusion distances $d_+, d_-$, Euler angles $\theta, \phi, \gamma$, translation offsets $\tau_x, \tau_y, \tau_z$), as well as structured topological context: depth in the tree, parent node type, and local sibling indices.

This explicit, recursive, geometry-and-topology dual representation is foundational for scalable generative modeling of large, structured CAD programs (Zhou et al., 23 Mar 2026).
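The depth-first serialization with closure tokens described above can be illustrated with a minimal sketch. The `Node` class, the `serialize` function, and the tuple layout of each token are hypothetical names chosen for illustration; only the hierarchy levels, the end-of-unit token names, and the topological context fields (depth, parent type, sibling index) come from the text.

```python
from dataclasses import dataclass, field
from typing import List

# End-of-unit tokens named in the text; the dict itself is an illustrative choice.
END_TOKENS = {
    "sketch": "e_sketch",
    "face": "e_face",
    "loop": "e_loop",
    "curve": "e_curve",
    "extrude": "e_extrude",
}


@dataclass
class Node:
    kind: str                      # "solid", "sketch", "face", "loop", "curve", "vertex", "extrude"
    params: tuple = ()             # geometric parameters, e.g. (p_x, p_y)
    children: List["Node"] = field(default_factory=list)


def serialize(node, depth=0, parent="none", sibling=0, out=None):
    """Depth-first traversal; each token records (kind, params, depth, parent, sibling)."""
    if out is None:
        out = []
    out.append((node.kind, node.params, depth, parent, sibling))
    for i, child in enumerate(node.children):
        serialize(child, depth + 1, node.kind, i, out)
    # Close the semantic unit once all of its children have been emitted.
    if node.kind in END_TOKENS:
        out.append((END_TOKENS[node.kind], (), depth, parent, sibling))
    return out


# A toy program: one sketch with a single line curve, followed by one extrusion.
line = Node("curve", children=[Node("vertex", (0.0, 0.0)), Node("vertex", (1.0, 0.0))])
sketch = Node("sketch", children=[Node("face", children=[Node("loop", children=[line])])])
solid = Node("solid", children=[sketch, Node("extrude", (0.5, 0.0))])

tokens = serialize(solid)
kinds = [t[0] for t in tokens]
print(kinds)
```

Because every closure token is emitted only after its unit's children, a decoder reading the stream can tell unambiguously where each curve, loop, face, sketch, and extrusion ends.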

2. Geometric State-Space Diffusion Process

GeoFusion-CAD introduces a geometric state-space diffusion (G-Mamba) framework, eliminating standard quadratic-cost attention mechanisms in favor of efficient, geometry-conditioned state transitions:

  • Forward Diffusion (Noising): The clean latent $Z_0$ is perturbed with Gaussian noise according to

$$q(Z_t \mid Z_0) = \mathcal{N}\big(Z_t; \sqrt{\bar\alpha_t}\, Z_0, (1-\bar\alpha_t) I\big), \qquad Z_t = \sqrt{\bar\alpha_t}\, Z_0 + \sqrt{1-\bar\alpha_t}\, \epsilon_t,$$

with $\epsilon_t \sim \mathcal{N}(0, I)$ and $\bar\alpha_t = \prod_{i=1}^t \alpha_i$.
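The closed-form noising step above can be sketched in plain Python. The linear beta schedule, the number of steps `T`, and the function names are assumptions made for illustration; the source does not specify a noise schedule.

```python
import math
import random


def alpha_bar(alphas, t):
    """Cumulative product alpha_1 * ... * alpha_t (t is 1-indexed; t=0 gives 1)."""
    prod = 1.0
    for a in alphas[:t]:
        prod *= a
    return prod


def forward_noise(z0, alphas, t, rng):
    """Sample Z_t = sqrt(abar_t) * Z_0 + sqrt(1 - abar_t) * eps with eps ~ N(0, I)."""
    abar = alpha_bar(alphas, t)
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(abar) * z + math.sqrt(1.0 - abar) * e for z, e in zip(z0, eps)]
    return zt, eps


# Assumed linear beta schedule with alpha_i = 1 - beta_i (not taken from the source).
T = 100
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alphas = [1.0 - b for b in betas]

rng = random.Random(0)
z0 = [0.5, -1.0, 2.0]          # toy clean latent
zt, eps = forward_noise(z0, alphas, t=50, rng=rng)
```

Since $\bar\alpha_t$ shrinks as $t$ grows, later timesteps keep less of $Z_0$ and more noise, which is what the reverse process then learns to undo.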

  • Reverse Diffusion (Denoising): The learned reverse transition kernel is

$$p_\theta(Z_{t-1} \mid Z_t) = \mathcal{N}\big(Z_{t-1}; \mu_\theta(Z_t, t), \Sigma_\theta(Z_t, t)\big),$$

with $\mu_\theta$ and $\Sigma_\theta$ defined by the architecture.

  • State Transitions: At each sequence token $k$, local geometry $\Delta_k$ (scale, depth, curvature) and topological embedding $\Pi_k$ are input to a small MLP, which computes the parameters $\{\bar{A}_k, \bar{B}_k, C_k, G_k\}$ for the state-space updates:

$$\begin{aligned} h_{k+1} &= \bar{A}_k h_k + \bar{B}_k Z^c_k, \\ Z^c_{k+1} &= C_k h_k + G_k Z^c_k. \end{aligned}$$

Here $Z^c_k$ is the denoised CAD feature and $h_k$ is the latent memory.

  • Loss Function: The total loss jointly optimizes diffusion reconstruction and CAD command/parameter prediction:

$$\mathcal{L}_{\rm total} = \mathbb{E}_{t, Z_0, \epsilon} \| \epsilon_t - \epsilon_\theta(Z_t, t) \|^2 + \sum_{i=1}^N \Big[ \mathrm{CCA}(\hat{c}_i, c_i) + \eta \sum_{j=1}^M \mathrm{ACE}(\hat{a}_{i,j}, a_{i,j}) \Big],$$

where $\mathrm{CCA}$ and $\mathrm{ACE}$ are cross-entropy losses for commands and arguments, respectively.
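The structure of the combined objective can be made concrete with a small sketch. It evaluates the loss for a single sampled timestep (a one-sample estimate of the expectation) and takes predicted probability vectors as given; the function names and the toy inputs are illustrative assumptions, not the authors' implementation.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class under a probability vector."""
    return -math.log(probs[target])


def total_loss(eps, eps_pred, cmd_probs, cmd_targets, arg_probs, arg_targets, eta=1.0):
    """Noise-prediction squared error + command CE + eta-weighted argument CE."""
    diffusion = sum((e - p) ** 2 for e, p in zip(eps, eps_pred))
    command = sum(cross_entropy(p, c) for p, c in zip(cmd_probs, cmd_targets))
    argument = sum(
        cross_entropy(p, a)
        for probs_i, targets_i in zip(arg_probs, arg_targets)   # over commands i
        for p, a in zip(probs_i, targets_i)                     # over arguments j
    )
    return diffusion + command + eta * argument


# Toy example: one command (N=1) with one argument (M=1).
loss = total_loss(
    eps=[1.0, 0.0],
    eps_pred=[0.5, 0.0],
    cmd_probs=[[0.5, 0.5]],
    cmd_targets=[0],
    arg_probs=[[[0.25, 0.75]]],
    arg_targets=[[1]],
    eta=2.0,
)
```

Here the three terms are 0.25 (squared noise error), ln 2 (command cross-entropy), and 2 · (−ln 0.75) (weighted argument cross-entropy); raising `eta` shifts emphasis toward argument accuracy.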

By combining linear-complexity ($O(Ld)$) state-space diffusion with explicit geometry/topology conditioning, GeoFusion-CAD enables faithful long-sequence generation where transformers degrade (Zhou et al., 23 Mar 2026).
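The state-transition recurrence from this section can be sketched as a single left-to-right scan. This is a simplified illustration under several assumptions: the parameters are scalars rather than matrices, `params_from_geometry` is a hypothetical stand-in for the small MLP, and the right-hand $Z^c_k$ is read as the layer input while the left-hand $Z^c_{k+1}$ is read as the layer output.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def params_from_geometry(delta_k, pi_k):
    """Hypothetical stand-in for the small MLP: maps local geometry (delta_k)
    and topology (pi_k) features to scalar (A_bar, B_bar, C, G)."""
    s = sum(delta_k) + sum(pi_k)
    a_bar = sigmoid(s)            # kept in (0, 1) so the memory stays bounded
    b_bar = 1.0 - a_bar
    return a_bar, b_bar, 1.0, 0.5


def scan(z_seq, deltas, pis, h0=0.0):
    """Apply h_{k+1} = A_k h_k + B_k z_k and z_{k+1} = C_k h_k + G_k z_k per token."""
    h, outputs = h0, []
    for z_k, d_k, p_k in zip(z_seq, deltas, pis):
        a_bar, b_bar, c, g = params_from_geometry(d_k, p_k)
        outputs.append(c * h + g * z_k)   # uses h_k, before the memory update
        h = a_bar * h + b_bar * z_k       # h_{k+1}
    return outputs, h


z_seq = [1.0, 2.0, 3.0]
deltas = [(0.0,), (0.0,), (0.0,)]         # zero features give A_bar = B_bar = 0.5
pis = [(0.0,), (0.0,), (0.0,)]
outputs, h_final = scan(z_seq, deltas, pis)
print(outputs, h_final)                   # [0.5, 1.5, 2.75] 2.125
```

The loop visits each token once and carries only the fixed-size memory `h`, which is where the linear dependence on sequence length comes from; an attention layer would instead compare every token with every other.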

3. G-Mamba Block Architecture and Information Propagation

GeoFusion-CAD's backbone is the G-Mamba (“C-Mamba”) block, which unifies convolutional locality and geometry-aware global context:

  • Depthwise Convolution (DWConv): Injects spatial/adjacency bias, enhancing inductive structure for local neighborhoods.
  • Geometry-Conditioned State-Space Layer (GSM-SSD): Takes local geometric features ($\Delta_k, \Pi_k$) as input and produces state transition parameters. The learned transitions admit selective, context-dependent information routing.
  • Gated Hadamard Mixing: Implements a hybrid of direct-state interaction and nonlinear gating:

$$h_{\rm in} = (\bar{A}_k \odot \bar{B}_k)^\top \hat{Z}^c_k, \quad [h, z] = \mathrm{Linear}(h_{\rm in}), \quad \hat{h} = \mathrm{Linear}(h \odot \sigma(z))$$

This gives fine-grained control over how information flows across hierarchical boundaries (e.g., sketch–extrude interface, curve–vertex transition).

  • Normalization and Modulation: Both RMSNorm and FiLM are supported to modulate states by diffusion timestep $t$, stabilizing training and allowing conditional adaptation.

The block architecture is engineered to propagate signals efficiently across CAD sequences of arbitrary length, specifically privileging child–parent and high-curvature regions typically responsible for critical topological and geometric dependencies (Zhou et al., 23 Mar 2026).
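The gated Hadamard mixing step listed above can be sketched without any tensor library. The sketch assumes vector-valued (diagonal) $\bar{A}_k$ and $\bar{B}_k$, so the transposed product reduces to element-wise scaling, and it omits biases; `matvec`, `gated_mixing`, and the weight matrices are illustrative names and values, not the paper's.

```python
import math


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_mixing(a_bar, b_bar, z_hat, w_in, w_out):
    d = len(z_hat)
    # h_in = (A ⊙ B) applied to Z (element-wise under the diagonal assumption).
    h_in = [a * b * z for a, b, z in zip(a_bar, b_bar, z_hat)]
    hz = matvec(w_in, h_in)                 # Linear: d -> 2d
    h, z = hz[:d], hz[d:]                   # split into content and gate halves
    gated = [hi * sigmoid(zi) for hi, zi in zip(h, z)]
    return matvec(w_out, gated)             # Linear: d -> d


w_in = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]   # content = h_in, gate logits = 0
w_out = [[1.0, 0.0], [0.0, 1.0]]                          # identity output projection
h_hat = gated_mixing([1.0, 0.5], [2.0, 2.0], [1.0, 3.0], w_in, w_out)
print(h_hat)                                              # [1.0, 1.5]
```

With zero gate logits the sigmoid is 0.5 everywhere, so the output is half of `h_in`; learned gate weights would instead pass or suppress individual channels depending on the token's geometry.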

4. Large-Scale Benchmarking: DeepCAD-240

A new benchmark, DeepCAD-240, extends evaluation to sequence lengths that strain transformer memory:

  • Source: Constructed from the ABC dataset via procedural extraction of full Sketch–Extrusion histories.
  • Statistics: 215,000 models, mean program length 36, up to 240 commands. Length stratified for comprehensive stress-testing (76.6%: ≤40 commands, 12%: 41–60, 5.9%: 61–80, 5.2%: 81–160, 0.21%: 161–240).
  • Protocols: Two regimes, short (≤60 commands, legacy DeepCAD) and long (40–240, DeepCAD-240), with full preservation of semantic and closure tokens.
  • Metrics: Command/parameter accuracy (ACC_cmd/ACC_param), primitive-level ACC, geometric coverage (COV), minimum-matching distance (MMD), Jensen-Shannon divergence (JSD), and resource usage (GPU memory, FLOPs).

This benchmark enables rigorous comparative assessment in both routine and extreme generative settings (Zhou et al., 23 Mar 2026).
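The length stratification and the two evaluation regimes listed above can be expressed as a small lookup. The strata bounds, shares, and regime ranges are copied from the text; the function names and ASCII labels are illustrative.

```python
# (inclusive upper bound, label, share of models in percent) as listed for DeepCAD-240.
STRATA = [
    (40, "<=40", 76.6),
    (60, "41-60", 12.0),
    (80, "61-80", 5.9),
    (160, "81-160", 5.2),
    (240, "161-240", 0.21),
]


def stratum(num_commands):
    """Return the length stratum for a program with the given command count."""
    if num_commands < 1:
        raise ValueError("a program needs at least one command")
    for upper, label, _share in STRATA:
        if num_commands <= upper:
            return label
    raise ValueError("DeepCAD-240 programs have at most 240 commands")


def in_protocol(num_commands, protocol):
    """Short regime: <=60 commands; long regime: 40 to 240 commands."""
    if protocol == "short":
        return 1 <= num_commands <= 60
    if protocol == "long":
        return 40 <= num_commands <= 240
    raise ValueError("protocol must be 'short' or 'long'")


total_share = sum(share for _upper, _label, share in STRATA)
print(stratum(36), round(total_share, 2))   # <=40 99.91
```

The listed shares add up to 99.91%, so the remainder is presumably rounding. Note also that the two regimes overlap for programs of 40 to 60 commands, as stated in the protocol description.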

5. Empirical Performance and Resource Comparison

Experimental results position GeoFusion-CAD as state-of-the-art in both accuracy and efficiency on the DeepCAD and DeepCAD-240 test sets:

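| Test Range | Model | ACC_cmd | ACC_param | COV | MMD | JSD | Mem | FLOPs |
|---|---|---|---|---|---|---|---|---|
| DeepCAD | HNC-CAD | 95.4 | 93.8 | 82.3 | 1.33 | 3.24 | - | - |
| DeepCAD | Ours | 99.3 | 97.6 | 85.6 | 0.95 | 2.51 | - | - |
| DeepCAD-240 | HNC-CAD | 82.8 | 78.5 | 71.2 | 1.71 | 3.81 | 10.3GB | 87.3G |
| DeepCAD-240 | Ours | 91.2 | 89.3 | 73.9 | 1.12 | 2.97 | 5.2GB | 34.6G |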

Key findings:

  • On short sequences, GeoFusion-CAD outperforms the HNC-CAD baseline on every reported accuracy and geometric metric.
  • On long sequences (up to 240 commands), the baseline degrades steeply (ACC_cmd falls from 95.4% to 82.8%), whereas GeoFusion-CAD degrades less (99.3% to 91.2%) and keeps a lower geometric error (MMD 1.12 vs. 1.71).
  • Computational resource usage is significantly lower on DeepCAD-240: memory is roughly halved (10.3GB to 5.2GB) and FLOPs fall by about 60% (87.3G to 34.6G) compared to HNC-CAD.
  • Under rare, high-complexity topologies (length >160), performance margins decrease, but the state-space diffusion architecture continues to yield stable, recoverable output.

These results support the architectural design principle: explicit state-space construction with geometry-and-topology-encoded transitions is highly scalable and robust for generative CAD tasks (Zhou et al., 23 Mar 2026).
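The comparison in the results table can be checked with a few lines of arithmetic. The numbers are copied from the table; the variable names are illustrative.

```python
# Command accuracy (%) on the short (DeepCAD) and long (DeepCAD-240) test ranges.
acc_cmd = {
    "HNC-CAD": {"DeepCAD": 95.4, "DeepCAD-240": 82.8},
    "Ours": {"DeepCAD": 99.3, "DeepCAD-240": 91.2},
}
# Resource usage on DeepCAD-240: (memory in GB, FLOPs in G).
resources = {"HNC-CAD": (10.3, 87.3), "Ours": (5.2, 34.6)}

# Accuracy lost when moving from the short to the long test range.
drop = {m: v["DeepCAD"] - v["DeepCAD-240"] for m, v in acc_cmd.items()}

# Relative resource usage of GeoFusion-CAD versus the baseline.
mem_ratio = resources["Ours"][0] / resources["HNC-CAD"][0]
flops_ratio = resources["Ours"][1] / resources["HNC-CAD"][1]

print({m: round(d, 1) for m, d in drop.items()})   # {'HNC-CAD': 12.6, 'Ours': 8.1}
print(round(mem_ratio, 2), round(flops_ratio, 2))  # 0.5 0.4
```

Both models lose accuracy on long programs, but the baseline loses 12.6 points against 8.1 for GeoFusion-CAD, while GeoFusion-CAD uses about half the memory and about 40% of the FLOPs.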

6. Design Insights, Limitations, and Future Extensions

Key Insights:

  • Structure-aware, linear-time state-space diffusion outperforms traditional self-attention on highly structured, long token streams.
  • Hierarchical encoding (solid→sketch→face→edge→vertex) with explicit closure tokens is critical for both geometric fidelity and topological correctness.
  • Geometry-conditioned kernels adaptively route information, balancing long-range structural dependencies and localized geometric features.

Limitations:

  • The current instantiation is confined to single-body Sketch–Extrusion programs and does not handle advanced operations (e.g., sweep, loft, fillet, revolve).
  • Extreme-length sequences (rare topologies, >160 commands) are underrepresented; error accumulation can cause minor geometric artifacts.
  • Multi-component assemblies and inter-part constraints are out of scope.

Future Directions:

  • Extending the hierarchical program representation to multi-body assemblies, supporting inter-component constraints (mates, joints) and broader CAD operator vocabularies.
  • Pre-training on mixed-paradigm CAD datasets to develop foundation diffusion models with broader generalizability.
  • Hybrid architectures with sparse attention overlays could further enhance fine-detail context propagation for ultra-long CAD programs.
  • Augmenting rare-topology training data and dynamic curriculum scheduling to mitigate long-tail error accumulation.

Together, these directions aim to further scale the generative capacity, flexibility, and reliability of the GeoFusion-CAD paradigm (Zhou et al., 23 Mar 2026).


In sum, GeoFusion-CAD advances the state of the art in structure-aware, efficient generative modeling of parametric CAD programs via geometry- and topology-encoded hierarchical state-space diffusion. The framework scales to sequences of up to 240 commands and remains robust on benchmarks where standard transformer models degrade, establishing a foundational methodology for future research in automated CAD synthesis and design (Zhou et al., 23 Mar 2026).
