CadQuery: Python Parametric CAD Scripting
- CadQuery is a Python-based parametric CAD framework that uses OpenCASCADE for robust, editable 3D modeling.
- It offers intuitive workplane-based modeling with operations like extrusion, revolution, Boolean cuts, and edge modifiers.
- CadQuery facilitates AI-driven design workflows by supporting neural code generation, direct script validation, and precise geometric benchmarking.
CadQuery is a Python-based parametric Computer-Aided Design (CAD) scripting framework that serves as both a modeling API and a target representation for recent advances in automated 3D model generation from language or vision input. Leveraging the OpenCASCADE boundary-representation (B-Rep) kernel for true solid modeling, CadQuery enables the programmatic construction of complex, editable, and highly parametric CAD models. Its concise, human-readable scripts map directly to common CAD design workflows and afford direct geometric validation, enabling both traditional CAD development and AI-driven code generation, simulation, and benchmarking.
1. Core Concepts and Design Principles
CadQuery is architected around the concept of “Workplanes”: 2D sketching planes (e.g., “XY”, “XZ”) that serve as the foundational context for subsequent geometric operations. Primitives such as circles, rectangles, and lines are sketched on these workplanes, followed by 3D operations like extrusion, revolution, sweep, and loft. Boolean operations (union, cut, intersect), edge/face modifiers (fillet, chamfer, shell), and geometric transformations (rotate, translate, mirror) constitute the bulk of CadQuery’s parametric modeling vocabulary (Doris et al., 20 May 2025, Guan et al., 26 May 2025).
CadQuery’s API is a thin Python wrapper over OpenCASCADE, exposing the B-Rep kernel’s capabilities while remaining idiomatic to Python. Scripts are modular and human-legible, built as fluent chains of method calls (often via .Workplane().primitive().operation()...), resulting in code that is readily interpretable, auditable, and modifiable by both humans and AI systems (Doris et al., 20 May 2025).
2. Role in Neural CAD Code Generation
CadQuery has become the de facto standard output format for neural text-to-CAD and vision-to-CAD systems due to several distinguishing factors:
- Executable validation: CadQuery scripts can be directly executed, with any syntactic or geometric errors detected at runtime; this enables strong supervision for model training (Xie et al., 10 May 2025, Guan et al., 26 May 2025).
- Rich parametric API: The available set of modeling primitives, transformations, and modifiers supports a superset of classical command-sequence CAD paradigms while allowing explicit parametric variation (Guan et al., 26 May 2025).
- Seamless integration with LLMs: As a Python-based language, CadQuery leverages code priors and spatial reasoning already present in pretrained LLMs, facilitating high-fidelity code generation and geometric output (Xie et al., 10 May 2025).
Consequently, CadQuery is the target representation for recent large-scale datasets and multimodal models, enabling natural language (or image) inputs to be mapped to scripts that render unambiguous 3D geometry. This is in contrast to approaches that output task-specific command sequences or require custom domain-specific languages (Xie et al., 10 May 2025).
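The executable-validation idea above can be sketched with the Python standard library alone. The helper name `check_executable` and the convention that a script binds its final shape to a variable called `result` are assumptions made for illustration; the cited systems may use different conventions. Running untrusted generated code with `exec` is unsafe outside a sandbox, so this is a sketch of the control flow only.

```python
import traceback
from typing import Any, Optional, Tuple


def check_executable(source: str,
                     result_name: str = "result") -> Tuple[bool, Optional[Any], str]:
    """Run a generated script and report whether it executed cleanly.

    Returns (ok, value bound to `result_name` or None, error message).
    `result_name` is a hypothetical convention, not a CadQuery requirement.
    """
    namespace: dict = {"__name__": "__generated__"}
    try:
        code = compile(source, "<generated>", "exec")  # surfaces syntax errors
        exec(code, namespace)                          # surfaces runtime errors
    except Exception:
        return False, None, traceback.format_exc(limit=1)
    if result_name not in namespace:
        return False, None, f"script did not define '{result_name}'"
    return True, namespace[result_name], ""


# Toy scripts standing in for generated CadQuery code.
ok, value, error = check_executable("result = sum(x * x for x in range(4))")
print(ok, value)
print(check_executable("result = undefined_name + 1")[0])
```

For real CadQuery output, the same function applies unchanged as long as `cadquery` is importable by the script; geometric failures raised by the kernel surface as ordinary Python exceptions and are caught by the same `except` clause.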
3. Methodological Advances Leveraging CadQuery
Recent research has advanced the automated generation of CadQuery scripts via various neural pipelines:
- Two-stage learning: A common framework encompasses supervised fine-tuning (SFT) on paired natural language (NL) and CadQuery script data, followed by reinforcement learning (RL) to refine geometric and syntactic fidelity. For example, CAD-Coder employs Group Relative Policy Optimization (GRPO) and chain-of-thought (CoT) planning (Guan et al., 26 May 2025).
- Multimodal CoT+RL: CAD-RL integrates multimodal chain-of-thought prompting and RL with reward functions for executability, geometric accuracy (via Intersection-over-Union, IOU), and semantic alignment, employing optimization strategies such as Trust Region Stretch (TRS), Precision Token Loss, and Overlong Filtering (Niu et al., 13 Aug 2025).
The reward functions in these RL pipelines often combine three factors: executability, geometric accuracy (e.g., IOU or Chamfer Distance), and external evaluation via an LLM. The total reward, which aggregates these three terms, serves as the training objective (Niu et al., 13 Aug 2025).
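One way to picture how the three reward factors might be combined is the sketch below. The weighted-sum form, the equal default weights, the zero reward for non-executable scripts, and all names (`RewardWeights`, `total_reward`) are assumptions for illustration; the exact formulation used in the cited work is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; not the values used in any cited pipeline.
    executability: float = 1.0
    geometry: float = 1.0
    judge: float = 1.0


def total_reward(executable: bool, iou: float, judge_score: float,
                 weights: RewardWeights = RewardWeights()) -> float:
    """Weighted sum of executability, geometric, and LLM-judge terms.

    `iou` and `judge_score` are assumed to lie in [0, 1] and are clamped.
    A script that fails to execute yields no geometry, so it earns zero.
    """
    if not executable:
        return 0.0
    iou = min(max(iou, 0.0), 1.0)
    judge_score = min(max(judge_score, 0.0), 1.0)
    return (weights.executability
            + weights.geometry * iou
            + weights.judge * judge_score)


print(total_reward(True, iou=0.5, judge_score=1.0))
print(total_reward(False, iou=0.9, judge_score=0.9))
```

Gating the geometric and judge terms on executability reflects the fact that a script which does not run produces no shape to compare; other designs, such as a small negative penalty for failures, are equally plausible.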
4. Datasets and Benchmarks
Several large-scale paired datasets have been constructed to catalyze and benchmark research on CadQuery-based generative modeling:
| Dataset | Samples | Input Modalities | Contains CadQuery? | Remarks |
|---|---|---|---|---|
| GenCAD-Code | 163,671 | CAD image + code | Yes | 5 images/script, used for VLM training (Doris et al., 20 May 2025) |
| Text2CAD | ≈170,000 | NL prompt, command seq | Via annotation | Source for most CadQuery pairings (Xie et al., 10 May 2025) |
| ExeCAD | 16,540 | NL, design spec, image | Yes | Expert-verified, aligned NL, design, code, mesh (Niu et al., 13 Aug 2025) |
| CAD-Coder | 110,000 | NL, code, mesh | Yes | Used for SFT and RL (CoT subset: 1.5k) (Guan et al., 26 May 2025) |
Dataset construction typically involves automatic or semi-automatic conversion from sketch/command representations to CadQuery code, systematic validation via script execution, mesh rendering, and point-cloud alignment (e.g., Chamfer Distance, IOU). Higher-quality samples receive further expert refinement (Xie et al., 10 May 2025, Niu et al., 13 Aug 2025).
5. Evaluation Metrics and Script Analysis
The predominant metrics for evaluating CadQuery-based CAD code generation are geometric, syntactic, and semantic in nature:
- Validity/Executability: Proportion of generated scripts that compile and run without Python or CadQuery errors (Valid Syntax Rate, VSR) (Doris et al., 20 May 2025).
- 3D Geometric Similarity: Chamfer Distance (CD) and Intersection-over-Union (IOU) are standard measures comparing generated and ground-truth geometry. Chamfer Distance is computed between point clouds sampled from the two meshes, while IOU involves centroid normalization, principal axis alignment, and best-case volume overlap (Doris et al., 20 May 2025, Niu et al., 13 Aug 2025).
- Exact Match Rate: String equality between generated and reference CadQuery code.
- External LLM Judgement: Vision-LLMs may act as a reference for semantic equivalence when geometric metrics are ambiguous (Niu et al., 13 Aug 2025).
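The first two metric families above can be sketched in pure Python as follows. The Chamfer Distance here uses the common symmetric form with mean squared nearest-neighbour distances, and IOU is computed on sets of occupied voxel indices; both are standard textbook definitions rather than the exact variants used in the cited papers, and the alignment steps (centroid normalization, principal axis alignment) are omitted. Function names are illustrative.

```python
from typing import Sequence, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def valid_syntax_rate(flags: Sequence[bool]) -> float:
    """Fraction of generated scripts that executed without error."""
    return sum(flags) / len(flags) if flags else 0.0


def _sq_dist(p: Point, q: Point) -> float:
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(p_cloud: Sequence[Point], q_cloud: Sequence[Point]) -> float:
    """Symmetric Chamfer Distance (mean squared nearest-neighbour distance).

    Brute force, O(|P| * |Q|); both clouds must be non-empty.
    """
    def one_way(src: Sequence[Point], dst: Sequence[Point]) -> float:
        return sum(min(_sq_dist(s, d) for d in dst) for s in src) / len(src)

    return one_way(p_cloud, q_cloud) + one_way(q_cloud, p_cloud)


def voxel_iou(a: Set[Voxel], b: Set[Voxel]) -> float:
    """IOU of two solids given as sets of occupied voxel indices."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(generated, reference))
print(voxel_iou({(0, 0, 0), (1, 0, 0)}, {(1, 0, 0), (2, 0, 0)}))
print(valid_syntax_rate([True, True, False, True]))
```

In practice the point clouds would be sampled from the meshes exported by each script, and a spatial index (for example a k-d tree) would replace the brute-force nearest-neighbour search.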
Sample code analysis often reveals that modern approaches improve not only geometric accuracy but also syntactic robustness (e.g., explicit face selection, correct parameter formats, API coverage) compared to prior baselines. For instance, models trained with CoT and RL generate float literals and robust face selectors, correcting typical errors of SFT-only systems (Niu et al., 13 Aug 2025, Guan et al., 26 May 2025).
6. Representative Usage Patterns and API Mapping
CadQuery’s modeling vocabulary lends itself to direct mapping from common language tasks to Pythonic method calls, supporting the following patterns (Guan et al., 26 May 2025, Xie et al., 10 May 2025):
- Primitive creation: .box(), .sphere(), .circle(), .rect()
- 2D-to-3D operations: .extrude(), .revolve(), .loft(), .sweep()
- Boolean operations: .union(), .cut(), .intersect()
- Edge/face selection and modification: .faces(">Z").workplane(), .edges("|Z").fillet(2), .chamfer(1.5)
- Patterning: .mirror(), .rarray(), .pushPoints()
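Example: producing a 10×5×2 mm box with three equidistant 1 mm holes:

```python
import cadquery as cq

# box() centers the solid on the origin, so x spans -5..5 mm
box = cq.Workplane("XY").box(10, 5, 2)
# three equidistant hole centers along x, in top-face workplane coordinates
hole_pts = [(-2.5, 0), (0.0, 0), (2.5, 0)]
top = box.faces(">Z").workplane()
box = top.pushPoints(hole_pts).hole(1.0)  # 1 mm diameter through-holes
cq.exporters.export(box, "box_with_holes.stl")
```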
Chain-of-thought planning structures can be embedded in prompts and outputs to enhance reasoning and clarify design decomposition (Guan et al., 26 May 2025, Niu et al., 13 Aug 2025).
7. Comparative Performance and Limitations
CadQuery-based systems have demonstrated significant gains in code validity and 3D geometric accuracy over previous approaches:
- VSR up to 100% and IOU=0.675 reported for CAD-Coder (Vicuna-13B) on the GenCAD-Code test set (Doris et al., 20 May 2025).
- Median Chamfer Distance as low as 0.17×10⁻³ and invalidity ratios (IR) under 2% reported for state-of-the-art pipelines, versus IR above 45% for prior approaches (Guan et al., 26 May 2025).
- Larger LLMs consistently produce higher-fidelity CadQuery code, but data scarcity for very large models may lead to underfitting. There remains sensitivity to ambiguous prompts and unsupported/rare CAD operations, such as advanced sweeps or fillets missing in the training corpus (Doris et al., 20 May 2025, Xie et al., 10 May 2025).
8. Outlook and Research Directions
CadQuery’s centrality to modern text-to-CAD and vision-to-CAD frameworks is underpinned by its expressive parametric API, robustness as an executable language, and compatibility with large pretrained models. Current research is exploring multimodal extensions (sketch + text, real-image + text), interactive refinement loops, RL with sophisticated reward shaping, enhanced fine-tuning methodologies for preserving API knowledge, and expansion to alternative Python-based solid modeling APIs (e.g., FreeCAD, PythonOCC) (Xie et al., 10 May 2025, Doris et al., 20 May 2025, Niu et al., 13 Aug 2025). A plausible implication is that future advancements in dataset quality, multimodal alignment, and reward engineering will increase the range and precision of CAD models that neural architectures targeting CadQuery can generate.