Text-to-CadQuery: A New Paradigm for CAD Generation with Scalable Large Model Capabilities (2505.06507v1)

Published 10 May 2025 in cs.AI, cs.CV, and cs.LG

Abstract: Computer-aided design (CAD) is fundamental to modern engineering and manufacturing, but creating CAD models still requires expert knowledge and specialized software. Recent advances in LLMs open up the possibility of generative CAD, where natural language is directly translated into parametric 3D models. However, most existing methods generate task-specific command sequences that pretrained models cannot directly handle. These sequences must be converted into CAD representations such as CAD vectors before a 3D model can be produced, which requires training models from scratch and adds unnecessary complexity. To tackle this issue, we propose generating CadQuery code directly from text, leveraging the strengths of pretrained LLMs to produce 3D models without intermediate representations, using this Python-based scripting language. Since LLMs already excel at Python generation and spatial reasoning, fine-tuning them on Text-to-CadQuery data proves highly effective. Given that these capabilities typically improve with scale, we hypothesize that larger models will perform better after fine-tuning. To enable this, we augment the Text2CAD dataset with 170,000 CadQuery annotations. We fine-tune six open-source LLMs of varying sizes and observe consistent improvements. Our best model achieves a top-1 exact match of 69.3%, up from 58.8%, and reduces Chamfer Distance by 48.6%. Project page: https://github.com/Text-to-CadQuery/Text-to-CadQuery.

Summary

Text-to-CadQuery: A New Paradigm for CAD Generation

This paper proposes generating CAD models by having LLMs translate natural language descriptions directly into CadQuery code, eliminating the intermediate command sequences used by most prior methods. The approach leverages the proficiency of pretrained LLMs in Python generation and spatial reasoning, both of which carry over directly to CAD modeling.

CadQuery is a Python-based scripting language in which parametric 3D models are defined entirely as code, so no dedicated GUI CAD application is needed to author them. Its syntax aligns well with what LLMs already generate fluently, which simplifies the generation pipeline. The paper reports substantial gains in model accuracy and execution reliability, demonstrating the effectiveness of CadQuery as a direct target for LLM outputs.
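
For readers unfamiliar with CadQuery, the minimal sketch below (not taken from the paper; the part, dimensions, and file name are illustrative) shows the kind of compact, executable Python script the fine-tuned models are expected to emit:

```python
# Illustrative CadQuery script: a parametric plate with a centered through-hole.
# The part, dimensions, and output file name are made up for illustration.
import cadquery as cq

length, width, thickness, hole_dia = 40.0, 25.0, 5.0, 6.0

result = (
    cq.Workplane("XY")              # start on the XY plane
    .box(length, width, thickness)  # extrude a rectangular plate
    .faces(">Z")                    # select the top face
    .workplane()                    # create a work plane on it
    .hole(hole_dia)                 # cut a through-hole at its center
)

# Export to a mesh so the result can be rendered or compared geometrically.
cq.exporters.export(result, "plate_with_hole.stl")
```

Because the script is ordinary Python, it can be executed directly to produce a solid, which is what allows generated outputs to be checked against ground-truth geometry.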

Key Contributions

  1. Direct CadQuery Code Generation: The method bypasses intermediate command sequences by having pretrained LLMs emit executable CadQuery code directly. This streamlines the CAD creation pipeline and improves output quality, since the target format matches the Python-generation strengths the models already possess.
  2. Augmented Dataset: A dataset of 170,000 text-CadQuery pairs was curated by extending the prior DeepCAD and Text2CAD datasets, providing the supervision needed to fine-tune models of varying sizes (a sketch of what one such pair might look like follows this list).
  3. Comprehensive Evaluation: Across six pretrained models ranging from 124M to 7B parameters, generation accuracy improved consistently with model size. The gains did not always scale proportionally with parameter count, however, suggesting that the largest models may be underfit given the limited data.
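
As a concrete illustration of what one text-CadQuery training pair might look like when serialized for supervised fine-tuning, consider the sketch below. The field names ("prompt", "completion"), the instruction wording, and the JSONL layout are assumptions for illustration, not the released dataset's actual schema.

```python
# Hypothetical serialization of a single text-CadQuery pair for fine-tuning.
# Field names and formatting are assumptions; the released dataset may differ.
import json

record = {
    "prompt": (
        "Create a rectangular plate 40 mm long, 25 mm wide and 5 mm thick "
        "with a 6 mm diameter through-hole at its center."
    ),
    "completion": (
        "import cadquery as cq\n"
        "result = (cq.Workplane('XY').box(40, 25, 5)\n"
        "          .faces('>Z').workplane().hole(6))"
    ),
}

with open("text2cadquery_train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record) + "\n")
```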

Numerical and Performance Outcomes

The paper reports a top-1 exact match of 69.3%, up from 58.8%, a significant improvement over methods that rely on command-sequence generation. The approach also reduces Chamfer Distance by 48.6%, indicating better geometric fidelity of the produced models, and reports gains in F1 score and Volumetric IoU, further evidence that the generated models preserve the precision of the 3D geometry.
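
The paper's exact evaluation pipeline is not reproduced here, but for concreteness, the symmetric Chamfer Distance between point clouds sampled from the generated and reference meshes is commonly computed as in the sketch below; the sampling, normalization, and averaging conventions are assumptions rather than the paper's precise setup.

```python
# Minimal sketch of symmetric Chamfer Distance between two point clouds.
# The conventions used here (squared distances, mean over points, sum of both
# directions) are common choices and may differ from the paper's definition.
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(points_a: np.ndarray, points_b: np.ndarray) -> float:
    """Mean squared nearest-neighbor distance, summed over both directions."""
    d_ab, _ = cKDTree(points_b).query(points_a)  # each point in A to nearest in B
    d_ba, _ = cKDTree(points_a).query(points_b)  # each point in B to nearest in A
    return float(np.mean(d_ab ** 2) + np.mean(d_ba ** 2))

# Example usage with random stand-in clouds; in practice the points would be
# sampled from the generated and ground-truth meshes.
a = np.random.rand(2000, 3)
b = np.random.rand(2000, 3)
print(chamfer_distance(a, b))
```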

Implications and Future Directions

Using CadQuery as the output format for LLM-driven CAD generation has both practical and theoretical implications. Practically, it simplifies integration with existing design pipelines, improves interoperability, and lowers the barrier for non-experts, since the output is ordinary, readable Python. Theoretically, it provides a clean setting for studying how model scale affects performance on a specialized generation task and adds to the broader discussion of adapting LLMs to domain-specific applications.

Future work could expand the dataset's size and diversity to mitigate the underfitting observed in larger models, and extend CadQuery-based generation to other input modalities such as images and point clouds. As LLMs continue to improve, stronger multimodal understanding could further integrate this kind of symbolic, spatially aware generation into industrial and creative workflows.
