CADTalk: An Algorithm and Benchmark for Semantic Commenting of CAD Programs (2311.16703v3)
Abstract: CAD programs are a popular way to compactly encode shapes as a sequence of operations that are easy to parametrically modify. However, without sufficient semantic comments and structure, such programs can be challenging to understand, let alone modify. We introduce the problem of semantically commenting CAD programs, wherein the goal is to segment the input program into code blocks corresponding to semantically meaningful shape parts and to assign a semantic label to each block. We solve the problem by combining program parsing with visual-semantic analysis afforded by recent advances in foundational language and vision models. Specifically, we execute the input programs to create shapes and use these shapes to condition the generation of photorealistic images, to which semantic annotators for such images can then be applied. We then distill the information across the images and link it back to the original programs to semantically comment them. Additionally, we collected and annotated a benchmark dataset, CADTalk, consisting of 5,288 machine-made programs and 45 human-made programs with ground-truth semantic comments. We extensively evaluated our approach, compared it to a GPT-based baseline and an open-set shape segmentation baseline, and report an accuracy of 83.24% on the new CADTalk dataset. Code and data: https://enigma-li.github.io/CADTalk/.
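The last step of the pipeline described in the abstract, distilling per-image annotations and linking them back to code blocks, can be illustrated with a small sketch. It is not the authors' implementation: it assumes that each rendered view has already produced one predicted label per visible code block, that views are combined by simple majority vote, and that the target language accepts C-style `//` comments. The names `vote_block_labels` and `comment_program`, the block ids, and the example labels are all hypothetical.

```python
from collections import Counter


def vote_block_labels(view_predictions):
    """Aggregate per-view predictions into one label per code block.

    view_predictions: list of dicts, one per rendered view, mapping a
    block id to the label predicted for it in that view. A block that is
    not visible in a view is simply absent from that view's dict.
    Majority voting is an assumption made for this sketch.
    """
    votes = {}
    for prediction in view_predictions:
        for block_id, label in prediction.items():
            votes.setdefault(block_id, Counter())[label] += 1
    # Keep the label with the highest vote count for each block.
    return {
        block_id: counter.most_common(1)[0][0]
        for block_id, counter in votes.items()
    }


def comment_program(blocks, labels, unknown="unlabeled"):
    """Prefix each code block with a '// <label>' comment line.

    blocks: list of (block_id, code) pairs in program order.
    labels: dict mapping block id to its voted label.
    """
    lines = []
    for block_id, code in blocks:
        lines.append(f"// {labels.get(block_id, unknown)}")
        lines.append(code)
    return "\n".join(lines)


# Hypothetical example: two code blocks seen from three views.
blocks = [
    ("b0", "cube([10, 10, 1]);"),
    ("b1", "translate([0, 0, -5]) cylinder(h=5, r=0.5);"),
]
views = [
    {"b0": "seat", "b1": "leg"},
    {"b0": "seat", "b1": "back"},
    {"b1": "leg"},
]
labels = vote_block_labels(views)
print(comment_program(blocks, labels))
```

Voting across views is one plausible way to make the final comment robust to a single mislabeled image; the method itself may weight or filter views differently.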