
Brep2Text: CAD-to-Language Dataset

Updated 26 December 2025
  • Brep2Text is a comprehensive dataset that bridges B-rep CAD models and natural language, providing 269,444 QA pairs for detailed geometric and semantic queries.
  • It employs a canonical graph-based representation with adaptive sampling to robustly encode faces, edges, and topology for advanced 3D reasoning.
  • The dataset underpins contrastive pretraining and progressive LLM fine-tuning, boosting performance in 3D captioning, query response, and industrial design applications.

Brep2Text is a large-scale dataset and annotation protocol designed to bridge boundary representation (B-rep) CAD models and natural language, enabling native 3D geometry–text understanding for neural architectures. Brep2Text provides 269,444 B-rep–text question–answer pairs and underpins recent advances in multimodal representation learning, including the BrepLLM framework, which aligns B-rep graph embeddings with text embeddings via contrastive pretraining and progressive LLM fine-tuning (Deng et al., 18 Dec 2025). Brep2Text standardizes input formats, sampling, semantic coverage, and answer accuracy for 3D CAD-to-language tasks.

1. Motivation and Scope

The Brep2Text dataset addresses the long-standing modality gap between structured 3D CAD data (B-reps) and token-based natural language models. Traditional LLMs are intrinsically limited in parsing and reasoning about the topological and parametric attributes embedded in B-rep formats, which are crucial for industrial design, semantic retrieval, and intelligent manufacturing (Dai et al., 10 Apr 2025). Brep2Text enables supervised and contrastive training on 3D geometry–language pairs, facilitating direct query, captioning, instruction following, and reasoning tasks for native B-rep CAD models.

Brep2Text consists of diverse question–answer pairs targeting face, edge, and global topological properties, spatial relationships, manufacturing features, material attributes, and semantic instructions, curated with high reliability and industrial relevance (Deng et al., 18 Dec 2025).

2. Dataset Construction and Input Standardization

Brep2Text utilizes a canonical graph-based representation of B-rep geometry. Each CAD object is converted into a graph whose nodes represent faces and whose edges encode adjacency (shared boundary curves). Face nodes store UV-sampled geometric tensors (positions, normals, curvatures, visibility, surface type, normalized area), while graph edges carry curve samples (positions, tangents, curve-type category, normalized length).
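The graph encoding above can be sketched as follows. This is a minimal illustration, not the authors' code: the `faces`/`edges` dict schema, the callable `surface`/`curve` samplers, and the flat feature layout are all hypothetical, and curvature/visibility channels are omitted for brevity.

```python
import numpy as np

def build_brep_graph(faces, edges, n_uv=16, m_curve=16):
    """Toy sketch of a face-adjacency graph with sampled feature tensors.

    `faces`: dicts with `surface(u, v) -> (xyz, normal)` plus scalar attributes.
    `edges`: dicts with `curve(t) -> (xyz, tangent)` and the two face indices
    they bound. All field names are illustrative.
    """
    node_feats, edge_index, edge_feats = [], [], []
    for f in faces:
        uv = np.linspace(0.0, 1.0, n_uv)
        samples = np.array([f["surface"](u, v) for u in uv for v in uv])
        pos, nrm = samples[:, 0], samples[:, 1]        # UV-sampled positions/normals
        node_feats.append(np.concatenate(
            [pos.ravel(), nrm.ravel(), [f["type_id"], f["norm_area"]]]))
    for e in edges:
        t = np.linspace(0.0, 1.0, m_curve)
        samples = np.array([e["curve"](ti) for ti in t])
        pos, tan = samples[:, 0], samples[:, 1]        # curve positions/tangents
        edge_index.append((e["face_a"], e["face_b"]))  # adjacency via shared curve
        edge_feats.append(np.concatenate(
            [pos.ravel(), tan.ravel(), [e["type_id"], e["norm_length"]]]))
    return np.array(node_feats), np.array(edge_index), np.array(edge_feats)
```

A real pipeline would extract these samplers from a CAD kernel; here they are stand-ins that make the node/edge tensor layout concrete.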

Sampling densities (faces: $N_S \in [16, 32]$; edges: $M_C \in [16, 32]$) are adaptively determined from parametric surface area $A_S$ and curve length $\ell_C$:

$$N_S = N_{\min}^{\mathrm{face}} + \frac{A_S - A_{\min}}{A_{\max} - A_{\min}} \left(N_{\max}^{\mathrm{face}} - N_{\min}^{\mathrm{face}}\right)$$

$$M_C = M_{\min}^{\mathrm{edge}} + \frac{\ell_C - \ell_{\min}}{\ell_{\max} - \ell_{\min}} \left(M_{\max}^{\mathrm{edge}} - M_{\min}^{\mathrm{edge}}\right)$$

This encoding provides robust coverage of geometric diversity and topological complexity across B-rep models, allowing accurate and information-rich question–answer pair generation.
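The adaptive density rule is a single linear interpolation, sketched below under the stated ranges (the function name and the clamping of out-of-range measures are our additions; the source does not specify rounding behavior):

```python
def adaptive_density(value, v_min, v_max, n_min=16, n_max=32):
    """Interpolate a sample count from a geometric measure (area or length),
    following the face/edge formulas above. Illustrative, not the authors' code."""
    if v_max <= v_min:                      # degenerate range: minimum density
        return n_min
    frac = (value - v_min) / (v_max - v_min)
    frac = min(max(frac, 0.0), 1.0)         # clamp measures outside [v_min, v_max]
    return round(n_min + frac * (n_max - n_min))
```

For example, a face whose area sits halfway through the observed range receives `adaptive_density(5.0, 0.0, 10.0) == 24` UV samples per direction.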

3. Annotation Protocol and Semantic Coverage

Brep2Text annotation leverages an adaptive protocol for generating high-coverage QA pairs. Question types span:

  • Geometric queries: precise surface area, curvature, orientation, visibility, and trimmed curve specifics
  • Topology: adjacency graph traversal (face–face, edge–face, loop enumeration)
  • Semantic feature recognition: manufacturing feature labeling, annotation of face types (plate, extrusion, hole, fillet, etc.)
  • Parametric and material attributes
  • Global descriptors: bounding box, mass properties, logical relationships
  • Natural language instructions: text-based edits, reasoning, and interpretive tasks for downstream LLM training

The dataset is programmatically generated from B-rep graph structures, ensuring annotation completeness and correctness without external human supervision. All pairs are curated to maintain explicit links between graph entities and their descriptive or instructive language (Deng et al., 18 Dec 2025).
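Programmatic generation from the graph structure can be illustrated with a toy topology template; the schema (`graph` as a face-id → neighbor-set mapping) and the question phrasing are hypothetical, not the dataset's actual templates:

```python
def adjacency_questions(graph):
    """Generate topology QA pairs from a face-adjacency mapping.

    `graph`: dict mapping face id -> set of neighbouring face ids.
    Answers are derived directly from the structure, so they are
    correct by construction, as with Brep2Text's programmatic pipeline.
    """
    for fid, nbrs in sorted(graph.items()):
        question = f"Which faces are adjacent to face {fid}?"
        answer = ", ".join(f"face {n}" for n in sorted(nbrs)) or "none"
        yield question, answer
```

Because each answer is read off the graph rather than written by an annotator, completeness and correctness follow from the representation itself, which is the key property the protocol relies on.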

4. Alignment with Foundation Models and Training Pipelines

Brep2Text is designed to directly support cross-modal alignment pipelines. Procedures include:

  • Contrastive pretraining: global graph embeddings from hierarchical BrepEncoders are aligned with frozen CLIP text embeddings via symmetric InfoNCE loss,

$$\mathcal{L}_{\mathrm{CLIP}} = -\frac{1}{2N}\sum_{i=1}^{N}\left[\log P_{ii} + \log Q_{ii}\right]$$

where $P_{ii}$ and $Q_{ii}$ are the diagonal (matched-pair) entries of the row-normalized softmax distributions over shape→text and text→shape similarities, respectively.

  • Multi-stage fine-tuning: node tokens from the graph encoder are mapped to LLM token space (e.g., BLIP-2 Q-Former), enabling direct ingestion and reasoning across the graph sequence (Deng et al., 18 Dec 2025).
  • Mixture-of-Query Experts (MQE): a model enhancement for handling geometric diversity across shapes.
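The symmetric InfoNCE objective above can be sketched numerically as follows; this is a minimal numpy version (temperature value and normalization details are illustrative, and a real pipeline would compute it on learned embeddings with autograd):

```python
import numpy as np

def symmetric_infonce(shape_emb, text_emb, tau=0.07):
    """Symmetric InfoNCE over matched shape/text embedding rows (sketch).

    Row i of each matrix forms a positive pair; every other row in the
    batch serves as a negative. Returns the averaged two-direction loss.
    """
    s = shape_emb / np.linalg.norm(shape_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = s @ t.T / tau                        # (N, N) cosine similarities

    def log_softmax_diag(m):                      # row-wise log-softmax, diagonal
        m = m - m.max(axis=1, keepdims=True)
        return np.diag(m - np.log(np.exp(m).sum(axis=1, keepdims=True)))

    n = logits.shape[0]
    # shape->text uses rows of `logits`; text->shape uses rows of its transpose
    return -(log_softmax_diag(logits).sum() + log_softmax_diag(logits.T).sum()) / (2 * n)
```

With perfectly aligned embeddings the loss approaches zero; permuting one side's rows (breaking the pairing) drives it up, which is what pulls the two modalities into a shared space during pretraining.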

The Brep2Text dataset ensures all geometric/topological information is encoded in formats compatible with these alignment and fine-tuning strategies, maximizing model performance on downstream 3D captioning, reasoning, and query response.
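The node-token-to-LLM mapping can be made concrete with a shape-level sketch. All dimensions and the plain linear projection are hypothetical stand-ins (BrepLLM uses a Q-Former-style module, and the projection would be learned, not random):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dims: graph node embeddings (d_graph) are projected into the
# LLM's token embedding space (d_llm) and prepended to the text tokens,
# so the LLM attends over geometry and language in one sequence.
d_graph, d_llm, n_nodes, n_text = 64, 128, 10, 5
W = rng.normal(scale=d_graph ** -0.5, size=(d_graph, d_llm))   # learned in practice

node_tokens = rng.normal(size=(n_nodes, d_graph)) @ W          # (10, 128)
text_tokens = rng.normal(size=(n_text, d_llm))                 # (5, 128)
llm_input = np.concatenate([node_tokens, text_tokens], axis=0)  # (15, 128)
```

The point of the sketch is the sequence layout: once projected, the B-rep graph's per-face tokens are ordinary positions in the LLM input, so no architectural change to the LLM itself is required.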

5. Empirical Performance and Ablation Analyses

Brep2Text, when used for pretraining and evaluation, supports state-of-the-art results in 3D object classification and captioning. Ablation studies confirm:

  • Adaptive UV sampling increases classification accuracy by 2–3% in early feature extraction stages.
  • Hierarchical feature extraction (combining face, edge, and topology branches) consistently increases downstream model accuracy by approximately 2–3% (Deng et al., 18 Dec 2025).
  • Fine-grained geometric queries and structurally-aware language instruction tasks are better handled due to Brep2Text’s explicit annotation of adjacency and parametric relationships.

Experiments demonstrate superior CAD-to-language alignment, outperforming prior token-based protocols in semantic understanding and robust geometric reasoning.

6. Applications and Impact

Brep2Text is instrumental for next-generation neural architectures supporting:

  • Direct multimodal querying of CAD objects (“describe face 12 in this part”)
  • Feature-level and part-level captioning and summarization
  • Instruction following for CAD edits via language
  • Semantic retrieval in large industrial design and manufacturing databases
  • Validation of cross-modal representation learning (Brep–text contrastive learning, LLM fine-tuning)

Its standardized QA benchmarks and comprehensive annotation make Brep2Text foundational for research and industrial applications requiring tight integration of native CAD geometry and natural language understanding (Deng et al., 18 Dec 2025).

7. Limitations and Extension Directions

Brep2Text is limited by the coverage of existing CAD datasets and by its assumption of reliable parametric sampling. Future work may address:

  • Inclusion of more granular manufacturing process semantics
  • Extension to dynamic or history-based B-rep edits
  • Augmentation with multi-turn dialog, repair, and simulation tasks
  • Cross-domain transfer learning (medical, architectural, organic freeform CAD)

These would further enhance the representational power and applicability of Brep2Text, consolidating its role as the bridge between symbolic engineering design and neural language modeling.

