Brep2Text: CAD B-rep to Text Dataset
- Brep2Text is a large instruction-tuning dataset that pairs native 3D B-rep models with dual-level question–answer pairs for both semantic and procedural reasoning.
- The dataset systematically binds each CAD model with abstract and beginner-level queries, facilitating detailed classification and step-by-step construction explanations.
- Brep2Text underpins multi-stage fine-tuning and cross-modal alignment, advancing generative 3D captioning and object classification in CAD systems.
Brep2Text is a large-scale, instruction-tuning dataset designed to facilitate the interpretation, reasoning, and language-based description of native 3D Boundary Representation (B-rep) models in computer-aided design (CAD). Developed as the core supervision corpus for BrepLLM—a framework for enabling LLMs to natively read and reason about B-rep data—Brep2Text uniquely aligns topologically rich CAD solids with natural language through a question–answer paradigm, supporting evaluation and multi-stage fine-tuning of models on both semantic and procedural reasoning tasks (Deng et al., 18 Dec 2025).
1. Origin and Objectives
Brep2Text was constructed to address the absence of large-scale benchmarks linking native B-rep representations with natural-language queries and responses. Its central aim is to equip LLMs with the capacity to process and interpret parametrically defined CAD data, advancing the state of instruction-tuned 3D geometry understanding. The dataset leverages 134,722 parameterized CAD solids from the Text2CAD corpus, with each solid processed to yield question–answer (QA) items at two reasoning levels: high-level semantic categorization (“abstract”) and low-level procedural construction (“beginner”). Both target the need for models to handle a diverse range of CAD query tasks, from classification to detailed build-step explanations.
2. Dataset Construction and Statistics
Brep2Text systematically pairs each unique B-rep model with two QA exemplars, using the following methodology:
- Source: 134,722 unique parameterized B-rep CAD models (from Text2CAD).
- QA Pairing:
- Each B-rep yields one abstract-level and one beginner-level QA pair.
- The answer to each question is taken verbatim from the original, human-annotated Text2CAD description, grounding the language in real-world designer annotation rather than artificial templates.
- Question prompts are synthesized automatically by the Qwen-Max model in zero-shot mode, conditioned on the human-provided answer.
Tabular summary:
| Split | No. of Brep Models | No. of QA Pairs | Question Levels |
|---|---|---|---|
| Training | 134,522 | 269,044 | Abstract + Beginner |
| Test | 200 | 400 | Abstract + Beginner |
| Total | 134,722 | 269,444 | Abstract + Beginner |
No separate validation split is predefined; practitioners may assemble one using a subset of training data as needed.
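The pairing step above can be sketched as a short helper. The function name and record schema are illustrative assumptions, not the authors' code; only the pairing logic (one abstract and one beginner QA pair per model, answer copied verbatim from the Text2CAD annotation) follows the dataset description.

```python
# Hypothetical sketch of the Brep2Text pairing step: each B-rep model is
# bound to one abstract-level and one beginner-level QA pair, with the
# answer taken verbatim from the human-written Text2CAD description.
def make_qa_records(model_id, text2cad_answer, question_by_level):
    """question_by_level maps 'abstract'/'beginner' to a synthesized question."""
    return [
        {
            "model_id": model_id,
            "level": level,
            "question": question_by_level[level],
            "answer": text2cad_answer,  # verbatim human annotation
        }
        for level in ("abstract", "beginner")
    ]

records = make_qa_records(
    "Brep_000123",
    "This is a rectangular bracket used to support an overhanging beam.",
    {
        "abstract": "What type of CAD part is this? Describe its function.",
        "beginner": "How was this CAD model constructed? Please describe in detail.",
    },
)
```

Because both levels share the same verbatim answer, every model contributes exactly two QA pairs, which yields the 2:1 pair-to-model ratio seen in the split table.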
3. Data Representation and Structure
Each entry in Brep2Text is a triplet that binds geometric data and natural language:
- A. Brep Model Tokenization:
- Internally, each B-rep is a topology-aware graph over sampled faces and edges.
- The BrepEncoder within BrepLLM transforms this into:
- A global feature vector, used for contrastive pre-training.
- A sequence of face node tokens h_1, …, h_n, representing geometric detail for downstream fine-tuning in the LLM.
- B. Question:
- Natural language; each question is one of two types, abstract or beginner.
- C. Answer:
- Human-written Text2CAD annotation text.
Example instance (pseudo-JSON):
```json
{
  "model_id": "Brep_000123",
  "brep_tokens": [[h_1], [h_2], ..., [h_n]],
  "question": "What type of CAD part is this? Describe its function.",
  "answer": "This is a rectangular bracket used to support an overhanging beam. It features two triangular gussets on the front face and a central slot for mounting."
}
```
A strict 1:1 mapping exists between the two question levels for each model, guaranteeing that the total number of QA pairs is exactly twice the number of models (2 × 134,722 = 269,444).
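This structural invariant is easy to verify programmatically. The checker below is a sketch under an assumed record schema (dicts with `model_id` and `level` keys), not part of the released dataset tooling:

```python
from collections import Counter

def check_dual_level_mapping(records):
    """Verify each model contributes exactly one QA pair per question level,
    so the total pair count equals twice the number of unique models."""
    per_model_level = Counter((r["model_id"], r["level"]) for r in records)
    assert all(c == 1 for c in per_model_level.values()), "duplicate (model, level)"
    models = {r["model_id"] for r in records}
    assert len(records) == 2 * len(models)
    return len(models), len(records)

# Tiny example with one model and its two QA pairs (schema is assumed)
sample = [
    {"model_id": "Brep_000123", "level": "abstract"},
    {"model_id": "Brep_000123", "level": "beginner"},
]
n_models, n_pairs = check_dual_level_mapping(sample)
```

Run over the full corpus, the same check would return (134,722 models, 269,444 pairs), matching the published split totals.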
4. Annotation Policy and Generation Pipeline
Brep2Text forgoes hand-written question templates in favor of automated, conditioned synthesis using Qwen-Max in zero-shot mode. This methodology is designed to maximize linguistic diversity and semantic alignment to real CAD usage, as questions are solicited based on the original human answer. Two levels are defined:
- Abstract: Queries about part function, use case, or classification.
- Beginner: Queries about primary modeling steps or procedural genesis.
No explicit inter-annotator agreement or error statistics are reported for question synthesis. The paper emphasizes that, by relying on the original human Text2CAD answers and asking Qwen-Max to condition questions on these, an indirect guarantee of relevance and semantic fidelity is achieved. Human evaluation metrics on downstream model outputs, such as precision and correctness, indirectly validate the underlying dataset quality.
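The conditioned synthesis step can be sketched as a prompt template. The exact wording below is an illustrative assumption, not the paper's actual prompt; only the structure (zero-shot, conditioned on the human answer, with a per-level focus) follows the described pipeline:

```python
def build_question_prompt(answer_text, level):
    """Build a zero-shot prompt asking an LLM (Qwen-Max in the paper) to
    invent a question whose answer is the given human-written description.
    The template wording here is an assumption for illustration."""
    focus = {
        "abstract": "the part's function, use case, or category",
        "beginner": "the primary modeling steps used to build the part",
    }[level]
    return (
        f"Given the following CAD part description, write one natural "
        f"question about {focus} that this description would answer.\n\n"
        f"Description: {answer_text}\nQuestion:"
    )
```

Conditioning on the human answer is what gives the indirect relevance guarantee the paper cites: a question generated this way is answerable by construction.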
5. Benchmark Usage and Evaluation Metrics
Brep2Text is the central corpus for BrepLLM's cross-modal alignment and multi-stage instruction-tuning, providing supervision for mapping from encoded Brep tokens and question prompts to annotated answers. Its dual-query structure enables evaluation of both semantic understanding (“What is this?”) and procedural reasoning (“How was this model built?”), and supports two downstream tasks:
- 3D Object Captioning (low-level procedural/captioning):
- Prompt: “How was this CAD model constructed? Please describe in detail.”
- Metrics: Qwen-Max automatic score, embedding similarity (Sentence-BERT, SimCSE), human-rated correctness, hallucination, and precision.
- On the held-out 200-model test set, SOTA results reported are:
- Qwen-Max: 58.89 (↑2.31 over next best)
- Sentence-BERT: 73.05
- SimCSE: 74.46
- Human Precision: 81.85%
- Generative 3D Object Classification (high-level semantic/categorical):
- Prompts: “What is this?” (I), “This is an object of …” (C)
- Metric: Qwen-Max category accuracy.
- Reported results: I = 57.40%, C = 56.70%, Average = 57.05% (↑2.15% vs. prior best).
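The embedding-similarity metrics above score a generated answer against the reference by cosine similarity of sentence embeddings. A minimal sketch with placeholder vectors standing in for Sentence-BERT / SimCSE encodings (the actual encoders are not reproduced here):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder embeddings stand in for Sentence-BERT / SimCSE encodings
# of a generated answer and the reference answer.
gen_emb = np.array([0.20, 0.80, 0.10])
ref_emb = np.array([0.25, 0.75, 0.05])
score = cosine_similarity(gen_emb, ref_emb)  # near 1.0 for similar answers
```

The reported Sentence-BERT (73.05) and SimCSE (74.46) figures are such similarities averaged over the 200-model test set and rescaled to a 0–100 range.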
Brep2Text thereby serves both as a pre-training corpus for cross-modal B-rep–text contrastive alignment and as the principal instruction-supervised dataset for generative CAD part understanding.
6. Significance and Application Scope
Brep2Text is the first dataset to furnish a large-scale, instruction-level pairing between native B-rep CAD models and natural language, filling a research gap for fully automated, graph-structured 3D-part language reasoning tasks. It enables benchmarking and model development for:
- Instruction-tuned 3D geometric reasoning.
- Natural-language 3D part description and generative classification.
- Cross-modal contrastive learning and multi-stage fine-tuning bridging geometry, vision, and language.
The dataset’s grounding in real designer annotation and its dual-level taxonomy ensure broad coverage of both semantic and procedural regimes, supporting research across generative design, CAD automation, and explainable geometry modeling. Furthermore, Brep2Text indirectly sets a new baseline for SOTA results in CAD captioning and category generation from native B-rep input (Deng et al., 18 Dec 2025).