BrepARG: Transformer-Based B-Rep Generation
- BrepARG is a transformer-based autoregressive framework that encodes complete B-Rep geometry and topology into a unified token sequence.
- It uses a holistic tokenization scheme combining geometry, position, and face-index tokens to preserve topological relationships during generation.
- Empirical results show BrepARG outperforms prior methods with higher validity and efficiency on standard CAD datasets.
BrepARG is a transformer-based autoregressive B-Rep generation framework that encodes the complete geometry and topology of boundary representations (B-Reps) into holistic token sequences, enabling direct, sequence-based B-Rep synthesis and state-of-the-art unconditional generation performance. By unifying faces, edges, and their relationships into a discrete token stream, BrepARG overcomes key limitations of prior graph-based generative pipelines and demonstrates the feasibility of large-scale, topologically valid, and efficient sequence-based B-Rep modeling (Li et al., 23 Jan 2026).
1. Problem Formulation and Autoregressive Objective
BrepARG addresses the generative modeling of B-Reps—the de facto CAD representation—by learning the joint distribution over CAD models expressed as discrete token sequences. Let a B-Rep be represented as a token sequence $S = (s_1, s_2, \ldots, s_N)$ with $s_i \in \mathcal{V}$,
where $\mathcal{V}$ is the unified vocabulary of all possible tokens. BrepARG models the joint likelihood using the standard autoregressive factorization $p(S) = \prod_{i=1}^{N} p(s_i \mid s_{<i})$. Training minimizes the negative log-likelihood (full next-token cross-entropy) $\mathcal{L} = -\sum_{i=1}^{N} \log p(s_i \mid s_{<i})$ over a dataset of such sequences. This autoregressive formulation enables sampling, interpolation, and possibly completion in the tokenized B-Rep domain (Li et al., 23 Jan 2026).
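The objective can be illustrated in a few lines of Python (a minimal sketch: the logits here are placeholders standing in for a causal model's per-step outputs; only the loss computation mirrors the factorization above):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sequence_nll(step_logits, tokens):
    """Full next-token cross-entropy: -sum_i log p(s_i | s_<i).

    step_logits[i] holds the model's logits for position i, computed
    (causally) from tokens[:i]; tokens[i] is the ground-truth token s_i.
    """
    probs = softmax(step_logits)                     # shape (N, |V|)
    return -np.log(probs[np.arange(len(tokens)), tokens]).sum()

# Sanity check: uniform logits over a vocabulary of size 4 give NLL = N log 4.
V, N = 4, 3
nll = sequence_nll(np.zeros((N, V)), np.array([0, 1, 2]))
```

Summing the per-position cross-entropies over all positions is exactly the "full next-token cross-entropy" training signal described above.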
2. Holistic Tokenization Scheme
Accurate and invertible tokenization is fundamental to BrepARG; it adopts a joint scheme covering geometry, position, and topology:
- Geometry Tokens: Each face is sampled as a regular 2-D point grid and each edge as a 1-D point sequence (broadcast to the same grid shape). Both are encoded by a VQ-VAE: after aggressive downsampling, each primitive is mapped to 4 quantized codebook indices from a learned geometric vocabulary.
- Position Tokens: Each primitive’s axis-aligned bounding box is quantized into 6 discrete tokens, one per scalar, using a fixed number of uniform bins per scalar.
- Face-Index Tokens: Each face is assigned a unique index token, and each edge block records the indices of its two adjacent faces via two face-index tokens, encoding topological incidence directly within the sequence.
These token types are unified and organized using fixed integer offsets within the global vocabulary to avoid collisions (Li et al., 23 Jan 2026).
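A minimal sketch of such an offset-based vocabulary layout (all sizes and names here are illustrative placeholders, not the paper's values):

```python
# Hypothetical layout: fixed integer offsets keep the sub-vocabularies
# (special, geometry, position, face-index) disjoint in one global vocabulary.
GEOM_SIZE, POS_BINS, MAX_FACES = 1024, 64, 50      # illustrative sizes only
SPECIAL = {"START": 0, "SEP": 1, "END": 2}

GEOM_OFF = len(SPECIAL)                 # geometry codebook indices start here
POS_OFF = GEOM_OFF + GEOM_SIZE          # bounding-box bin indices start here
FACE_OFF = POS_OFF + POS_BINS           # face-index tokens start here
VOCAB_SIZE = FACE_OFF + MAX_FACES

def geom_token(code):                   # 0 <= code < GEOM_SIZE
    return GEOM_OFF + code

def pos_token(b):                       # 0 <= b < POS_BINS
    return POS_OFF + b

def face_token(i):                      # 0 <= i < MAX_FACES
    return FACE_OFF + i
```

Because every token type lives in its own contiguous range, a generated integer can be decoded unambiguously back to its type and value, which is what makes the tokenization invertible.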
3. Hierarchical Sequence Construction and Topological Serialization
BrepARG constructs token sequences in a manner that both respects topological adjacency and enhances autoregressive modeling:
- Geometry Blocks: A face block concatenates the face’s geometry, position, and face-index tokens; an edge block concatenates the edge’s geometry and position tokens with the face-index tokens of its two adjacent faces.
- Face Ordering: Faces are serialized by first selecting the highest-degree face, then applying a depth-first search (DFS) that favors topological locality, so that adjacent faces are placed near each other in the sequence.
- Edge Ordering: Edges are sorted by the maximum index of their adjacent faces (MAX-IDX-A), promoting topological coherence.
- Final Sequence Assembly: The global sequence is $S = [\text{START}]\, S_F\, [\text{SEP}]\, S_E\, [\text{END}]$, where $S_F$ and $S_E$ are the face and edge block sequences; the special tokens [START], [SEP], and [END] explicitly delineate the regions (Li et al., 23 Jan 2026).
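The ordering rules above can be sketched as follows (a schematic reconstruction under the stated rules; the paper's exact tie-breaking may differ):

```python
def order_faces(adj):
    """DFS face serialization: start from the highest-degree face, then
    traverse depth-first so adjacent faces land near each other.
    adj: {face_id: set of neighbor face_ids} induced by shared edges."""
    start = max(adj, key=lambda f: len(adj[f]))
    order, seen, stack = [], set(), [start]
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        order.append(f)
        # Push unvisited neighbors; higher-degree ones last, so popped first.
        stack.extend(sorted(adj[f] - seen, key=lambda g: len(adj[g])))
    return order

def order_edges(edges, rank):
    """Sort edges by the maximum serialized index of their adjacent faces
    (the MAX-IDX-A criterion). edges: list of (face_a, face_b) pairs."""
    return sorted(edges, key=lambda e: max(rank[e[0]], rank[e[1]]))

# Toy shell: face 1 touches everything; faces 0 and 2 also touch each other.
adj = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
faces = order_faces(adj)
rank = {f: i for i, f in enumerate(faces)}
edges = order_edges([(0, 1), (1, 2), (0, 2), (1, 3)], rank)
```

In the toy example the low-degree face 3 is serialized last, so the edge (1, 3) is sorted to the end of the edge sequence, keeping early edges among early faces.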
4. Transformer-Only Autoregressive Architecture
The generative model is a decoder-only transformer with 8 transformer blocks, each with 8 attention heads, a model dimension of 256, and a feed-forward dimension of 1024. Each input token is embedded via a learned token embedding plus a separate positional embedding, $h^{(0)}_i = e(s_i) + p_i$; the stack then applies $h^{(\ell)} = \mathrm{Block}_\ell(h^{(\ell-1)})$ for $\ell = 1, \dots, 8$. Causal masking enforces that the prediction at position $i$ depends only on the tokens $s_{<i}$.
At each position, the output logits are projected over the entire vocabulary, and next-token probabilities are obtained via softmax. The complete token sequence is generated via nucleus (top-$p$) sampling with a dataset-specific threshold ($p = 0.8$ on ABC), terminating at [END] or at the maximum sequence length (Li et al., 23 Jan 2026).
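A minimal nucleus-sampling routine of the kind described (standard top-p sampling, not the paper's code):

```python
import numpy as np

def nucleus_sample(logits, p=0.8, rng=None):
    """Top-p (nucleus) sampling: keep the smallest set of tokens whose
    cumulative probability reaches p, renormalize, and sample from it."""
    rng = rng or np.random.default_rng()
    z = logits - logits.max()                     # stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    order = np.argsort(probs)[::-1]               # tokens by descending prob
    cum = np.cumsum(probs[order])
    k = np.searchsorted(cum, p) + 1               # size of the nucleus
    keep = order[:k]
    kept = probs[keep] / probs[keep].sum()        # renormalize inside nucleus
    return int(rng.choice(keep, p=kept))

# With a sharply peaked distribution, the nucleus collapses to one token.
tok = nucleus_sample(np.array([10.0, 0.0, 0.0, 0.0]), p=0.8)
```

Smaller p prunes the low-probability tail harder, favoring validity over diversity, which is the trade-off the ablations in Section 6 examine.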
5. Model Training and Data
BrepARG is trained on clean CAD datasets filtered for topological consistency (at most 50 faces per model and 30 edges per face), notably DeepCAD (80,509 models), ABC (105,798 models), and a smaller furniture subset (1,065 models, with class-conditional labels):
- VQ-VAE Training: Performed with AdamW under a decayed learning rate, large batch sizes (2,048–8,192), and about 12 hours of GPU time for codebook learning.
- Transformer Training: Teacher-forcing cross-entropy, batch size 128, 500 epochs, AdamW with a decayed learning rate, 17 hours on 4 GPUs.
- Both geometry and position codebooks are pretrained (to avoid codebook collapse, online reinitialization and probabilistic nearest-neighbor assignment are used). During autoregressive training, full cross-entropy is minimized over all positions (Li et al., 23 Jan 2026).
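The quantization step and the dead-code reinitialization guarding against codebook collapse can be sketched as follows (an illustrative reconstruction; the paper's probabilistic nearest-neighbor variant is not reproduced here):

```python
import numpy as np

def vq_assign(x, codebook):
    """Nearest-neighbor codebook assignment (the quantization step of a
    VQ-VAE): each row of x maps to the index of its closest code."""
    d = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)   # (N, K)
    return d.argmin(axis=1)

def reinit_dead_codes(codebook, codes, x, rng):
    """Online reinitialization: codes that received no assignments in the
    batch are reset to random encoder outputs, countering collapse."""
    used = np.bincount(codes, minlength=len(codebook)) > 0
    dead = np.where(~used)[0]
    if len(dead):
        codebook[dead] = x[rng.integers(0, len(x), size=len(dead))]
    return codebook

# Toy batch: code 2 is never selected ("dead") and gets reinitialized.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
x = np.array([[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]])
codes = vq_assign(x, codebook)
codebook = reinit_dead_codes(codebook, codes, x, np.random.default_rng(0))
```

Without such resets, unused codes stay frozen far from the data and the effective vocabulary shrinks, which is precisely the collapse the pretraining procedure avoids.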
6. Performance Evaluation and Empirical Results
BrepARG’s evaluation follows established B-Rep generative modeling metrics:
| Method | COV↑ | MMD↓ | JSD↓ | Novel↑ | Unique↑ | Valid↑ |
|---|---|---|---|---|---|---|
| DeepCAD | 70.81% | 1.31 | 1.79 | 93.80% | 89.79% | 58.10% |
| BrepGen | 72.38% | 1.13 | 1.29 | 99.72% | 99.18% | 68.23% |
| DTGBrepGen | 74.52% | 1.07 | 1.02 | 99.79% | 98.94% | 79.80% |
| BrepARG (Ours) | 75.45% | 0.89 | 1.02 | 99.82% | 99.80% | 87.60% |
These results (DeepCAD benchmark, unconditional generation) show that BrepARG outperforms previous methods in distributional alignment (highest Coverage, lowest MMD and JSD) and achieves substantial improvements in validity. On ABC, BrepARG raises the unconditional validity rate to 67.54%, compared to 57.59% for DTGBrepGen. BrepARG is also faster at inference: 1.5 s per model versus 3.6 s for DTGBrepGen (Li et al., 23 Jan 2026).
Ablation studies verify that the nucleus-sampling threshold (tuned per dataset, e.g. on DeepCAD) and the topology-aware ordering of faces and edges are critical to the trade-off between validity and diversity.
7. Relation to Prior and Contemporary Work
Prior graph-based B-Rep representations (e.g., BrepGen, DTGBrepGen) separated geometric and topological features, requiring multi-stage and often cascade-style decoding pipelines, precluding direct application of sequence-based generative frameworks. BrepARG’s principal novelty is its holistic token sequence representation and its proof that both geometric and topological aspects can be captured, learned, and generated within a single stream. This paradigm is independently validated in subsequent works, such as AutoBrep (Xu et al., 2 Dec 2025), which further unifies geometry and topology with breadth-first topological ordering, and in particle-based approaches that operate on unordered sets (Lu et al., 25 Jan 2026).
8. Significance, Limitations, and Future Directions
BrepARG establishes, for the first time, that autoregressive token-based transformers can synthesize B-Reps with high fidelity, topological correctness, and efficiency. This opens a new direction for CAD generative modeling. A plausible implication is that similar holistic tokenizations may generalize to other cell-complex or combinatorial geometry domains.
Potential limitations include the need for careful data filtering to ensure topological consistency, bounded sequence lengths due to fixed vocabulary ranges, and challenges in expansion to extremely complex or non-manifold B-Reps without further extensions to the encoding scheme. Future enhancements may involve multi-modal conditioning, integration of edit operations, and extension to open or hybrid B-Rep boundary structures (Li et al., 23 Jan 2026).