
BrepARG: Transformer-Based B-Rep Generation

Updated 30 January 2026
  • BrepARG is a transformer-based autoregressive framework that encodes complete B-Rep geometry and topology into a unified token sequence.
  • It uses a holistic tokenization scheme combining geometry, position, and face-index tokens to preserve topological relationships during generation.
  • Empirical results show BrepARG outperforms prior methods with higher validity and efficiency on standard CAD datasets.

BrepARG is a transformer-based autoregressive B-Rep generation framework that encodes the complete geometry and topology of boundary representations (B-Reps) into holistic token sequences, enabling direct, sequence-based B-Rep synthesis and state-of-the-art unconditional generation performance. By unifying faces, edges, and their relationships into a discrete token stream, BrepARG overcomes key limitations of prior graph-based generative pipelines and demonstrates the feasibility of large-scale, topologically valid, and efficient sequence-based B-Rep modeling (Li et al., 23 Jan 2026).

1. Problem Formulation and Autoregressive Objective

BrepARG addresses the generative modeling of B-Reps—the de facto CAD representation—by learning the joint distribution over CAD models expressed as discrete token sequences. A B-Rep is represented as

$\mathcal{S} = (t_1, t_2, \ldots, t_T), \quad t_i \in V$

where $V$ is the unified vocabulary of all possible tokens. BrepARG models the joint likelihood using the standard autoregressive factorization:

$P_\theta(\mathcal{S}) = \prod_{i=1}^{T} P_\theta(t_i \mid t_{<i})$

Training minimizes the negative log-likelihood (full next-token cross-entropy) over a dataset of such sequences. This autoregressive formulation supports sampling, interpolation, and, potentially, completion in the tokenized B-Rep domain (Li et al., 23 Jan 2026).
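As a concrete illustration of the factorized objective (a minimal sketch, not the paper's implementation), the sequence negative log-likelihood can be computed directly from the per-step conditional probabilities a model's softmax would produce:

```python
import math

def sequence_nll(cond_probs):
    """Negative log-likelihood of a token sequence under the
    autoregressive factorization P(S) = prod_i P(t_i | t_<i).

    cond_probs: list of P(t_i | t_<i), one probability per position.
    """
    return -sum(math.log(p) for p in cond_probs)

# Toy example: a 4-token sequence whose per-step conditionals
# are all 0.5 -> NLL = -log(0.5^4) = 4 * log 2.
nll = sequence_nll([0.5, 0.5, 0.5, 0.5])
assert abs(nll - 4 * math.log(2)) < 1e-12
```

Minimizing this quantity over a dataset (averaged over positions) is exactly the teacher-forced cross-entropy training described later.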

2. Holistic Tokenization Scheme

Accurate and invertible tokenization is fundamental in BrepARG; it adopts a joint scheme covering geometry, position, and topology:

  • Geometry Tokens: Each face is sampled as a grid $\mathbf{F} \in \mathbb{R}^{32 \times 32 \times 3}$ and each edge as $\mathbf{E} \in \mathbb{R}^{32 \times 3}$ (broadcast to $32 \times 32 \times 3$). Both are encoded with a VQ-VAE: after aggressive downsampling, each primitive is mapped to 4 quantized codebook indices $\{g_1, g_2, g_3, g_4\}$ from a learned geometric vocabulary of size $N_\mathrm{geo}$.
  • Position Tokens: Each primitive’s axis-aligned bounding box $(x_\mathrm{min}, y_\mathrm{min}, z_\mathrm{min}, x_\mathrm{max}, y_\mathrm{max}, z_\mathrm{max}) \in [-1,1]^6$ is quantized into 6 discrete tokens using $L = 2048$ bins per scalar.
  • Face-Index Tokens: Each face is assigned a unique token $t^\mathrm{idx}_i \in \{0, 1, \ldots, n_\mathrm{max}-1\}$ ($n_\mathrm{max} \approx 50$), and each edge block records the indices of its two adjacent faces via two face-index tokens, encoding topological incidence directly within the sequence.

These token types are unified and organized using fixed integer offsets within the global vocabulary to avoid collisions (Li et al., 23 Jan 2026).
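A minimal sketch of this unification follows. The offsets and codebook size below are hypothetical placeholders (the source states only $L = 2048$ bins and the use of fixed integer offsets); the bounding-box quantizer, however, follows the scheme as described:

```python
# Hypothetical vocabulary layout: position bins first, then geometry
# codebook indices, then face-index tokens. Only L_BINS = 2048 is
# stated in the source; N_GEO and the ordering are assumptions.
L_BINS = 2048                     # bins per bounding-box scalar
POS_OFFSET = 0                    # position tokens occupy [0, L_BINS)
GEO_OFFSET = L_BINS               # geometry codebook indices follow
N_GEO = 8192                      # assumed geometric codebook size
IDX_OFFSET = GEO_OFFSET + N_GEO   # face-index tokens last

def quantize_bbox(bbox):
    """Map 6 bounding-box scalars in [-1, 1] to 6 discrete tokens."""
    tokens = []
    for x in bbox:
        b = int((x + 1.0) / 2.0 * (L_BINS - 1))  # bin in [0, L_BINS - 1]
        tokens.append(POS_OFFSET + b)
    return tokens

def face_index_token(i):
    """Offset a face index into its reserved vocabulary range."""
    return IDX_OFFSET + i

toks = quantize_bbox([-1.0, -0.5, 0.0, 0.25, 0.5, 1.0])
assert len(toks) == 6 and toks[0] == 0 and toks[-1] == L_BINS - 1
```

Because each token type lives in a disjoint integer range, a single softmax over the global vocabulary can emit geometry, position, and topology tokens without collisions.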

3. Hierarchical Sequence Construction and Topological Serialization

BrepARG constructs token sequences in a manner that both respects topological adjacency and enhances autoregressive modeling:

  • Geometry Blocks: A face block encodes $[\text{6 position tokens},\ \text{4 geometry tokens},\ \text{1 face-index}]$; an edge block encodes $[\text{2 face-indices},\ \text{6 position tokens},\ \text{4 geometry tokens}]$.
  • Face Ordering: Faces are serialized by first selecting the highest-degree face, then applying a DFS that favors topological locality, so that adjacent faces are placed near each other in the sequence.
  • Edge Ordering: Edges are sorted by the maximum adjacent face-index (MAX-IDX-A), promoting topological coherence.
  • Final Sequence Assembly: The global sequence is $[\text{START},\ S_f,\ \text{SEP},\ S_e,\ \text{END}]$, where $S_f$ and $S_e$ are the face and edge block sequences, and the special tokens [START], [SEP], and [END] explicitly delineate regions (Li et al., 23 Jan 2026).
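The block layout and final assembly above can be sketched as follows (a minimal illustration of the stated layout; the concrete token values are placeholders):

```python
# Sketch of face/edge block layout and final sequence assembly.
# Special-token representations here are placeholders.
START, SEP, END = "[START]", "[SEP]", "[END]"

def face_block(pos_tokens, geo_tokens, face_idx):
    """[6 position tokens, 4 geometry tokens, 1 face-index]"""
    assert len(pos_tokens) == 6 and len(geo_tokens) == 4
    return pos_tokens + geo_tokens + [face_idx]

def edge_block(face_a, face_b, pos_tokens, geo_tokens):
    """[2 face-indices, 6 position tokens, 4 geometry tokens]"""
    assert len(pos_tokens) == 6 and len(geo_tokens) == 4
    return [face_a, face_b] + pos_tokens + geo_tokens

def assemble(face_blocks, edge_blocks):
    """Global sequence [START, S_f, SEP, S_e, END]."""
    seq = [START]
    for fb in face_blocks:
        seq += fb
    seq.append(SEP)
    for eb in edge_blocks:
        seq += eb
    seq.append(END)
    return seq

f0 = face_block(list(range(6)), [10, 11, 12, 13], 0)
e0 = edge_block(0, 1, list(range(6)), [20, 21, 22, 23])
seq = assemble([f0], [e0])
assert seq[0] == START and seq[-1] == END and seq.count(SEP) == 1
```

Each face block thus contributes 11 tokens and each edge block 12, so sequence length grows linearly in the number of primitives.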

4. Transformer-Only Autoregressive Architecture

The generative model is a decoder-only transformer with 8 transformer blocks, each with 8 attention heads, a model dimension of 256, and a feed-forward dimension of 1024. Each input token $t_i$ receives a learned token embedding plus a separate positional embedding. Causal masking enforces that the prediction at position $i$ (i.e., of token $t_{i+1}$) depends only on $t_{\leq i}$. Each block computes

$\mathbf{H}^{(\ell)} = \mathrm{FFN}(\mathrm{MHA}(\mathbf{H}^{(\ell-1)}))$

with causally masked attention

$\mathrm{MHA}(\mathbf{H})_i = \sum_{j \leq i} \alpha_{ij}\, (W_V \mathbf{H}_j)$

At each position, output logits are projected over the entire vocabulary, and next-token probabilities are obtained via softmax. Complete token sequences are generated with nucleus sampling (top-$p$, with $p = 0.9$ on DeepCAD and $p = 0.8$ on ABC), terminating at [END] or the maximum sequence length (Li et al., 23 Jan 2026).
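Nucleus (top-$p$) sampling, as used here for decoding, can be sketched in a few lines (a generic pure-Python illustration, not the paper's code):

```python
import random

def nucleus_sample(probs, p=0.9, rng=random):
    """Top-p (nucleus) sampling: keep the smallest set of highest-
    probability tokens whose cumulative mass reaches p, renormalize,
    and sample from that truncated distribution.

    probs: dict mapping token -> probability (sums to ~1).
    """
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, cum = [], 0.0
    for tok, pr in ranked:
        nucleus.append((tok, pr))
        cum += pr
        if cum >= p:
            break
    total = sum(pr for _, pr in nucleus)
    r, acc = rng.random() * total, 0.0
    for tok, pr in nucleus:
        acc += pr
        if r <= acc:
            return tok
    return nucleus[-1][0]

# With p = 0.5 the nucleus of this distribution is just token "a",
# so sampling is deterministic.
assert nucleus_sample({"a": 0.6, "b": 0.3, "c": 0.1}, p=0.5) == "a"
```

Truncating the tail in this way trades a little diversity for fewer low-probability (and often topologically invalid) token choices, which is why the ablations below find the choice of $p$ important.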

5. Model Training and Data

BrepARG is trained on clean CAD datasets filtered to models with consistent topology (<50 faces, <30 edges per face), notably DeepCAD (80,509 models), ABC (105,798 models), and a smaller furniture subset (1,065 models, with class-conditional labels):

  • VQ-VAE Training: AdamW ($1 \times 10^{-4}$ learning rate, $1 \times 10^{-6}$ weight decay), large batch sizes (2,048–8,192), and 12 hours of GPU time for codebook learning.
  • Transformer Training: Teacher-forced cross-entropy, batch size 128, 500 epochs, AdamW ($1 \times 10^{-3}$ learning rate, $1 \times 10^{-2}$ weight decay), 17 hours on 4 GPUs.
  • Both geometry and position codebooks are pretrained; to avoid codebook collapse, online reinitialization and probabilistic nearest-neighbor assignment are used. During autoregressive training, the full cross-entropy is minimized over all positions (Li et al., 23 Jan 2026).
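The core of VQ-VAE codebook assignment is a nearest-neighbor lookup, sketched below in pure Python (the paper's probabilistic assignment and online reinitialization are omitted; this shows only the basic deterministic variant):

```python
# Minimal nearest-neighbor vector quantization: each continuous
# embedding is replaced by the index of its closest codebook entry.
def nearest_code(vec, codebook):
    """Return the index of the codebook entry closest to vec (L2)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sqdist(vec, codebook[k]))

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
assert nearest_code([0.9, 0.1], codebook) == 1
assert nearest_code([0.1, 0.8], codebook) == 2
```

Entries that are rarely (or never) selected cause codebook collapse, which motivates the reinitialization and probabilistic-assignment measures mentioned above.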

6. Performance Evaluation and Empirical Results

BrepARG’s evaluation follows established B-Rep generative modeling metrics:

Method            COV↑     MMD↓   JSD↓   Novel↑   Unique↑   Valid↑
DeepCAD           70.81%   1.31   1.79   93.80%   89.79%    58.10%
BrepGen           72.38%   1.13   1.29   99.72%   99.18%    68.23%
DTGBrepGen        74.52%   1.07   1.02   99.79%   98.94%    79.80%
BrepARG (Ours)    75.45%   0.89   1.02   99.82%   99.80%    87.60%

These results (DeepCAD benchmark, unconditional generation) demonstrate that BrepARG outperforms previous methods in distributional alignment (highest Coverage, lowest MMD and JSD) and achieves substantial improvements in validity. On ABC, BrepARG increases the unconditional validity rate to 67.54%, compared to 57.59% for DTGBrepGen. BrepARG also increases inference speed: 1.5 s/model vs. 3.6 s/model for DTGBrepGen (Li et al., 23 Jan 2026).
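Of the distributional metrics above, JSD is the simplest to state concretely; a minimal discrete sketch (generic definition, not the paper's exact evaluation pipeline, which compares generated and reference shape distributions):

```python
import math

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions
    (lists of probabilities over the same support), in nats."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Identical distributions -> 0; disjoint distributions -> log 2.
assert jsd([0.5, 0.5], [0.5, 0.5]) == 0.0
assert abs(jsd([1.0, 0.0], [0.0, 1.0]) - math.log(2)) < 1e-12
```

Lower JSD indicates that the histogram of generated shapes more closely matches that of the training distribution.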

Ablation studies verify that nucleus sampling with $p = 0.9$ (DeepCAD) and topology-aware ordering of faces and edges are critical to the best trade-off between validity and diversity.

7. Relation to Prior and Contemporary Work

Prior graph-based B-Rep representations (e.g., BrepGen, DTGBrepGen) separated geometric and topological features, requiring multi-stage and often cascade-style decoding pipelines, precluding direct application of sequence-based generative frameworks. BrepARG’s principal novelty is its holistic token sequence representation and its proof that both geometric and topological aspects can be captured, learned, and generated within a single stream. This paradigm is independently validated in subsequent works, such as AutoBrep (Xu et al., 2 Dec 2025), which further unifies geometry and topology with breadth-first topological ordering, and in particle-based approaches that operate on unordered sets (Lu et al., 25 Jan 2026).

8. Significance, Limitations, and Future Directions

BrepARG establishes, for the first time, that autoregressive token-based transformers can synthesize B-Reps with high fidelity, topological correctness, and efficiency. This opens a new direction for CAD generative modeling. A plausible implication is that similar holistic tokenizations may generalize to other cell-complex or combinatorial geometry domains.

Potential limitations include the need for careful data filtering to ensure topological consistency, bounded sequence lengths due to fixed vocabulary ranges, and challenges in expansion to extremely complex or non-manifold B-Reps without further extensions to the encoding scheme. Future enhancements may involve multi-modal conditioning, integration of edit operations, and extension to open or hybrid B-Rep boundary structures (Li et al., 23 Jan 2026).
