Frontier-Aware BFS Tokenization

Updated 16 December 2025

The paper introduces frontier-aware BFS tokenization, a deterministic mesh-serialization method that preserves local face connectivity during autoregressive mesh generation.
It employs a FIFO frontier structure to maintain a contiguous local context, ensuring that truncated-window models access the complete surface neighborhood.
Empirical results demonstrate significant reductions in mesh artifacts such as holes and fragmentation, thereby enhancing geometric fidelity.

Frontier-aware Breadth-First Search (BFS) tokenization is a deterministic mesh-serialization method designed for autoregressive generation of triangle meshes. Introduced in the MeshRipple framework, this approach addresses the degradation of mesh connectivity and geometric fidelity caused by standard face serialization under memory-limited, truncated-window training. Frontier-aware BFS tokenization enforces a wavefront expansion of the mesh, preserving local adjacency in the tail of the token sequence and ensuring that autoregressive models always have access to the full local surface context necessary for coherent next-face prediction. This technique has been empirically shown to substantially reduce artifacts such as holes and fragmented components in generated meshes (Lin et al., 8 Dec 2025).

1. Mesh Tokenization and the Problem of Local Context

Let a triangle mesh $\mathcal{M} = (\mathcal{V}, \mathcal{F}, \mathcal{A})$ consist of quantized vertices $\mathcal{V} = \{v_i\}_{i=1}^{N_v}$, faces $\mathcal{F} = \{f_j\}_{j=1}^{N_f}$ (with $f_j = [v_{j,0}, v_{j,1}, v_{j,2}]$), and an adjacency relation $\mathcal{A} \subseteq \mathcal{F} \times \mathcal{F}$. The tokenization problem is the mapping $\mathcal{M} \mapsto \mathcal{S} = (s_1, s_2, \ldots, s_L)$, where $\mathcal{S}$ is a sequence of face-coordinate tokens plus control tokens; an autoregressive model then predicts $p(s_t \mid s_{<t})$ for $t = 1, \ldots, L$.
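As a rough illustration of the objects in this definition, the sketch below builds a face-adjacency relation from a list of faces by pairing faces that share an undirected edge. The function name `build_adjacency` and the shared-edge criterion are assumptions made for illustration; the source does not state how MeshRipple constructs $\mathcal{A}$. Faces are indexed from 0 here, whereas the text indexes them from 1.

```python
from collections import defaultdict


def build_adjacency(faces):
    """Return {face_index: sorted list of edge-adjacent face indices}.

    Assumption (not from the source): two faces are adjacent when they
    share an undirected edge.
    """
    # Map every undirected edge to the faces that contain it.
    edge_to_faces = defaultdict(list)
    for j, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces[frozenset((u, v))].append(j)

    # Faces that meet at the same edge are neighbours of each other.
    adjacency = {j: set() for j in range(len(faces))}
    for sharing in edge_to_faces.values():
        for j in sharing:
            adjacency[j].update(k for k in sharing if k != j)
    return {j: sorted(nbrs) for j, nbrs in adjacency.items()}


# Two triangles forming a quad; they share the edge (0, 2).
faces = [(0, 1, 2), (0, 2, 3)]
adjacency = build_adjacency(faces)
```

Storing neighbours in a fixed, sorted order is one simple way to keep any later traversal deterministic.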

Standard serialization strategies (lexico-vertex sorting, depth-first traversals, patch/block encodings) do not ensure that the faces adjacent to $f_t$ fall within the final $W$ tokens of $\mathcal{S}$. Under truncated-window conditioning, where the model sees only the last $W$ tokens at each step, essential local connectivity is frequently lost, and the model generates surfaces with holes and disconnected components. This suggests that serialization order critically affects the model's ability to maintain mesh integrity under resource-constrained training.
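The effect of ordering on a truncated window can be made concrete with a small counting sketch. It assumes, for simplicity, that the window is measured in faces rather than tokens, and it counts faces that have an earlier neighbour in the sequence but none inside the window. The helper name `lost_context_count` and the six-face strip are hypothetical and not taken from the paper.

```python
def lost_context_count(order, adjacency, window):
    """Count faces whose earlier neighbours all fall outside the window.

    Assumption: the window covers the `window` faces immediately before
    the current one (faces, not tokens).
    """
    position = {face: p for p, face in enumerate(order)}
    lost = 0
    for p, face in enumerate(order):
        earlier = [position[n] for n in adjacency[face] if position[n] < p]
        # The closest earlier neighbour sits at max(earlier); it is visible
        # only if that position is at least p - window.
        if earlier and max(earlier) < p - window:
            lost += 1
    return lost


# Hypothetical strip of six triangles: face i touches faces i - 1 and i + 1.
strip = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

along_strip = lost_context_count([0, 1, 2, 3, 4, 5], strip, window=1)
interleaved = lost_context_count([0, 3, 1, 4, 2, 5], strip, window=1)
```

With a window of one face, the order that walks along the strip loses no context, while the interleaved order leaves four faces without any visible neighbour.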

2. Definition and Construction of the Frontier

At every generation step $t$, a frontier set $F_t \subseteq \mathcal{F}$ is maintained:

$F_t = \{\,f\in S_t : \exists\,f'\notin S_t,\,(f,f')\in\mathcal{A}\}$

where $S_t = \{f_1, \ldots, f_t\}$ is the set of faces traversed so far. $F_t$ thus contains the already-visited faces that still have at least one unvisited neighbor; it is maintained as an ordered FIFO queue $\mathcal{B}_t$. For each newly added face $f_t$, a root pointer $r_t \in F_{t-1}$ is recorded, identifying the frontier face from which $f_t$ is grown. This root tracking enables deterministic reconstruction of the BFS expansion and keeps the context aligned during token sequence traversal.
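The set definition of $F_t$ translates almost directly into code. The sketch below recomputes the frontier from scratch for a given visited prefix; it is meant only to illustrate the definition, not the incremental queue maintenance used in practice. The name `frontier` and the dictionary-based adjacency are assumptions, and faces are 0-indexed.

```python
def frontier(visited_order, adjacency):
    """Return visited faces that still have an unvisited neighbour.

    Faces are returned in visiting order, which mimics FIFO queue order.
    """
    visited = set(visited_order)
    return [
        face
        for face in visited_order
        if any(n not in visited for n in adjacency[face])
    ]


# Hypothetical strip of four triangles: face i touches faces i - 1 and i + 1.
strip = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

partial = frontier([0, 1, 2], strip)      # only face 2 borders unvisited face 3
complete = frontier([0, 1, 2, 3], strip)  # nothing left to expand
```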

3. Breadth-First Expansion Algorithm

The BFS tokenization proceeds as follows:

Initialization: Select a seed face $f_1$. Set $S_1 = \{f_1\}$ and $F_1 = \{f_1\}$.
Recurrence: For $t \geq 2$ :

$\begin{aligned} &\text{Choose } f_t \text{ adjacent to } r_{t-1} = \min F_{t-1} \text{ (FIFO order)} \\ &S_t = S_{t-1} \cup \{f_t\} \\ &F_t = \left(F_{t-1} \setminus \{r_{t-1} \mid \nexists\, f' \notin S_t : (r_{t-1}, f') \in \mathcal{A}\}\right) \cup \{f_t\} \end{aligned}$

Each step expands the mesh from the face at the front of $\mathcal{B}$, using a fixed local ordering (e.g., CCW) for half-edges. New faces are enqueued at the back, forming a ripple-shaped expansion.

Pseudocode is provided in Algorithm 1 of Lin et al. (8 Dec 2025), which specifies initialization, queue maintenance, and root pointer registration. Multiple connected components are supported, with each new component marked by a control token.
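The expansion described in this section can be sketched as an ordinary FIFO traversal that records, for every new face, the frontier face it was grown from. This is a minimal sketch under stated assumptions, not a reproduction of Algorithm 1: the function name `bfs_serialize`, the 0-based face indices, the use of `None` as a stand-in for the component control token, and the rule "take the first unvisited neighbour in the stored order" are all illustrative choices, so the root pointers it produces may differ from those in the paper.

```python
from collections import deque


def bfs_serialize(adjacency, seed=0):
    """Return (order, roots) for a FIFO frontier expansion.

    Assumptions (not from the source): `adjacency` maps a face index to its
    neighbours in a fixed local order, and a root of None marks the first
    face of each connected component.
    """
    visited, order, roots = set(), [], []
    starts = [seed] + [f for f in sorted(adjacency) if f != seed]
    for start in starts:
        if start in visited:
            continue
        # Begin a new connected component at `start`.
        visited.add(start)
        order.append(start)
        roots.append(None)
        queue = deque([start])
        while queue:
            root = queue[0]  # oldest face still on the frontier
            nxt = next((n for n in adjacency[root] if n not in visited), None)
            if nxt is None:
                queue.popleft()  # root has no unvisited neighbour left
                continue
            visited.add(nxt)
            order.append(nxt)
            roots.append(root)
            queue.append(nxt)  # new faces join the back of the queue
    return order, roots


# Hypothetical adjacency: a branching component (faces 0-4) plus a
# separate two-face component (faces 5-6).
adjacency = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2], 5: [6], 6: [5]}
order, roots = bfs_serialize(adjacency)
```

In this example both neighbours of the seed are emitted before any face two steps away, which is the ripple-shaped ordering the text describes.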

4. Analytical Example and Complexity

Consider a tetrahedral mesh with vertices $v_0, v_1, v_2, v_3$ and faces:

$\begin{aligned} f_1 &= (v_0, v_1, v_2) \\ f_2 &= (v_0, v_2, v_3) \\ f_3 &= (v_0, v_3, v_1) \\ f_4 &= (v_1, v_3, v_2) \end{aligned}$

The BFS sequence proceeds by enqueuing and expanding via the FIFO frontier. The output token sequence and root pointer array are:

$\mathcal{S} = [\mathtt{BOS}, N, f_1, f_2, f_3, f_4], \quad \{r\} = [\mathtt{nil}, 1, 1, 2].$

Complexity: Each face is enqueued and dequeued once, and each half-edge is visited $O(1)$ times, yielding $O(N_f + N_e) = O(N_f)$ runtime and $O(N_f)$ memory for the frontier queue. Indexing the frontier and roots in a truncated window is $O(W)$.
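The step from $O(N_f + N_e)$ to $O(N_f)$ follows from a standard edge count, spelled out here for the closed-manifold case (an assumption; the source does not restrict the mesh class). Each triangle has three edges and each edge is shared by exactly two triangles, so:

```latex
\begin{aligned}
3 N_f &= 2 N_e \quad\Longrightarrow\quad N_e = \tfrac{3}{2} N_f, \\
O(N_f + N_e) &= O\!\left(\tfrac{5}{2} N_f\right) = O(N_f).
\end{aligned}
```

The tetrahedron above satisfies this with $N_f = 4$ and $N_e = 6$. For meshes with boundary, every edge still belongs to at least one face, so $N_e \le 3 N_f$ and the same linear bound holds.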

5. Topological Alignment and Impact on Mesh Connectivity

Every predicted face $f_t$ is adjacent to the root $r_{t-1} \in F_{t-1}$, and $F_{t-1}$ resides in the tail of $\mathcal{S}$. Thus, provided the window is wide enough to span the active frontier, the last $W$ tokens contain the adjacent surface context required to predict $f_t$. This topology-respecting serialization contrasts sharply with coordinate-sorted or patch/block orders, under which adjacent faces may end up arbitrarily far apart in the token sequence.
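One way to check this alignment on a given serialization is to measure how far back each face's root lies in the sequence. The sketch below assumes a face order and root list are already available and reports the largest such lag in faces; how a lag in faces maps to a window in tokens depends on the token encoding, which is not specified here. The name `max_root_lag` and the sample data are hypothetical.

```python
def max_root_lag(order, roots):
    """Largest distance, in faces, between a face and the root it grew from.

    `roots[i]` is the root of `order[i]`, or None for a component's seed.
    """
    position = {face: p for p, face in enumerate(order)}
    lags = [
        position[face] - position[root]
        for face, root in zip(order, roots)
        if root is not None
    ]
    return max(lags, default=0)


# Hypothetical FIFO-ordered serialization of a five-face component.
order = [0, 1, 2, 3, 4]
roots = [None, 0, 0, 1, 2]
lag = max_root_lag(order, roots)
```

A window that covers at least `lag` faces keeps every root visible when its child face is predicted.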

Empirical observations reported by Lin et al. (8 Dec 2025) indicate that frontier-aware BFS tokenization substantially reduces mesh holes and fragmentation. The alignment ensures that truncated-window training and inference consistently operate with the true local topology, mitigating failure cases endemic to non-topology-aware serialization.

6. Empirical Results and Ablation Findings

Ablation studies in MeshRipple demonstrate the critical impact of the frontier-aware mask and BFS tokenization. Key findings include:

Removing the frontier mask increases Chamfer Distance by 2.12 and Hausdorff Distance by 0.0141.
The full model, employing frontier BFS tokenization, achieves optimal connectivity metrics (Normal Consistency, reduced broken surfaces).
Window size ablation (Table 8) shows modest improvement going from $W = 1\,\text{k}$ (CD $= 0.05195$, HD $= 0.10913$, NC $= 0.7966$) to $W = 2\,\text{k}$ (CD $= 0.05065$, HD $= 0.10454$, NC $= 0.7981$), which is consistent with most of the critical context being concentrated in the BFS tail.

Window Size	Chamfer Distance	Hausdorff Distance	Normal Consistency
1k	0.05195	0.10913	0.7966
2k	0.05065	0.10454	0.7981
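To put the size of this improvement in proportion, the short calculation below converts the table entries into relative changes. Only the six table values come from the source; the helper name `relative_change` is illustrative.

```python
# Values copied from the window-size ablation table above: (W = 1k, W = 2k).
ablation = {
    "CD": (0.05195, 0.05065),
    "HD": (0.10913, 0.10454),
    "NC": (0.7966, 0.7981),
}


def relative_change(before, after):
    """Signed fractional change when going from `before` to `after`."""
    return (after - before) / before


changes = {name: relative_change(*pair) for name, pair in ablation.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.2%}")
```

Doubling the window lowers Chamfer Distance by about 2.5% and Hausdorff Distance by about 4.2%, and raises Normal Consistency by about 0.19%.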

A plausible implication is that frontier-aware BFS tokenization effectively localizes all predictive dependencies to a compact context, supporting scalable autoregressive modeling even under aggressive sliding-window truncation.

7. Summary and Broader Implications

Frontier-aware BFS tokenization, as formalized in MeshRipple, defines a deterministic, topology-aligned sequence of mesh faces, maintaining a FIFO frontier of boundary faces and recording root pointers to trace the mesh expansion. This approach guarantees that truncated autoregressive models always condition on the true local geometric neighborhood, leading to substantially enhanced mesh connectivity and surface fidelity. Its adoption addresses a central obstacle in generative mesh modeling under memory constraints and establishes a methodical basis for context-preserving serialization in 3D surface generation tasks (Lin et al., 8 Dec 2025).

References (1)

Lin et al. (2025). MeshRipple: Structured Autoregressive Generation of Artist-Meshes.
