MeshRipple: Autoregressive Mesh Generation
- MeshRipple is a structured autoregressive framework that generates 3D triangle meshes through a frontier expansion process analogous to ripple propagation.
- It employs expansive prediction with candidate masking and a dedicated transformer head to enforce connectivity and surface integrity.
- It integrates a sparse-attention dual-memory mechanism to capture local and global dependencies, achieving state-of-the-art fidelity and geometric performance.
MeshRipple refers to a structured autoregressive framework for 3D triangle mesh generation that expands surfaces from an active frontier in a manner analogous to a ripple propagating over a surface. The approach integrates a frontier-aware breadth-first search (BFS) tokenization, an expansive prediction mechanism responsible for coherent connectivity, and a sparse-attention dual-memory architecture to capture both local and global mesh dependencies. MeshRipple achieves state-of-the-art fidelity and topological completeness in both dense and artistically stylized mesh synthesis, outperforming recent autoregressive models across standard geometric benchmarks (Lin et al., 8 Dec 2025).
1. Formal Specification and Tokenization
Meshes are represented as $\mathcal{M} = (V, F)$, where $V$ is a set of quantized vertex coordinates and $F$ is a sequence of oriented faces (typically triangles with counterclockwise ordering). MeshRipple serializes faces into an ordered sequence $S$ via a half-edge-based BFS. The procedure begins with a seed face $f_0$, maintaining a frontier queue of faces adjacent to the active boundary. For each dequeued face $f$, its three outgoing half-edges (in fixed CCW order) are examined; any unvisited adjacent face is marked, enqueued, and appended to the output serialization $S$.
For each newly serialized face $f_t$, a root index $r_t$ is stored, indicating the frontier face from which $f_t$ was reached. The set of active frontier faces at step $t$ consists of those in $S_{\le t}$ with unvisited neighbors. This tokenization guarantees that the neighborhood required for predicting $f_{t+1}$ lies within the active frontier, which typically resides near the tail of $S$. Thus, truncated windows of the last $W$ tokens generally suffice for local context. Each face token is uniquely determined by its root index $r_t$ plus its vertex triple (Lin et al., 8 Dec 2025).
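The serialization step can be made concrete with a short sketch. The following Python is a minimal reconstruction of the frontier-aware BFS under simplifying assumptions: adjacency is recovered from shared undirected edges rather than a full half-edge structure, and all names (`bfs_serialize`, `edge_to_faces`) are illustrative rather than the paper's implementation.

```python
from collections import deque

def bfs_serialize(faces):
    """Serialize triangle faces in frontier-aware BFS order.

    `faces` is a list of vertex-index triples in CCW order. Adjacency is
    recovered from shared undirected edges, a simplification of the
    half-edge structure described in the paper. Returns the serialized
    face order S and, for each face, the index of the frontier face it
    was reached from (its root).
    """
    # Map each undirected edge to the faces incident on it.
    edge_to_faces = {}
    for fi, (a, b, c) in enumerate(faces):
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces.setdefault(frozenset((u, v)), []).append(fi)

    def neighbors(fi):
        a, b, c = faces[fi]
        for u, v in ((a, b), (b, c), (c, a)):  # fixed CCW edge order
            for nb in edge_to_faces[frozenset((u, v))]:
                if nb != fi:
                    yield nb

    order, roots = [0], [0]      # seed face f0 is its own root
    visited = {0}
    frontier = deque([0])
    while frontier:
        f = frontier.popleft()
        for nb in neighbors(f):
            if nb not in visited:
                visited.add(nb)
                frontier.append(nb)
                order.append(nb)
                roots.append(f)  # frontier face from which nb was reached
    return order, roots
```

On a tetrahedron, `bfs_serialize([(0, 1, 2), (0, 3, 1), (1, 3, 2), (0, 2, 3)])` serializes all faces in a single connected sweep from the seed, with every face rooted at an already-serialized neighbor.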
2. Expansive Prediction and Connectivity Enforcement
MeshRipple incorporates a joint prediction strategy to enforce surface connectedness and mesh integrity. At each generation step, the model predicts:
- The triangle $f_{t+1}$ to attach to the current root face $f_{r_t}$,
- The root offset $\Delta_t$, defining the position of the next root pointer within $S$.
Mathematically, let $r_t$ and $r_{t+1}$ denote the current and next root indices, so that $r_{t+1} = r_t + \Delta_t$. The model factorizes the conditional as
$$p(f_{t+1}, \Delta_t \mid S_{\le t}) = p(f_{t+1} \mid S_{\le t})\, p(\Delta_t \mid S_{\le t}),$$
where the root-offset term is predicted by a dedicated head attached to a mid-level transformer layer. During inference, the set of candidate triangle tokens is masked to include only those sharing a half-edge with the current root $f_{r_t}$, strictly enforcing surface attachment and eliminating fragmented or disconnected spans. Training minimizes a combined cross-entropy over both outputs, $\mathcal{L} = \mathcal{L}_{\text{face}} + \mathcal{L}_{\text{offset}}$ (Lin et al., 8 Dec 2025).
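At inference time, the candidate mask can be realized by clamping logits outside the admissible set, as in the minimal PyTorch sketch below; `mask_to_candidates` is a hypothetical helper that assumes face tokens index an enumerable vocabulary and that the admissible set (faces sharing a half-edge with the current root) has been computed separately.

```python
import torch

def mask_to_candidates(logits, candidate_ids):
    """Restrict next-face prediction to attachable triangles.

    logits:        (vocab_size,) raw scores from the face head
    candidate_ids: LongTensor of admissible face-token ids, i.e. faces
                   sharing a half-edge with the current root face
    """
    masked = torch.full_like(logits, float("-inf"))
    masked[candidate_ids] = logits[candidate_ids]
    # Probability mass falls only on attachable faces, so any sampled
    # token necessarily extends the connected surface.
    return torch.softmax(masked, dim=-1)
```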
3. Sparse Global Memory via NSCA
Standard sliding-window architectures impose hard limits on receptive field size, impeding the capture of long-range geometric dependencies and symmetry. MeshRipple addresses this by introducing Non-overlapping Sparse Compressed Attention (NSCA), which hierarchically compresses all but the most recent tokens:
- The serialized sequence is partitioned into non-overlapping blocks of size $B$.
- Each block is summarized by a learned MLP producing compressed key-value pairs.
- At each decoding step:
- Block-level scoring identifies the top-$k$ most relevant blocks via query-key dot-products,
- For these blocks, the original token sequences are retrieved and attended at finer granularity,
- Local attention over the immediate window proceeds concurrently.
Causal masking ensures strictly autoregressive access; tokens and blocks beyond the current generation index are unobservable. This architecture provides an effectively unbounded receptive field while keeping per-token time/memory complexity sub-quadratic, on the order of $O(W + kB + T/B)$ for sequence length $T$, local window $W$, block size $B$, and $k$ retrieved blocks. Suitable choices of $B$ and $k$ prevent the out-of-memory failures encountered in dense attention (Lin et al., 8 Dec 2025).
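A single decoding step of this retrieval pattern can be sketched as follows. This is a simplified, single-head reconstruction under stated assumptions: `summarize` stands in for the learned compression MLP (a per-block mean works as a placeholder), and the defaults for `B`, `topk`, and `W` are illustrative rather than the paper's settings.

```python
import torch
import torch.nn.functional as F

def nsca_attend(q, K, V, summarize, B=64, topk=4, W=256):
    """One decoding step of a simplified NSCA layer.

    q:    (d,) query for the current token
    K, V: (T, d) keys/values of previously generated tokens only,
          so causal masking is implicit
    """
    T, d = K.shape
    local_K, local_V = K[-W:], V[-W:]    # recent tokens: attended directly
    past_K, past_V = K[:-W], V[:-W]      # older tokens: compressed into blocks

    picked_K, picked_V = [], []
    n_blocks = past_K.shape[0] // B
    if n_blocks > 0:
        # 1) Summarize each non-overlapping block of keys into one compressed key.
        blk_K = summarize(past_K[: n_blocks * B].view(n_blocks, B, d))
        # 2) Score blocks against the query; keep the top-k.
        top = torch.topk(blk_K @ q, min(topk, n_blocks)).indices
        # 3) Retrieve the raw tokens of selected blocks for fine-grained attention.
        for i in top.tolist():
            picked_K.append(past_K[i * B : (i + 1) * B])
            picked_V.append(past_V[i * B : (i + 1) * B])

    att_K = torch.cat(picked_K + [local_K], dim=0)
    att_V = torch.cat(picked_V + [local_V], dim=0)
    w = F.softmax(att_K @ q / d ** 0.5, dim=0)
    return w @ att_V                     # (d,) context vector
```

Per step this touches $W$ local tokens, $T/B$ block summaries, and $kB$ retrieved tokens, matching the sub-quadratic cost above; calling it with `summarize=lambda blk: blk.mean(dim=1)` gives a runnable stand-in for the learned MLP.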
4. Generation Loop and Inference Dynamics
The core MeshRipple autoregressive loop involves:
- Initializing with a start token and the frontier with the seed face,
- At each step:
- Preparing the AR input window (last $W$ faces),
- Computing hidden states using Transformer blocks with custom attention masks,
- Predicting the next face $f_{t+1}$ with candidate masking for admissible attachments,
- Predicting the root offset $\Delta_t$,
- Updating sequence, frontier, and root pointer via $r_{t+1} = r_t + \Delta_t$,
- Iterating until a stopping criterion is met.
Out-of-window context is accessible through NSCA, while the explicit frontier and attachment constraint guarantee the mesh surface grows as a single expanding component (Lin et al., 8 Dec 2025).
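The loop itself is compact; the sketch below abstracts the transformer forward pass, candidate masking, and NSCA behind a single hypothetical `step` callable, so only the frontier-expansion control flow is shown.

```python
from typing import Callable, List, Optional, Tuple

def generate_mesh(
    step: Callable[[List[int], int], Tuple[Optional[int], int]],
    seed_face: int,
    max_steps: int = 10_000,
    window: int = 256,
) -> List[int]:
    """Sketch of the MeshRipple decoding loop.

    `step` stands in for the model: given the last `window` face tokens
    and the current root index, it returns (next_face, root_offset),
    with next_face=None signalling the stop token.
    """
    S = [seed_face]
    r = 0                      # index of the current root face in S
    for _ in range(max_steps):
        f_next, delta = step(S[-window:], r)
        if f_next is None:     # stopping criterion met
            break
        S.append(f_next)       # attach the new triangle to root S[r]
        r += delta             # advance the root pointer along the frontier
    return S
```

Because `step` is assumed to propose only faces attachable at `S[r]`, the surface grows as a single connected component, mirroring the guarantee stated above.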
5. Empirical Performance and Ablations
MeshRipple has been evaluated against leading mesh autoregressive generators on both dense-mesh and artist-mesh datasets. Key geometric metrics include Chamfer distance (CD), Hausdorff distance (HD), and Normal Consistency (NC):
| Model | Chamfer (×10⁻³) ↓ | Hausdorff ↓ | Normal Consistency ↑ |
|---|---|---|---|
| MeshAnythingV2 | 109.64 | 0.2314 | –0.0096 |
| BPT | 60.19 | 0.1089 | 0.6066 |
| DeepMesh | 50.27 | 0.0893 | 0.6025 |
| MeshRipple | 48.73 | 0.1057 | 0.6280 |
On artist-mesh benchmarks, MeshRipple also outperforms FastMesh and BPT, achieving the lowest CD and HD and the highest NC (Lin et al., 8 Dec 2025). Qualitative assessments indicate that MeshRipple avoids the large holes and disconnected fragments observed in fixed-order AR baselines.
Ablation studies demonstrate:
- Removing the frontier mask increases CD by +2.12 and HD by +0.0141,
- Removing context injection increases CD by +7.89, HD by +0.0118,
- Removing the root constraint sharply increases CD (+12.41) and HD (+0.0122),
- Omitting NSCA only slightly affects CD (–0.24) but induces significant resource overhead.
6. Limitations and Prospects for Extension
MeshRipple can be sensitive to label noise and to meshes with extreme non-manifold topologies; in such regimes prediction coherence may degrade. The NSCA architecture in principle enables scaling beyond 100k faces, contingent on further increases in input quantization granularity and block sizing. Adaptation to quadrilateral (quad) and hexahedral mesh structures is a natural extension. The methodology is also compatible with reinforcement-learning fine-tuning (DPO/RLHF) for mesh stylization, as demonstrated in related works such as DeepMesh and MeshRFT.
A plausible implication is that the explicit frontier-tracking mechanism, when coupled with global memory retrieval, may serve as a generalizable approach for structured autoregressive generation in other domains with complex topological dependencies (Lin et al., 8 Dec 2025).
7. Comparison with Dynamic Mesh Approaches in Physics
The term “MeshRipple” has also been used in the context of numerically simulating ripple morphodynamics due to colloidal deposition in flowing fluids (Hewett et al., 2018). In Hewett & Sellier’s work, the arbitrary Lagrangian–Eulerian (ALE) approach is used to couple fluid flows with evolving boundary geometries by tracking mesh nodes and enforcing mesh quality via smoothing and shuffle algorithms. The similarity to the generative MeshRipple approach is nominal; in physics, “ripple” refers to evolving physical undulations on a material interface, modeled via dynamic mesh deformation and interface velocity determined by particle flux. In contrast, in generative modeling, “MeshRipple” refers to a deterministic surface traversal and completion process inspired by ripple propagation.
Both contexts, however, address the preservation of surface connectivity and mesh quality—either physically, via dynamic node management, or probabilistically, via autoregressive growth with enforced attachment constraints (Hewett et al., 2018, Lin et al., 8 Dec 2025).