ToothForge: Spectral Dental Shape Synthesis
- ToothForge is a spectral generative modeling framework that synchronizes eigenspaces to enable accurate 3D dental shape synthesis.
- It employs a β-VAE trained on synchronized spectral embeddings, reducing reconstruction error (e.g., MSE of 0.032 for molars) and noise in heterogeneous meshes.
- The approach supports efficient dental crown interpolation, compression, and smooth latent-space blending without relying on shared mesh connectivity.
ToothForge is a spectral generative modeling framework for high-fidelity 3D dental shape synthesis based on synchronized spectral embeddings. Designed to address the paucity of dental shape datasets and heterogeneity of mesh connectivities in medical applications, ToothForge enables automatic generation, analysis, and interpolation of dental crowns using compact representations in the frequency domain. The method introduces spectral synchronization—a procedure that aligns the eigenspaces of Laplace–Beltrami operators across arbitrary triangle meshes—allowing unified machine learning workflows independent of mesh connectivity. ToothForge incorporates a β-VAE trained over the synchronized spectra, resulting in improved reconstruction, greater flexibility for mesh structures, and practical applications in medical shape domains (Kubík et al., 3 Jun 2025).
1. Spectral Embeddings of Triangular Meshes
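The core of ToothForge’s approach relies on spectral analysis of a watertight triangular mesh $\mathcal{M} = (V, F)$ with vertices $V = \{v_1, \dots, v_n\} \subset \mathbb{R}^3$. Discretization of the Laplace–Beltrami operator proceeds via the cotangent-weight scheme, constructing a stiffness matrix $W \in \mathbb{R}^{n \times n}$ with
$$W_{ij} = -\tfrac{1}{2}\left(\cot\alpha_{ij} + \cot\beta_{ij}\right), \qquad W_{ii} = -\sum_{j \neq i} W_{ij},$$
where $(i, j)$ is an edge, and $\alpha_{ij}$, $\beta_{ij}$ are the two angles opposite that edge. The lumped mass matrix $A$ is diagonal and assigns $A_{ii}$ one third of the total area of the triangles incident to vertex $i$. The discrete Laplacian is $L = A^{-1} W$.
Eigenpairs $(\lambda_\ell, \phi_\ell)$ of $L$ yield eigenvectors (harmonics) that are orthonormal with respect to $A$, i.e. $\Phi^\top A \Phi = I$, and nonnegative eigenvalues (frequencies). Any per-vertex coordinate vector $x \in \mathbb{R}^n$ can be expanded as $x = \Phi \hat{x}$ with $\Phi = [\phi_1, \dots, \phi_n]$ the eigenvector matrix. Truncation to the first $k$ modes produces compact spectral embeddings, where the $k$-banded modal coefficients
$$\hat{X} = \Phi_k^\top A X \in \mathbb{R}^{k \times 3}$$
represent the mesh in the frequency domain. The mesh itself is recoverable as $X \approx \Phi_k \hat{X}$.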
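The construction above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses dense matrices, a regular octahedron as a stand-in watertight mesh, and hypothetical function names (`cotangent_laplacian`, `spectral_embedding`). The generalized eigenproblem is solved through the symmetric form $A^{-1/2} W A^{-1/2}$, which has the same eigenvalues as $A^{-1} W$.

```python
import numpy as np

def cotangent_laplacian(verts, faces):
    """Dense cotangent stiffness matrix W and lumped vertex areas a (diagonal of A)."""
    n = len(verts)
    W = np.zeros((n, n))
    a = np.zeros(n)
    for tri in faces:
        p = verts[tri]
        # Each triangle gives one third of its area to each of its corners.
        area = 0.5 * np.linalg.norm(np.cross(p[1] - p[0], p[2] - p[0]))
        a[tri] += area / 3.0
        for c in range(3):
            # Edge (i, j) and the vertex o opposite to it in this triangle.
            i, j, o = tri[c], tri[(c + 1) % 3], tri[(c + 2) % 3]
            u, v = verts[i] - verts[o], verts[j] - verts[o]
            cot = np.dot(u, v) / np.linalg.norm(np.cross(u, v))
            W[i, j] -= 0.5 * cot
            W[j, i] -= 0.5 * cot
            W[i, i] += 0.5 * cot
            W[j, j] += 0.5 * cot
    return W, a

def spectral_embedding(verts, faces, k):
    """First k eigenpairs of L = A^{-1} W and the k x 3 modal coefficients."""
    W, a = cotangent_laplacian(verts, faces)
    s = 1.0 / np.sqrt(a)
    S = s[:, None] * W * s[None, :]          # symmetric A^{-1/2} W A^{-1/2}
    evals, Y = np.linalg.eigh(S)             # ascending eigenvalues
    Phi = s[:, None] * Y[:, :k]              # A-orthonormal: Phi^T A Phi = I
    coeffs = Phi.T @ (a[:, None] * verts)    # Phi_k^T A X
    return evals[:k], Phi, a, coeffs

# Regular octahedron: a tiny watertight test mesh (illustrative only).
verts = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                  [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
faces = np.array([[0, 2, 4], [2, 1, 4], [1, 3, 4], [3, 0, 4],
                  [2, 0, 5], [1, 2, 5], [3, 1, 5], [0, 3, 5]])

evals, Phi, a, coeffs = spectral_embedding(verts, faces, k=6)
recon = Phi @ coeffs                         # X recovered from the spectrum
```

With all six modes retained the reconstruction is exact up to floating-point error; for real crowns one would keep only the first few hundred modes and use sparse solvers.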
2. Instability of Raw Spectra and Motivation for Synchronization
Spectral decomposition on discrete meshes introduces ambiguities: eigenvectors are determined only up to sign flips, and in the presence of clustered eigenvalues, modes may arbitrarily permute or mix. When comparing two meshes with differing connectivity, the bases are, by construction, unrelated. Consequently, direct use of these raw spectral embeddings as machine learning features leads to severe artifacts, including sign-flip noise and basis-mismatch biases, degrading both reconstruction and generative sample quality (Kubík et al., 3 Jun 2025). This fundamental incompatibility precludes effective learning on heterogeneous mesh datasets.
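The sign ambiguity can be made concrete with a toy example. The sketch below is an assumption-laden stand-in: it uses the graph Laplacian of a five-vertex path instead of a mesh Laplacian, and flips the sign of two eigenvectors by hand. Both bases reconstruct the same signal, yet the coefficient vectors a learning model would see are different.

```python
import numpy as np

# Graph Laplacian of a 5-vertex path (toy stand-in for a mesh Laplacian).
n = 5
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L[0, 0] = L[-1, -1] = 1
_, Phi = np.linalg.eigh(L)

x = np.array([0.0, 1.0, 4.0, 9.0, 16.0])     # a per-vertex signal

# An equally valid eigenbasis: modes 1 and 3 have their sign flipped.
flip = np.diag([1.0, -1.0, 1.0, -1.0, 1.0])
Phi_flipped = Phi @ flip

c = Phi.T @ x
c_flipped = Phi_flipped.T @ x

same_shape = np.allclose(Phi @ c, Phi_flipped @ c_flipped)   # same reconstruction
same_features = np.allclose(c, c_flipped)                    # different coefficients
```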
3. Spectral Synchronization via Reference Alignment
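ToothForge resolves incompatibility via spectral synchronization. One selects a reference mesh $\mathcal{M}_0$ and computes its $k$-truncated basis $\Phi_k^{(0)}$. For each training mesh $\mathcal{M}_i$, an orthogonal alignment map $R_i \in \mathbb{R}^{k \times k}$ is determined, along with a vertex correspondence $P_i$ (often via nearest neighbor or coarse registration) that selects, for each reference vertex, its matched vertex on $\mathcal{M}_i$, to register the eigenbasis $\Phi_k^{(i)}$ to $\Phi_k^{(0)}$. Alignment proceeds by solving
$$R_i = \arg\min_{R} \left\| P_i \Phi_k^{(i)} R - \Phi_k^{(0)} \right\|_F^2$$
subject to $R^\top R = I$, which is the orthogonal Procrustes problem. Its solution is $R_i = U V^\top$, where $U \Sigma V^\top$ is the singular value decomposition of $(P_i \Phi_k^{(i)})^\top \Phi_k^{(0)}$. The raw spectral coefficients $\hat{X}_i$ of each mesh are then synchronized:
$$\tilde{X}_i = R_i^\top \hat{X}_i,$$
so that all data share the basis $\Phi_k^{(0)}$ and are directly comparable in machine learning tasks.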
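The closed-form Procrustes solution is short enough to check numerically. The sketch below makes simplifying assumptions that the source does not state: the vertex correspondence is the identity (both bases live on the same vertices), the bases are Euclidean-orthonormal (the mass matrix is ignored), and the "training" basis is the reference basis scrambled by a random orthogonal matrix, of which sign flips and mode permutations are special cases. The function name `procrustes_rotation` is hypothetical.

```python
import numpy as np

def procrustes_rotation(B_src, B_ref):
    """Orthogonal R minimizing ||B_src @ R - B_ref||_F (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(B_src.T @ B_ref)
    return U @ Vt

rng = np.random.default_rng(0)
n, k = 50, 6

# Reference basis with orthonormal columns.
B_ref, _ = np.linalg.qr(rng.normal(size=(n, k)))

# Scramble it with a random orthogonal k x k matrix to mimic an unsynchronized basis.
Q, _ = np.linalg.qr(rng.normal(size=(k, k)))
B_src = B_ref @ Q.T

R = procrustes_rotation(B_src, B_ref)

X = rng.normal(size=(n, 3))          # stand-in vertex coordinates
raw = B_src.T @ X                    # coefficients in the scrambled basis
synced = R.T @ raw                   # synchronized coefficients
reference = B_ref.T @ X              # coefficients in the reference basis
```

In this idealized setting the synchronized coefficients coincide with those computed directly in the reference basis. On real meshes with different connectivity the match is only approximate and depends on the quality of the vertex correspondence.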
4. Generative Modeling on Synchronized Spectra
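The synchronized coefficients $\tilde{X}$ serve as the representation for generative modeling. ToothForge employs a β-VAE, where the encoder maps coefficients to a Gaussian posterior $q_\theta(z \mid \tilde{X}) = \mathcal{N}(\mu, \operatorname{diag}(\sigma^2))$, and the decoder predicts reconstructed spectra $\tilde{X}_{\mathrm{rec}}$. The architecture uses a fixed number of spectral modes $k$, a fixed latent dimension, and a five-stage CNN backbone. The loss function is
$$\mathcal{L} = \left\| \tilde{X} - \tilde{X}_{\mathrm{rec}} \right\|_2^2 + \beta \, D_{\mathrm{KL}}\!\left( q_\theta(z \mid \tilde{X}) \,\|\, \mathcal{N}(0, I) \right),$$
with $\beta$ annealed cyclically. Training uses only the synchronized spectra $\tilde{X}_i$ of each mesh.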
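A β-VAE objective with a cyclical schedule can be sketched as follows. The source only says that β is annealed cyclically, so the linear ramp-then-hold shape, the cycle length, and the names `cyclical_beta` and `beta_vae_loss` are assumptions made for illustration; the reconstruction term is taken to be a mean squared error over spectral coefficients.

```python
import numpy as np

def cyclical_beta(step, cycle_len=100, beta_max=1.0, ramp_frac=0.5):
    """Linear ramp from 0 to beta_max over the first part of each cycle, then hold."""
    pos = (step % cycle_len) / cycle_len
    return beta_max * min(1.0, pos / ramp_frac)

def beta_vae_loss(x, x_rec, mu, logvar, beta):
    """Reconstruction MSE plus beta-weighted KL(N(mu, sigma^2) || N(0, I))."""
    recon = np.mean((x - x_rec) ** 2)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon + beta * kl, recon, kl

# Toy values: a perfect reconstruction with a posterior mean shifted away from zero.
x = np.zeros((8, 3))
total, recon, kl = beta_vae_loss(
    x, x, mu=np.ones(4), logvar=np.zeros(4), beta=cyclical_beta(25)
)
```

Restarting β from zero at each cycle lets the decoder periodically focus on reconstruction before the KL term is reapplied, which is the usual motivation for cyclical schedules.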
5. Pipeline Overview: Training and Inference
The modeling workflow includes:
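- Training:
  - Compute the reference basis $\Phi_k^{(0)}$.
  - For each mesh $\mathcal{M}_i$:
    - Construct Laplacian $L_i$ and eigenbasis $\Phi_k^{(i)}$.
    - Calculate raw spectral coefficients $\hat{X}_i$.
    - Align basis by solving Procrustes for $R_i$, yielding synchronized $\tilde{X}_i$.
  - Train the β-VAE on $\{\tilde{X}_i\}$.
- Inference:
  - Sample latent $z \sim \mathcal{N}(0, I)$.
  - Decode to obtain $\tilde{X}_{\mathrm{rec}}$.
  - Reconstruct vertices $X = \Phi_k^{(0)} \tilde{X}_{\mathrm{rec}}$; reuse faces and edges from the reference mesh to build high-resolution output.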
This pipeline does not require shared mesh connectivity among samples; it relies only on spectral synchronization.
6. Quantitative Performance and Ablation
Evaluated on a private dataset of 430 dental crowns (incisors, premolars, molars), ToothForge demonstrates that training on synchronized spectra reduces spectral MSE to 0.032 (molars), an order of magnitude lower than models trained on unaligned data. Spatial Chamfer distances exhibit similar improvements. Ablation experiments show omission of synchronization introduces noise and artifacts in interpolation, while proper alignment yields smooth, anatomically consistent morphs. The combined effect of synchronization and β-regularization reduces minimum matching distance (MMD) below 0.0059 for all classes. Generation throughput is approximately 1 millisecond per 1,000 samples on a Tesla T4 (Kubík et al., 3 Jun 2025).
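For readers unfamiliar with the spatial metrics, the sketch below shows one common way to compute a Chamfer distance and a minimum matching distance on point sets. Several variants of both metrics exist and the source does not say which one the authors use, so the definitions here (mean squared nearest-neighbor distance summed over both directions, and MMD as the average best-match Chamfer distance from each reference shape to the generated set) are assumptions.

```python
import numpy as np

def chamfer_distance(P, Q):
    """Symmetric Chamfer distance: mean squared nearest-neighbor distance, both ways."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=-1)   # |P| x |Q| squared distances
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def minimum_matching_distance(references, generated):
    """Average, over reference shapes, of the distance to the closest generated shape."""
    return float(np.mean([
        min(chamfer_distance(r, g) for g in generated) for r in references
    ]))

# Two single-point "shapes" one unit apart.
P = np.array([[0.0, 0.0, 0.0]])
Q = np.array([[1.0, 0.0, 0.0]])

cd_same = chamfer_distance(P, P)                       # identical shapes
cd_apart = chamfer_distance(P, Q)                      # 1 + 1 squared units
mmd_covered = minimum_matching_distance([P, Q], [P, Q])
mmd_missed = minimum_matching_distance([P], [Q])
```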
7. Additional Applications, Limitations, and Generalizability
ToothForge enables additional applications: (a) efficient shape compression using moderate values of $k$ (128–256 modes) preserves fine morphological details such as cusps; (b) latent-space interpolation in the regularized VAE latent space, thanks to shared spectral bases, produces smooth vertex-wise blends between shapes.
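Both applications rest on simple linear-algebra properties that a toy example can demonstrate. The sketch below is illustrative only: it replaces dental meshes with two closed 2D curves on a shared cycle graph, uses the graph Laplacian instead of the cotangent operator, and blends spectral coefficients directly rather than VAE latents. It shows that truncation error shrinks as more modes are kept, and that blending coefficients in a shared basis equals blending vertices.

```python
import numpy as np

n = 64
t = np.linspace(0, 2 * np.pi, n, endpoint=False)

def curve(bumps):
    """Closed curve with a given number of radial bumps (stand-in for a crown)."""
    r = 1.0 + 0.2 * np.cos(bumps * t)
    return np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

shape_a, shape_b = curve(3), curve(5)

# Cycle-graph Laplacian shared by both shapes, and its orthonormal eigenbasis.
I = np.eye(n)
L = 2 * I - np.roll(I, 1, axis=1) - np.roll(I, -1, axis=1)
_, Phi = np.linalg.eigh(L)

def compress(X, k):
    return Phi[:, :k].T @ X              # k x 2 coefficients instead of n x 2 vertices

def decompress(C):
    return Phi[:, :C.shape[0]] @ C

# Compression: reconstruction error for increasing numbers of modes.
errors = {k: np.linalg.norm(shape_a - decompress(compress(shape_a, k)))
          for k in (4, 16, 64)}

# Interpolation: a 50/50 blend of coefficients versus a 50/50 blend of vertices.
k = 16
ca, cb = compress(shape_a, k), compress(shape_b, k)
blend_spectral = decompress(0.5 * ca + 0.5 * cb)
blend_vertices = 0.5 * (decompress(ca) + decompress(cb))
```

Storing `k` coefficient rows instead of `n` vertex rows gives a compression ratio of `n / k`; the monotone decrease of the error follows from projecting onto nested orthonormal subspaces.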
The approach has limitations: its reliance on a single reference mesh and explicit vertex correspondences can break down under extreme deformations or topological differences. The alignment map is rigid; potential extensions include adopting learned functional maps or expanding the framework to anatomically diverse organs with heterogeneous mesh structure. The general principle of spectral synchronization, that is, aligning mesh spectra prior to frequency-domain learning, is broadly transferable to medical shape modeling domains with inconsistent connectivity. Key benefits include reduced sign-flip and basis-mismatch noise, independence from connectivity constraints, and compact, high-fidelity generative modeling (Kubík et al., 3 Jun 2025).