
Teeth3DS Benchmark for 3D Dental Analysis

Updated 10 December 2025
  • Teeth3DS Benchmark is a large-scale framework for 3D dental scan analysis, covering tooth localization, segmentation, labeling, 3D modeling, and landmark detection.
  • It employs a human–machine hybrid pipeline with clinically validated annotations and standardized metrics like Dice, IoU, and Chamfer Distance for performance evaluation.
  • The benchmark includes a dedicated tooth completion task that simulates missing data to advance retrieval-augmented and transformer-based modeling methods.

The Teeth3DS benchmark is a large-scale, publicly available framework for the evaluation of automatic methods in 3D dental scan analysis, with an emphasis on tooth localization, segmentation, labeling, 3D modeling, and dental landmark identification. Developed as part of the 3DTeethSeg (2022) and 3DTeethLand (2024) MICCAI challenges and formalized in “Teeth3DS+,” it serves as a reference standard for developing, comparing, and reproducibly validating algorithms on intraoral scans with high-quality, clinically validated annotations covering over 23,999 teeth across diverse demographics (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023).

1. Dataset Structure and Demographics

Teeth3DS consists of two principal releases:

  • The original dataset, published for 3DTeethSeg’22, assembled 1,800 high-resolution intraoral scans from 900 patients (each with both upper and lower jaws), totaling 23,999 individually annotated teeth. Scans were balanced across gender, age groups (70% under 16, 27% 16–59, 3% over 60), and clinical context (50% orthodontic, 50% prosthetic). Acquisition used Dentsply Primescan, 3Shape Trios 3, and iTero Element 2 Plus scanners, providing geometric precision of 10–90 μm and point densities of 30–80 pts/mm² (Ben-Hamadou et al., 2022).

Annotation followed an eight-step human–machine hybrid pipeline: artifact removal, normalization (PCA alignment to the occlusal plane), per-tooth cropping, UV-parameterization, 2D polygonal boundary drawing, back-projection to 3D, FDI two-digit labeling (indices 11–48; gingiva as “0”), and clinical validation by experienced orthodontists and dental surgeons. Iterative correction loops targeted missing labels and boundary inconsistencies (Ben-Hamadou et al., 2022).
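The normalization step of the pipeline (PCA alignment to the occlusal plane) can be illustrated with a minimal sketch. It assumes the scan is available as an N×3 NumPy array of vertex coordinates and that the smallest-variance principal axis approximates the occlusal-plane normal. The function name `pca_align` is hypothetical and is not taken from the benchmark code.

```python
import numpy as np

def pca_align(vertices: np.ndarray) -> np.ndarray:
    """Center a scan and rotate it so its principal axes map to x, y, z.

    Assumption: the direction of least variance approximates the normal
    of the occlusal plane, so it ends up along the z axis.
    """
    centered = vertices - vertices.mean(axis=0)
    # eigh returns eigenvalues of the symmetric 3x3 covariance in ascending order.
    cov = np.cov(centered, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)
    # Reverse the columns: largest-variance axis first, smallest last.
    axes = eigvecs[:, ::-1].copy()
    # Keep a right-handed frame so the mesh is not mirrored.
    if np.linalg.det(axes) < 0:
        axes[:, 2] *= -1
    return centered @ axes
```

PCA leaves the sign of each axis undetermined, so a practical pipeline would still need a rule (for example, based on the crown direction) to decide which way is "up".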

No dedicated validation split is provided; the typical protocol is 1,200 scans for training and 600 for testing, with per-tooth ground truth throughout (Ben-Hamadou et al., 2023).
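Because no validation split is provided, users typically hold out part of the training scans themselves. The sketch below shows one way to do this so that both jaws of a patient stay on the same side of the split. The `(patient_id, jaw)` tuple representation and the function name `carve_validation` are assumptions for illustration, not part of the benchmark protocol.

```python
import random

def carve_validation(train_scans, val_fraction=0.1, seed=0):
    """Split training scans into (train, val) at the patient level.

    train_scans: list of (patient_id, jaw) tuples (assumed representation).
    Holding out whole patients keeps upper and lower jaws of one person
    from leaking across the split.
    """
    patients = sorted({pid for pid, _ in train_scans})
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    rng.shuffle(patients)
    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])
    train = [s for s in train_scans if s[0] not in val_patients]
    val = [s for s in train_scans if s[0] in val_patients]
    return train, val
```

With 1,200 training scans from 600 patients, a 10% hold-out under these assumptions leaves 1,080 scans for training and 120 for validation.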

2. Supported Tasks and Benchmark Protocols

Teeth3DS supports the following tasks:

| Task | Input/Output Specification | Principal Metric(s) |
|---|---|---|
| Teeth Detection | Locate 3D centroids of each tooth | $\mathrm{AP}(\tau)$ (centroid @ $\tau$) |
| Teeth Segmentation | Vertex-wise tooth/gingiva instance masks | IoU, Dice |
| Teeth Labeling | Assign FDI index per detected tooth | Overall accuracy, F1-score |
| 3D Tooth Modeling | Reconstruct crown mesh from incomplete scan | Symmetric Chamfer Distance |
| Dental Landmark Identification | Predict anatomical points on teeth | Mean Euclidean error |
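For the labeling task, FDI two-digit codes (11–48, with 0 for gingiva) are not contiguous, so classifiers usually need a mapping to class indices. The following sketch shows one such mapping. The contiguous 0..32 indexing and the names `FDI_TO_CLASS`, `CLASS_TO_FDI`, and `encode_labels` are illustrative assumptions, not a convention prescribed by the benchmark.

```python
# Valid FDI two-digit codes: quadrant digit (1-4) followed by tooth position (1-8).
FDI_CODES = [q * 10 + t for q in range(1, 5) for t in range(1, 9)]

# Class 0 is gingiva; the 32 permanent teeth get indices 1..32
# (an assumed training convention).
FDI_TO_CLASS = {0: 0, **{code: i + 1 for i, code in enumerate(FDI_CODES)}}
CLASS_TO_FDI = {cls: code for code, cls in FDI_TO_CLASS.items()}

def encode_labels(fdi_labels):
    """Map per-vertex FDI codes to contiguous class indices.

    Raises KeyError for codes outside the FDI scheme (e.g. 19 or 50).
    """
    return [FDI_TO_CLASS[code] for code in fdi_labels]
```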

Metrics are standardized: segmentation IoU and Dice, labeling accuracy, Chamfer Distance for modeling, and Euclidean distance for landmarking. The modeling task—often framed as completion or reconstruction—involves inferring the full-crown geometry from partial or locally occluded scan regions, a key use-case for retrieval-augmented and memory-guided methods (Ben-Hamadou et al., 2022, Sun et al., 3 Dec 2025).
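As a rough illustration of the segmentation and landmark metrics, the sketch below computes per-label Dice and IoU from vertex-wise label arrays and a mean Euclidean landmark error. It assumes predictions and ground truth are aligned vertex by vertex, that landmarks are already matched one to one, and that a label absent from both masks scores 1.0. Those conventions and the function names are assumptions, not the official evaluation code.

```python
import numpy as np

def dice_iou(pred: np.ndarray, gt: np.ndarray, label: int):
    """Dice and IoU for one label over vertex-wise label arrays."""
    p = pred == label
    g = gt == label
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    total = p.sum() + g.sum()
    # Assumed convention: a label missing from both masks counts as a perfect match.
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

def landmark_error(pred_pts, gt_pts) -> float:
    """Mean Euclidean distance between matched landmark pairs (K x 3 each)."""
    diff = np.asarray(pred_pts, dtype=float) - np.asarray(gt_pts, dtype=float)
    return float(np.linalg.norm(diff, axis=1).mean())
```

For a single label the two overlap scores are related by Dice = 2·IoU / (1 + IoU), so they rank methods identically per tooth even though averaged values can differ.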

3. Completion Benchmark: Protocols and Simulation of Missing Data

A prominent extension of Teeth3DS, validated in recent studies, is the single-tooth completion (restoration) benchmark (Sun et al., 3 Dec 2025). This protocol, designed to test context-conditioned point cloud completion algorithms, operates as follows:

  • For each scan and target tooth, generate a pair $(P_{\pi}, P_\mathrm{gt})$:
    • $P_\mathrm{gt}$: 2048-point cloud of all mesh vertices from the target tooth (after segmentation, normalization, and resampling).
    • $P_{\pi}$: the scan with all vertices of the target tooth removed, retaining (1) vertices of the two immediately adjacent teeth, and (2) the $n$ nearest gingiva points to allow resampling to 2048 points via farthest-point sampling plus duplication.
  • Preprocessing:
    • Zero-mean translation and unit-scale normalization (largest spatial dimension scaled to 1).
    • Application of consistent farthest-point sampling to enforce fixed point count in both partial and complete clouds.
  • Mode of missingness:
    • Entire tooth (crown + root) is removed (“large missing region”).
    • Relative to the full local cloud, this typically represents 20–40% missing points (not precisely quantified).
  • Evaluation:
    • Symmetric Chamfer Distance ($\mathrm{CD}$), as per Fan et al. (2017):

      $$\mathrm{CD}(P_\mathrm{gt}, \widehat{P}) = \frac{1}{|P_\mathrm{gt}|} \sum_{p \in P_\mathrm{gt}} \min_{q \in \widehat{P}} \|p-q\|_2^2 + \frac{1}{|\widehat{P}|} \sum_{q \in \widehat{P}} \min_{p \in P_\mathrm{gt}} \|q-p\|_2^2$$

    • F-score @1% of bounding-box diameter.
    • Earth Mover’s Distance is not reported.
    • Visualizations include overlaid reconstructions of incisors and molars, highlighting occlusal surface and interproximal geometry.
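The two reported completion metrics can be sketched in NumPy as follows. The Chamfer Distance follows the symmetric squared-distance form given above. For the F-score, the threshold is taken here as 1% of the bounding-box diagonal of the ground-truth cloud, which is one reading of "bounding-box diameter" and should be treated as an assumption. Function names are illustrative.

```python
import numpy as np

def _nearest_sq_dists(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """For each point in a, the squared distance to its nearest neighbour in b."""
    # Pairwise squared distances via |a|^2 + |b|^2 - 2 a.b (avoids an N x M x 3 tensor).
    d2 = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    # Clip tiny negatives caused by floating-point cancellation.
    return np.maximum(d2, 0.0).min(axis=1)

def chamfer_distance(p_gt: np.ndarray, p_hat: np.ndarray) -> float:
    """Symmetric Chamfer Distance with squared L2 terms, averaged per direction."""
    return float(_nearest_sq_dists(p_gt, p_hat).mean()
                 + _nearest_sq_dists(p_hat, p_gt).mean())

def f_score(p_gt: np.ndarray, p_hat: np.ndarray, rel_threshold: float = 0.01) -> float:
    """F-score at a threshold relative to the ground-truth bounding-box diagonal."""
    diag = np.linalg.norm(p_gt.max(axis=0) - p_gt.min(axis=0))  # assumed "diameter"
    tau = rel_threshold * diag
    precision = (np.sqrt(_nearest_sq_dists(p_hat, p_gt)) < tau).mean()
    recall = (np.sqrt(_nearest_sq_dists(p_gt, p_hat)) < tau).mean()
    if precision + recall == 0:
        return 0.0
    return float(2 * precision * recall / (precision + recall))
```

Because the Chamfer terms are squared distances on unit-normalized clouds, reported values are scale dependent and are only comparable across methods that share the same normalization.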

All compared methods share the same train/validation/test split, which is strictly enforced for fair comparison; the exact split sizes are not specified (Sun et al., 3 Dec 2025).
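The preprocessing steps listed in the protocol above (zero-mean translation, scaling the largest spatial extent to 1, and farthest-point sampling to a fixed count with duplication for small clouds) can be sketched as below. This is a plain NumPy illustration under stated assumptions: padding by random repetition is one possible form of "duplication", and the function names and seeding are hypothetical.

```python
import numpy as np

def normalize(points: np.ndarray) -> np.ndarray:
    """Translate to zero mean and scale so the largest bounding-box extent is 1."""
    centered = points - points.mean(axis=0)
    extent = (centered.max(axis=0) - centered.min(axis=0)).max()
    return centered / extent if extent > 0 else centered

def farthest_point_sample(points: np.ndarray, n_samples: int = 2048, seed: int = 0) -> np.ndarray:
    """Return exactly n_samples points via farthest-point sampling.

    Clouds with fewer than n_samples points are padded with randomly
    repeated points (assumed form of the duplication step).
    """
    n = len(points)
    rng = np.random.default_rng(seed)
    if n <= n_samples:
        extra = rng.integers(0, n, size=n_samples - n)
        return np.concatenate([points, points[extra]], axis=0)
    chosen = np.empty(n_samples, dtype=np.int64)
    chosen[0] = rng.integers(0, n)
    min_d2 = np.full(n, np.inf)  # squared distance to the closest chosen point
    for i in range(1, n_samples):
        d2 = ((points - points[chosen[i - 1]]) ** 2).sum(axis=1)
        min_d2 = np.minimum(min_d2, d2)
        chosen[i] = int(np.argmax(min_d2))  # farthest from everything chosen so far
    return points[chosen]
```

A typical call under these assumptions would be `farthest_point_sample(normalize(cloud), 2048)`, applied with the same settings to both the partial and the complete cloud.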

4. Relation to Other Dental 3D Benchmarks

Teeth3DS is distinguished from CBCT-based volumetric segmentation sets such as CTooth (Cui et al., 2022) and CTooth+ (Cui et al., 2022), which focus on tooth instance segmentation in tomographic volumes with evaluation using volumetric DSC, IoU, and surface distances (HD, ASSD). In contrast, Teeth3DS addresses:

  • High-resolution surface mesh/point cloud analysis.
  • Tasks involving fine-grained crown/gingiva segmentation, instance labeling, and anatomical landmark detection.
  • Realistic simulation of clinical challenges, including whole-tooth loss, crowded arches, and scanner-induced artifacts (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023).

The benchmark fosters reproducibility through open-source protocols, data, and Dockerized evaluation containers, paralleling best practices set by other medical computer vision challenges (Ben-Hamadou et al., 2023).

5. Algorithms and Baseline Results

State-of-the-art algorithms evaluated on Teeth3DS include PointNet++, Point Transformer, DGCNN, and transformer-based architectures (e.g., TSegFormer, MeshSNet variants). In completion, recent work has introduced retrieval-augmented methods with memory modules—fusing global descriptors of partial inputs with manifold prototypes—demonstrating improved Chamfer Distance and morphological accuracy under large missing regions (Sun et al., 3 Dec 2025).

Key findings from challenge reports and leading papers (Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023):

  • The best detection accuracy (AP@2mm) and segmentation Dice scores reach 96.6% and 0.993, respectively.
  • Instance labeling accuracy and modeling performance show a gap between the best (91% labeling accuracy; lowest Chamfer Distance) and lower-tier methods (as low as 68.4% labeling, 0.896 segmentation Dice).
  • Top-performing pipelines deploy multi-scale local geometric encodings, boundary-focused optimization (contrastive learning, curvature-based losses), explicit instance separation, and task-geometric postprocessing (e.g., dental-arch aware grouping) (Ben-Hamadou et al., 2023, Xiong et al., 2023, Sun et al., 3 Dec 2025).

6. Limitations, Extensions, and Access

Limitations of the current Teeth3DS benchmark include:

  • Lack of ground-truth intra-tooth landmarks and dental pathologies.
  • No explicit validation set; users are advised to cross-validate on the training scans.
  • Moderate annotation uncertainty; inter-annotator agreement is not quantified.
  • Challenges remain for crowded, partially erupted, or pathologically altered teeth.

Extensions and recommendations:

  • Augment with intra-tooth landmarks, caries, restorations, and anomaly tags.
  • Incorporate multi-modal data (e.g., intraoral photographs, CBCT) and time-series for growth modeling.
  • Refine evaluation with advanced boundary quality metrics (Hausdorff, ASSD).
  • Pursue semi-supervised protocols and pretext self-supervision leveraging massive unlabeled teeth datasets.
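The boundary-quality metrics suggested above (Hausdorff distance and ASSD) can be approximated on point sets, for example on the boundary vertices of a predicted and a reference tooth mask. The sketch below is a point-set approximation under that assumption rather than a true surface-to-surface computation, and the function names are illustrative.

```python
import numpy as np

def _nearest_dists(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """For each point in a, the Euclidean distance to its nearest neighbour in b."""
    d2 = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.sqrt(np.maximum(d2, 0.0).min(axis=1))

def hausdorff(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Hausdorff distance: the worst-case nearest-neighbour distance."""
    return float(max(_nearest_dists(a, b).max(), _nearest_dists(b, a).max()))

def assd(a: np.ndarray, b: np.ndarray) -> float:
    """Average symmetric surface distance over both point sets."""
    d_ab = _nearest_dists(a, b)
    d_ba = _nearest_dists(b, a)
    return float((d_ab.sum() + d_ba.sum()) / (len(a) + len(b)))
```

Hausdorff distance is sensitive to a single outlier vertex, while ASSD averages over the whole boundary, which is why the two are usually reported together.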

Teeth3DS data (OBJ meshes + JSON annotation), challenge protocols, and code are distributed for academic use through open-access repositories (Ben-Hamadou et al., 2022).
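Reading such a mesh/annotation pair might look like the sketch below. It relies only on the standard OBJ convention that vertex lines start with `v` followed by x, y, z. The JSON key name `labels` (one FDI code per vertex) is a hypothetical placeholder, since the annotation schema is not described here and should be checked against the released files.

```python
import json

def parse_obj_vertices(obj_text: str):
    """Extract (x, y, z) tuples from the 'v' lines of an OBJ file's text."""
    vertices = []
    for line in obj_text.splitlines():
        parts = line.split()
        # 'v x y z [extra values]'; anything after z (e.g. colours) is ignored.
        if parts and parts[0] == "v":
            vertices.append(tuple(float(x) for x in parts[1:4]))
    return vertices

def load_vertex_labels(json_text: str, n_vertices: int, labels_key: str = "labels"):
    """Read per-vertex labels from annotation JSON.

    labels_key is an assumed key name, not a documented one.
    """
    data = json.loads(json_text)
    labels = data[labels_key]
    if len(labels) != n_vertices:
        raise ValueError(f"expected {n_vertices} labels, got {len(labels)}")
    return labels
```

Checking that the label count matches the vertex count catches mismatched mesh/annotation pairs early, before they silently corrupt training targets.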

7. Impact and Future Directions

Teeth3DS and its completion benchmark have catalyzed advances in 3D dental computer vision, establishing de facto standards for performance reporting and method comparison in intraoral scan analysis. The adoption of retrieval-augmented, context-aware point cloud models and transformer backbones is accelerating, driven in part by the large, clinically validated scope of Teeth3DS (Xiong et al., 2023, Sun et al., 3 Dec 2025).

Future trajectories include robustification to clinical edge-cases (e.g., trauma, orthodontics), incorporation of root/anomaly segmentation, integration with prosthetic/CAD-CAM workflows, and migration toward federated and privacy-preserving benchmarking across international 3D scan repositories.


References:

(Ben-Hamadou et al., 2022, Ben-Hamadou et al., 2023, Xiong et al., 2023, Sun et al., 3 Dec 2025, Cui et al., 2022, Cui et al., 2022)
