JavisInst-OMNI: Malocclusion Diagnosis Data
- JavisInst-OMNI is a multi-view dataset of 4,166 intra-oral images from 384 patients, providing high-quality, real-world data for malocclusion diagnosis.
- The dataset features rigorous annotation of ten clinically relevant malocclusion categories using a standardized protocol aligned with ABO guidelines.
- Benchmark results across CNN, Transformer, and graph-based models demonstrate its ability to drive advancements in automated orthodontic image analysis.
The JavisInst-OMNI (OMNI) dataset is a publicly released multi-view RGB image collection curated specifically to advance automated malocclusion diagnosis in orthodontics. Encompassing 4,166 images from 384 participants across five intra-oral views, annotated by dental professionals with ten clinically relevant malocclusion categories, OMNI constitutes the first large-scale, high-quality dataset of its kind for oral and maxillofacial imaging. Its design, annotation methodology, and extensive benchmark baselines provide a robust foundation for research in dental image analysis (Xue et al., 21 May 2025).
1. Dataset Composition and Acquisition Protocol
Participants and Imaging:
OMNI comprises images from 384 Chinese patients (153 male, 231 female; ages 3–48 years), captured at the Department of Stomatology, Third Affiliated Hospital of Soochow University. All participants provided informed consent in accordance with the Declaration of Helsinki. The acquisition protocol required routine intra-oral cleaning, standardized retractors, and a Canon EOS 550D digital camera (manual exposure, flash at ¼-stop, natural light curing lamp), with the camera aligned orthogonally to the dental surface for each view.
View Distribution:
- Frontal occlusal: 903 images
- Left buccal occlusal: 841 images
- Right buccal occlusal: 843 images
- Maxillary (upper arch) occlusal: 820 images
- Mandibular (lower arch) occlusal: 759 images
The protocol involved specific positioning and combinations of lip hooks, intra-oral mirrors, and patient head tilts to ensure comprehensive visual coverage of the dental arches.
2. Annotation Framework
Pre-processing and Quality Assurance:
All images underwent a pre-selection phase, with exclusion criteria targeting motion blur and incomplete dental structures (e.g., occlusion gaps). Tooth localization was performed on the Makesense.ai platform, with dense bounding boxes drawn on each visible tooth.
Diagnostic Categories:
Annotations followed the 7th edition of “Orthodontics” and ABO guidelines, assigning one or more of the following ten labels per image (a convenience mapping is sketched after the list):
- HT: Healthy Teeth
- TT: Tooth Torsion
- DO: Deep Overjet
- IOA: Invisible Orthodontic Attachment
- TE: Tooth Emergence
- CFOA: Cast Fixed Orthodontic Appliance
- TM: Tooth Misalignment
- MR: Mandibular Retrusion
- OB: Orthodontic Brace
- FOD: Fixed Orthodontic Device
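For programmatic use, the abbreviations can be gathered into a simple lookup table. The dictionary below is a minimal sketch; the name `OMNI_LABELS` and the ordering are illustrative conveniences, not identifiers from the released code, and the authoritative category IDs live in the released COCO annotation files.

```python
# Illustrative mapping of the ten OMNI diagnostic labels.
# NOTE: this dictionary is a hypothetical convenience; consult the
# released COCO annotation files for the authoritative category IDs.
OMNI_LABELS = {
    "HT": "Healthy Teeth",
    "TT": "Tooth Torsion",
    "DO": "Deep Overjet",
    "IOA": "Invisible Orthodontic Attachment",
    "TE": "Tooth Emergence",
    "CFOA": "Cast Fixed Orthodontic Appliance",
    "TM": "Tooth Misalignment",
    "MR": "Mandibular Retrusion",
    "OB": "Orthodontic Brace",
    "FOD": "Fixed Orthodontic Device",
}
```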
Review Process:
Initial annotations were produced by trained dentists. A two-stage audit by senior orthodontists, following a standardized checklist based on the ABO classification, enforced consistency; discrepancies were adjudicated by a panel of dental specialists. Final annotations were released in COCO format. No formal inter-annotator agreement metric was reported; the multi-stage review served in place of statistical agreement measures.
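Because the final annotations are in COCO format, they can be inspected with standard tooling. The sketch below uses pycocotools and assumes a hypothetical annotation file path; use the paths from the released repository.

```python
# Minimal sketch of loading OMNI's COCO-format annotations with pycocotools.
from pycocotools.coco import COCO

coco = COCO("annotations/omni_train.json")  # hypothetical file name

# Enumerate categories (expected to match the ten diagnostic labels).
cats = coco.loadCats(coco.getCatIds())
print([c["name"] for c in cats])

# Fetch all tooth bounding boxes for the first image.
img_id = coco.getImgIds()[0]
ann_ids = coco.getAnnIds(imgIds=img_id)
for ann in coco.loadAnns(ann_ids):
    x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
    print(ann["category_id"], (x, y, w, h))
```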
3. Dataset Organization and Statistics
Data Splits:
Images are partitioned as follows:
| Split | Frontal | Left | Right | Maxillary | Mandibular | Total |
|---|---|---|---|---|---|---|
| Training | 534 | 497 | 503 | 492 | 455 | 2,481 |
| Validation | 187 | 174 | 174 | 167 | 155 | 857 |
| Test | 182 | 170 | 166 | 161 | 149 | 828 |
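As a quick sanity check, the per-view counts in the table reproduce the reported split totals and the overall figure of 4,166 images:

```python
# Verify that per-view counts sum to the reported split totals.
splits = {
    # order: frontal, left, right, maxillary, mandibular
    "train": [534, 497, 503, 492, 455],
    "val":   [187, 174, 174, 167, 155],
    "test":  [182, 170, 166, 161, 149],
}
for name, counts in splits.items():
    print(name, sum(counts))  # -> train 2481, val 857, test 828
print("total", sum(sum(c) for c in splits.values()))  # -> 4166
```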
Class Distribution:
Among 4,166 images:
| Label | Count |
|---|---|
| HT | 3,610 |
| TT | 1,686 |
| DO | 205 |
| IOA | 402 |
| TE | 441 |
| CFOA | 144 |
| TM | 147 |
| MR | 289 |
| OB | 220 |
| FOD | 776 |
On average, each image presents 1.03 diagnostic issues: of the 4,166 images, 565 are healthy, 2,945 contain one issue, 603 contain two, and 53 contain three simultaneous issues.
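The reported average of 1.03 issues per image follows directly from this distribution, treating healthy images as contributing zero issues:

```python
# Reproducing the reported ~1.03 diagnostic issues per image
# from the per-image issue-count distribution.
distribution = {0: 565, 1: 2945, 2: 603, 3: 53}  # issues -> image count
total_images = sum(distribution.values())                    # 4,166
total_issues = sum(k * v for k, v in distribution.items())   # 4,310
print(total_issues / total_images)  # ~1.03
```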
4. Baseline Architectures and Experimental Protocol
Model Selection and Implementation:
Six object detection baselines were established, spanning CNN-, Transformer-, and graph-based models. All were implemented in PyTorch/MMDetection and trained on NVIDIA RTX 4090 GPUs with the AdamW optimizer, for 50 epochs in most cases (exceptions noted below), using standardized image pre-processing (resizing, random horizontal flipping, ImageNet normalization); a sketch of this shared setup follows the list below:
- CNN-Based:
- Faster R-CNN (ResNet-50/101 backbone)
- Mask R-CNN (ResNet-50/101, segmentation head disabled)
- EfficientDet (EfficientNet-B0/B3 backbone, BiFPN)
- Transformer-Based:
- DETR (ResNet-50 backbone; 6 encoder and 6 decoder layers; trained 300 epochs)
- Deformable DETR (ResNet-50, deformable attention; 50 epochs)
- Graph-Based:
- GraphTeethNet: a Faster R-CNN backbone up to ROIAlign yields up to 50 tooth proposals per image, whose ROI-pooled features serve as node features; edge features are produced by the Maxillofacial Teeth Representation Modeling (MTRM) and Teeth Relationship Modeling (TRM) modules via cross-attention, and the resulting graph is classified by a GNN (trained for 100 epochs).
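The shared optimizer and pre-processing setup can be sketched in plain PyTorch as below. The learning rate, weight decay, flip probability, and target resolution are placeholder assumptions (the exact values are specified in the original paper), and Faster R-CNN stands in for any of the six baselines.

```python
# Minimal sketch of the shared training setup, not the authors' exact code.
import torch
import torchvision
from torchvision import transforms

# 10 diagnostic categories + background class.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=11)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-4,            # placeholder; see the paper for the actual value
    weight_decay=1e-4,  # placeholder
)

# Standardized pre-processing: resize, random horizontal flip,
# ImageNet mean/std normalization.
preprocess = transforms.Compose([
    transforms.Resize((800, 800)),        # assumed target size
    transforms.RandomHorizontalFlip(p=0.5),  # assumed flip probability
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```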
5. Benchmarking Results and Evaluation Metrics
Performance was assessed using mean Average Precision (mAP), averaged across the standard IoU thresholds $[0.50{:}0.05{:}0.95]$, with the common special cases [email protected] and [email protected]:

$$\mathrm{AP} = \sum_{k} \left( R_k - R_{k-1} \right) P_k, \qquad \mathrm{mAP} = \frac{1}{N} \sum_{i=1}^{N} \mathrm{AP}_i,$$

where $P_k$ and $R_k$ are the precision and recall at the $k$-th confidence threshold and $N$ is the number of categories.
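As an illustration of these definitions, here is a minimal NumPy sketch; it omits the 101-point recall interpolation that the official pycocotools evaluator applies, and the toy precision/recall values are invented for demonstration.

```python
import numpy as np

def average_precision(precisions, recalls):
    """AP = sum_k (R_k - R_{k-1}) * P_k, with R_0 = 0 and recalls ascending."""
    r = np.concatenate(([0.0], np.asarray(recalls, dtype=float)))
    p = np.asarray(precisions, dtype=float)
    return float(np.sum((r[1:] - r[:-1]) * p))

def mean_average_precision(per_class_aps):
    """mAP is the unweighted mean of the per-class AP values."""
    return float(np.mean(per_class_aps))

# Toy precision/recall curves for two hypothetical classes:
ap_a = average_precision([1.0, 1.0, 0.67, 0.75], [0.25, 0.50, 0.50, 0.75])
ap_b = average_precision([1.0, 0.50, 0.67], [0.33, 0.33, 0.67])
print(mean_average_precision([ap_a, ap_b]))
```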
Key results ([email protected], in %):
| Model | [email protected] |
|---|---|
| Deformable DETR | 66.39 |
| Mask R-CNN | 65.88 |
| Faster R-CNN | 65.32 |
| EfficientDet (B3) | 64.16 |
| GraphTeethNet | 63.89 |
| DETR | 63.39 |
Deformable DETR demonstrated the strongest overall performance, particularly on Mandibular Retrusion (MR: 93.64), Fixed Orthodontic Device (FOD: 93.60), and Healthy Teeth (HT: 89.94). EfficientDet yielded the highest mAP at the strictest IoU threshold ([email protected] = 41.13). GraphTeethNet substantiated the importance of relational modeling among teeth, gaining 1.51 mAP over an ablated variant without edge modeling.
Metrics such as accuracy, precision, recall, and F1 score were not reported, as the task is framed as object detection rather than image-level classification.
6. Accessibility, Data Provenance, and Use
The entire OMNI dataset (frequently referenced as JavisInst-OMNI in code) and all benchmark implementations, including images, COCO-format annotations, training scripts, Dockerfile, and environment.yml (for PyTorch 1.x and MMDetection 2.x reproducibility), are publicly available at https://github.com/RoundFaceJ/OMNI (Xue et al., 21 May 2025).
Open access facilitates reproducibility, secondary analyses, and benchmarking studies, furthering methodological advances in ML-based dental diagnostics. The role of professional annotation, standardized review, and clinically grounded labels underscores the dataset's research reliability for downstream automation in orthodontic assessment.
7. Significance and Research Implications
The OMNI dataset addresses a critical shortage of large-scale, well-annotated datasets for malocclusion assessment, historically a limiting factor in dental image analysis and the development of automated diagnostic tools. Its multi-view, multi-diagnosis design, strict acquisition protocol, and expert-driven annotation pipeline collectively position OMNI as a key benchmark for evaluating object detection, segmentation, and relational reasoning algorithms in dental imaging.
A plausible implication is that OMNI may accelerate research into fine-grained intra-oral disease recognition, cross-view aggregation, structured prediction, and robust clinical deployment of machine learning models in orthodontics, setting a new standard for community benchmarks in this domain (Xue et al., 21 May 2025).