mAIcetoma: AI for Automated Mycetoma Diagnosis
- The mAIcetoma challenge showcases automated segmentation of mycetoma grains and binary etiological classification using advanced AI models.
- The MyData dataset, curated from 142 patients and 864 expert-annotated images, establishes a standardized benchmark for digital pathology in neglected tropical diseases.
- Innovative methodologies combining encoder-decoder CNNs, transformer architectures, and rigorous data augmentation achieved a top Dice score of 0.88 and classification accuracy above 0.97.
Mycetoma MicroImage: Detect and Classify Challenge (mAIcetoma) is a competitive platform for the development and evaluation of AI models targeting the automated segmentation of mycetoma grains and classification of disease etiology (fungal vs. bacterial) in histopathological images. Mycetoma is a chronic, neglected tropical disease that disproportionately affects rural, low-resource populations and presents diagnostic difficulties due to the reliance on expert pathologists. The mAIcetoma challenge leverages the MyData dataset, the first publicly standardized, expert-annotated collection of mycetoma histopathology slides, to benchmark state-of-the-art models and spur progress in digital pathology solutions for endemic regions (Ali et al., 2024).
1. Motivation and Clinical Significance
Mycetoma, characterized by subcutaneous granulomatous infection, can be caused by fungi (eumycetoma) or filamentous bacteria (actinomycetoma). Correct etiological diagnosis is essential for directing therapy: antifungals for eumycetoma, antibiotics for actinomycetoma. Traditional workflows in endemic settings rely primarily on hematoxylin and eosin (H&E)-stained tissue and expert histopathological review, processes that are protracted and error-prone in environments with limited specialist capacity. mAIcetoma addresses these constraints via automated, standardized machine learning pipelines to make rapid, consistent diagnosis feasible at scale (Ali et al., 2024, Ali et al., 25 Dec 2025).
2. Dataset: MyData Structure and Curation
The MyData resource, curated by the Mycetoma Research Centre in Khartoum, provides the foundation for the challenge (Ali et al., 2024). The dataset features:
- Cohort: 142 patients (post-exclusion of non-granular slides), 864 RGB images (471 eumycetoma, 393 actinomycetoma). Acquisition spanned ~5 years.
- Imaging: Nikon Eclipse 80i, H&E, 10× objective, JPEG (800×600 px), 24-bit RGB color.
- Annotations: Binary masks delineating every grain, manually traced by expert pathologists using ImageJ and adhering to strict inclusion/exclusion protocols. No digital tiling; ground-truth masks in TIFF.
- Species & Metadata: Detailed breakdown at both species and patient level, e.g., Madurella spp., Madurella mycetomatis (positive/negative), Aspergillus spp., Fusarium spp., Actinomadura pelletieri, Actinomadura madurae, Streptomyces somaliensis. Clinical metadata includes patient age, sex, infection site, lesion duration/size, with geographic origin predominantly at the Sudanese epicenter.
Minimal digital pre-processing (color balancing/enhancement only) was performed before annotation. Inter-annotator variability studies are planned, but not yet completed.
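A minimal loading sketch for a dataset organized this way, assuming JPEG images and TIFF masks share filename stems in flat directories (the published MyData layout may differ):

```python
import numpy as np
from pathlib import Path

def pair_images_with_masks(image_dir, mask_dir):
    """Pair each JPEG image with its TIFF ground-truth mask by filename stem.
    The flat-directory layout here is an assumption, not the published MyData layout."""
    pairs = []
    for img in sorted(Path(image_dir).glob("*.jpg")):
        mask = Path(mask_dir) / (img.stem + ".tif")
        if mask.exists():
            pairs.append((img, mask))
    return pairs

def binarize_mask(mask_array, threshold=127):
    """Reduce a grayscale mask array to a binary grain/background map."""
    return (np.asarray(mask_array) > threshold).astype(np.uint8)
```

Pairing by stem keeps image-mask correspondence explicit and makes missing annotations easy to detect during curation checks.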
3. Challenge Design and Task Definition
The mAIcetoma challenge comprises two principal tasks (Ali et al., 25 Dec 2025):
- Task 1: Grain Segmentation
- Input: H&E microscopic image
- Output: Pixel-wise mask (grain vs. background)
- Task 2: Etiological Classification
- Input: Segmented grain or full image
- Output: Binary label (eumycetoma vs. actinomycetoma)
Recommended data splits ensure that no patient appears in more than one subset (e.g., 65% train, 15% validation, 20% test used in the competition phase), enforcing robust generalization. Teams performed initial pre-processing such as removing duplicate images and inconsistent masks, intensity normalization, resizing (to 512×512 or 640×896), and data augmentation (rotations, flips, color jitter, stain normalization).
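The patient-disjoint split can be sketched as follows; `patient_ids` is a hypothetical per-image patient list, and the 65/15/20 proportions follow the competition phase:

```python
import random

def patient_level_split(patient_ids, train=0.65, val=0.15, seed=0):
    """Split image indices so that no patient appears in more than one subset.
    Splitting is done over unique patients, then mapped back to image indices."""
    patients = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    n_train = int(train * len(patients))
    n_val = int(val * len(patients))
    train_p = set(patients[:n_train])
    val_p = set(patients[n_train:n_train + n_val])
    splits = {"train": [], "val": [], "test": []}
    for idx, pid in enumerate(patient_ids):
        if pid in train_p:
            splits["train"].append(idx)
        elif pid in val_p:
            splits["val"].append(idx)
        else:
            splits["test"].append(idx)
    return splits
```

Splitting over patients rather than images prevents near-duplicate fields of view from the same slide leaking between train and test, which would inflate reported scores.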
Benchmark tasks leverage both detection (localizing grains) and classification (etiological type) consistent with histopathological workflows, providing granular (pixel-level) and diagnostic (image-level) performance metrics.
4. Methodologies and Model Architectures
Winning submissions predominantly employed advanced encoder-decoder schemes for segmentation and deep convolutional neural networks (CNNs) for classification, supplemented by architectural innovations and rigorous data augmentation (Ali et al., 25 Dec 2025).
4.1 Segmentation Architectures
- Adrian: FPN encoder with Mix Vision Transformer (MiT) decoder, ensembled across six models (1st place).
- Tiger: DeepLabV3+ (ResNet-50 encoder, ASPP module, upsampling decoder), cross-knowledge distillation (2nd place).
- Macaroon: Ensemble of four U-Net variants (3rd place), cascaded strategy: segment → crop ROI → classify → refine segmentation.
- VSI: nnU-Net-based 8-stage PlainConvUNet, Conditional Random Fields (CRF) for post-processing.
- Minions: U-Net variant with dual-head (segmentation and coarse class prediction for multi-task learning).
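As a lightweight stand-in for CRF refinement of the kind VSI used, small spurious components can be pruned from a predicted mask; `min_size` is an illustrative threshold, not a value reported by any team:

```python
import numpy as np
from scipy import ndimage

def remove_small_components(mask, min_size=64):
    """Drop connected components smaller than min_size pixels from a binary
    grain mask - a simple post-processing step against speckle false positives."""
    labeled, n = ndimage.label(mask.astype(bool))
    if n == 0:
        return mask.astype(np.uint8)
    sizes = ndimage.sum(mask.astype(bool), labeled, range(1, n + 1))
    keep_labels = [i + 1 for i, s in enumerate(sizes) if s >= min_size]
    return np.isin(labeled, keep_labels).astype(np.uint8)
```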
Losses included binary cross-entropy (BCE), Dice loss, and combined loss terms that jointly optimize pixel overlap and class discrimination.
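A common BCE-plus-Dice combination can be sketched as below; the 0.5/0.5 weighting is illustrative, since the teams' exact weightings were not reported:

```python
import numpy as np

def bce_dice_loss(pred, target, w_bce=0.5, w_dice=0.5, eps=1e-7):
    """Combined segmentation loss over predicted grain probabilities and a
    binary ground-truth mask. BCE drives per-pixel calibration; the Dice term
    directly optimizes region overlap."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1 - eps)
    target = np.asarray(target, dtype=float)
    bce = -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))
    inter = np.sum(pred * target)
    dice = (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return w_bce * bce + w_dice * (1 - dice)
```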
4.2 Classification Models
- Ensembles of EfficientNet-B0, ResNet50 (fine-tuned), DenseNet-121/169, and custom MLP heads were utilized.
- Input modalities included segmented grain crops or entire images.
- Some teams (Minions) experimented with explicit feature engineering (color, texture, shape), though deep RGB-based models achieved equivalent or superior results.
- Training optimizers included Adam and AdEMAMix, with data augmentation critical for robust performance.
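Probability-level ensembling of several classifiers can be sketched as follows; the class ordering in `names` is an assumption for illustration:

```python
import numpy as np

def ensemble_predict(prob_list, names=("eumycetoma", "actinomycetoma")):
    """Average softmax outputs from several classifiers (e.g. EfficientNet,
    ResNet-50, DenseNet heads) and return the consensus label per image."""
    probs = np.mean(np.stack(prob_list, axis=0), axis=0)  # (n_images, n_classes)
    labels = [names[i] for i in probs.argmax(axis=1)]
    return labels, probs
```

Averaging probabilities rather than hard votes preserves each model's confidence and tends to stabilize predictions on ambiguous grains.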
4.3 Alternative and Featurized Approaches
- Fisher-Vector bag-of-words pipelines over deep features, developed in prior fungi-image work, are directly adaptable to mAIcetoma: CNN-extracted local descriptors are pooled by a Gaussian Mixture Model to yield highly discriminative scan-wise features (Zieliński et al., 2020).
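A simplified sketch of codebook pooling in this spirit, using soft assignment to fixed centers rather than a full GMM Fisher-Vector encoding:

```python
import numpy as np

def soft_bow_pool(descriptors, centers, beta=1.0):
    """Pool local descriptors (e.g. CNN patch features) into one scan-wise
    vector by soft assignment to codebook centers. This is a simplified
    stand-in for GMM-based Fisher-Vector pooling, not the full encoding."""
    # Squared distances from each descriptor to each center: (n_desc, n_centers)
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    w = np.exp(-beta * d2)
    w /= w.sum(axis=1, keepdims=True)  # soft assignment per descriptor
    return w.mean(axis=0)  # normalized histogram over codewords
```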
5. Evaluation Protocols and Performance Metrics
Formal quantitative evaluation encompassed both segmentation and classification using standard morphometric and diagnostic metrics (Ali et al., 2024, Ali et al., 25 Dec 2025):
- Segmentation: Dice coefficient, Intersection over Union (IoU), sensitivity, specificity, and accuracy, all at the pixel level.
- Classification: Accuracy, sensitivity (recall), specificity, F1-score, and Matthews Correlation Coefficient (MCC), typically computed at the per-image or per-grain level.
Definitions follow the standard confusion-matrix forms: Dice = 2TP/(2TP + FP + FN), IoU = TP/(TP + FP + FN), sensitivity = TP/(TP + FN), specificity = TN/(TN + FP), accuracy = (TP + TN)/(TP + TN + FP + FN), and MCC = (TP·TN − FP·FN)/√((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
Weighted aggregation of metrics defined the final challenge ranking (e.g., for segmentation, weighted sum of sensitivity/specificity/accuracy/Dice; for classification, weighted sum of sensitivity/specificity/accuracy/MCC).
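The pixel-level metrics and a weighted aggregate can be computed as below; since the official challenge weights are not reproduced here, `weighted_score` takes caller-supplied weights:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-level confusion-matrix metrics of the kind used in the challenge."""
    pred, gt = np.asarray(pred).astype(bool), np.asarray(gt).astype(bool)
    tp = np.sum(pred & gt)
    tn = np.sum(~pred & ~gt)
    fp = np.sum(pred & ~gt)
    fn = np.sum(~pred & gt)
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "iou": tp / (tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

def weighted_score(metrics, weights):
    """Weighted aggregate for ranking; weights must be supplied by the caller
    because the official weightings are not reproduced here."""
    return sum(weights[k] * metrics[k] for k in weights)
```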
Summary of Top Team Results
| Team | Segmentation Dice | Classification ACC | Segmentation Weighted | Classification Weighted |
|---|---|---|---|---|
| Adrian | 0.8820 | 0.9726 | 93.56% | 96.14% |
| Macaroon | 0.8439 | 0.9521 | 92.52% | 93.11% |
| Tiger | 0.8646 | 0.8699 | 91.98% | 82.91% |
Ensemble and transformer-based approaches produced the strongest segmentation and classification results. Cascaded segment→crop→classify pipelines improved robustness by reducing background variability.
6. Model Robustness, Generalization, and Challenge Pitfalls
Significant class imbalance exists (e.g., Streptomyces somaliensis overrepresentation within actinomycetoma), while general patient demographics skew toward young adult males with lower limb infection. The Sudanese cohort, localized to the Khartoum referral center, poses a risk for domain overfitting. Recommended mitigations include:
- Oversampling and class-balanced sampling, focal loss, and SMOTE at patch or feature level for rare classes.
- Augmentation with geometric, photometric, and stain-specific perturbations (e.g., Reinhard/Macenko stain normalization, artifact simulations).
- Encouragement for external validation and submission of geographically and etiologically diverse slides to test generalization (Ali et al., 2024).
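Inverse-frequency sample weights for class-balanced sampling can be sketched as:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-sample weights inversely proportional to class frequency, so that
    rare classes (e.g. underrepresented species) are sampled as often as
    common ones in expectation. Weights are scaled to sum to len(labels)."""
    counts = Counter(labels)
    n = len(labels)
    return [n / (len(counts) * counts[y]) for y in labels]
```

These weights can feed a weighted random sampler in any common training framework, an alternative to explicit oversampling or SMOTE.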
Teams reported that meticulous data cleaning, consistency checking, and mask correction yielded substantial gains even for compact architectures. Domain-specific augmentations and post-processing (e.g., CRF boundaries) improved fine grain boundary prediction.
7. Insights, Impact, and Future Directions
Automated segmentation of mycetoma grains (Task 1) is essential for reliable downstream etiological classification. All finalist models achieved Dice scores above 0.84, with the winning entry exceeding 0.88, indicating the feasibility of robust grain detection from standard H&E images. Compact models, if tuned appropriately, rival the performance of larger backbones, facilitating low-resource deployment, including potential smartphone or offline inference (Ali et al., 25 Dec 2025).
Recommendations for future work include:
- Expansion of the MyData dataset with multi-center cohorts and additional neglected tropical disease (NTD) classes for differential diagnosis.
- Systematic inter- and intra-annotator reliability studies.
- Exploration of self-supervised histopathology pretraining and explainable AI overlays for clinical interpretability.
- Development of integrated diagnostic and prognostic models incorporating histological, clinical, and demographic data.
The mAIcetoma challenge has demonstrated that the integration of AI into histopathological workflows for mycetoma can accelerate diagnosis, enable scalable expert-level performance in low-resource regions, and serve as a template for similar efforts in other NTDs (Ali et al., 2024, Ali et al., 25 Dec 2025, Zieliński et al., 2020).