
AEyeDB Dataset for Retinal Deep Learning

Updated 30 November 2025
  • AEyeDB is a curated in-house fundus image dataset designed for retinal disease classification and anomaly detection with expert, image-level annotations.
  • It employs standardized preprocessing and augmentation methods, with geometric and color/blur techniques shown to improve model accuracy and AUC.
  • The dataset features strict train/validation/test splits and access protocols to support reproducible research while highlighting challenges in domain adaptation.

AEyeDB is a curated in-house fundus image collection designed to support the development and evaluation of deep learning algorithms for retinal disease classification and anomaly detection. It addresses challenges in reliable retinal disease detection posed by imaging variability, subtle pathological manifestations, and domain shift across heterogeneous datasets. AEyeDB provides researchers with a well-controlled, expertly labeled resource for benchmarking discriminative and anomaly-detection methodologies in ophthalmic computer vision (Ruhland et al., 23 Nov 2025).

1. Dataset Composition and Acquisition

AEyeDB was generated from a prospective study at Heinrich Heine University Düsseldorf that enrolled 103 adult volunteers (21 female, 82 male; mean age 26.93 ± 7.18 years). Each participant underwent non-mydriatic color fundus photography using a Rodenstock FundusScope, at the camera's native output resolution and with an approximately 30° field of view. Initially, 256 images were acquired; after exclusion of duplicates and suboptimal-quality images (criteria: out of focus, excessive glare, poor centering), 204 remained for analysis. The cohort encompasses both healthy controls and individuals reporting systemic conditions with known retinal manifestations (13 subjects with self-reported diabetes or hypertension). No fine-grained disease severity stratification is provided.

The images are labeled into four diagnostic categories:

  • Healthy (normal fundus)
  • Diabetic retinopathy (DR)
  • Age-related macular degeneration (AMD)
  • Glaucoma

All images are provided in three-channel RGB format, with further center-cropping to the circular fundus region and resizing to 224×224 pixels for model input compatibility (Ruhland et al., 23 Nov 2025).

2. Labeling and Annotation Protocol

Image-level class labels in AEyeDB were assigned by board-certified ophthalmologists from University Hospital Knappschaftskrankenhaus Bochum. Each fundus photograph underwent independent evaluation by two specialists; discrepancies were adjudicated by a senior clinician to establish a consensus label. The annotation strictly provides image-level diagnostic categories—no lesion-localization, pixel-level, or severity-grade (e.g., ETDRS) annotation is present. These protocols facilitate image-level supervised classification and support one-class anomaly detection (healthy as the normal class).
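The double-reading protocol can be illustrated with a short Python sketch. It assumes that each grader returns one of the four class names and that the senior clinician is modeled as a callback; the names `CLASSES`, `consensus_label`, and `is_anomaly` are hypothetical and do not come from the AEyeDB tooling.

```python
from typing import Callable

# The four image-level categories described for the dataset.
CLASSES = ("healthy", "dr", "amd", "glaucoma")


def consensus_label(
    grader_a: str,
    grader_b: str,
    adjudicate: Callable[[str, str], str],
) -> str:
    """Return the consensus label for one image.

    If both specialists agree, their label is used directly. Otherwise the
    (hypothetical) `adjudicate` callback stands in for the senior clinician.
    """
    for label in (grader_a, grader_b):
        if label not in CLASSES:
            raise ValueError(f"unknown label: {label!r}")
    if grader_a == grader_b:
        return grader_a
    final = adjudicate(grader_a, grader_b)
    if final not in CLASSES:
        raise ValueError(f"unknown adjudicated label: {final!r}")
    return final


def is_anomaly(label: str) -> bool:
    """One-class view: healthy is the normal class, everything else is anomalous."""
    return label != "healthy"
```

In this sketch the adjudicator is only consulted on disagreement, which mirrors the description above; how disagreements were actually recorded is not specified in the source.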

3. Preprocessing and Augmentation Methodologies

All images in AEyeDB undergo standardized preprocessing:

  • Per-channel intensity normalization to zero mean and unit variance
  • Center-cropping to the circular fundus region (the retinal disc)
  • Resizing to 224×224 pixels
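The three steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' pipeline: the crop is approximated by the largest centered square, the resize uses nearest-neighbour sampling as a stand-in for a proper interpolating resize, and normalization uses per-image channel statistics (the source does not say whether per-image or dataset-wide statistics were used). All function names are hypothetical.

```python
import numpy as np


def center_crop_square(img: np.ndarray) -> np.ndarray:
    """Crop the largest centered square from an H x W x 3 image."""
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]


def resize_nearest(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Nearest-neighbour resize to size x size (simplification for the sketch)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size  # source row for each output row
    cols = np.arange(size) * w // size  # source column for each output column
    return img[rows[:, None], cols[None, :]]


def normalize_per_channel(img: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Shift and scale each channel to zero mean and unit variance."""
    x = img.astype(np.float32)
    mean = x.mean(axis=(0, 1), keepdims=True)
    std = x.std(axis=(0, 1), keepdims=True)
    return (x - mean) / (std + eps)


def preprocess(img: np.ndarray, size: int = 224) -> np.ndarray:
    """Crop, resize, then normalize, in the order listed above."""
    return normalize_per_channel(resize_nearest(center_crop_square(img), size))
```

Normalizing after resizing keeps the statistics consistent with what the model actually sees; the source does not state the order, so this ordering is also an assumption.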

Four augmentation regimes were systematically evaluated:

  • (a) Geometric augmentations: random horizontal flips, arbitrary rotations (0°–360°), translations up to ±10% of image dimensions.
  • (b) Color and blur augmentations: Gaussian blur (σ ∈ [0, 1.0]), color-jittering (brightness/contrast ±20%, saturation ±15%).
  • (c) Histogram equalization: per Kaur et al., global contrast normalization (HEQ).
  • (d) Laplacian enhancement: edge emphasis using unsharp masking and Laplace filtering.
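The parameter ranges of regimes (a) and (b) can be made concrete with a small sampler. The sketch below only draws augmentation parameters and does not apply them to pixels. It assumes a flip probability of 0.5 and interprets the ± percentages as multiplicative factors around 1.0; neither detail is stated in the source. The class and function names are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class GeometricParams:
    hflip: bool        # whether to mirror horizontally
    angle_deg: float   # rotation angle in [0, 360]
    dx_frac: float     # horizontal shift as a fraction of width, in [-0.10, 0.10]
    dy_frac: float     # vertical shift as a fraction of height, in [-0.10, 0.10]


@dataclass
class ColorBlurParams:
    blur_sigma: float  # Gaussian blur sigma in [0, 1.0]
    brightness: float  # multiplicative factor in [0.80, 1.20]
    contrast: float    # multiplicative factor in [0.80, 1.20]
    saturation: float  # multiplicative factor in [0.85, 1.15]


def sample_geometric(rng: random.Random) -> GeometricParams:
    """Draw one set of parameters for regime (a)."""
    return GeometricParams(
        hflip=rng.random() < 0.5,  # assumed flip probability
        angle_deg=rng.uniform(0.0, 360.0),
        dx_frac=rng.uniform(-0.10, 0.10),
        dy_frac=rng.uniform(-0.10, 0.10),
    )


def sample_color_blur(rng: random.Random) -> ColorBlurParams:
    """Draw one set of parameters for regime (b)."""
    return ColorBlurParams(
        blur_sigma=rng.uniform(0.0, 1.0),
        brightness=rng.uniform(0.80, 1.20),
        contrast=rng.uniform(0.80, 1.20),
        saturation=rng.uniform(0.85, 1.15),
    )
```

Passing an explicitly seeded `random.Random` instance keeps the sampled augmentations reproducible across runs.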

Empirically, geometric and color/blur augmentations yielded consistent improvements in model generalization. For example, mean accuracy on external test sets increased from 0.825 (baseline) to 0.843 (geometric) and 0.841 (color/blur), whereas histogram equalization and Laplacian enhancement resulted in decreases to 0.789 and 0.815, respectively. On the Papila dataset, geometric augmentation with a Vision Transformer (ViT) increased the AUC by ΔAUC ≈ 0.04 (0.87 to 0.91) over the baseline. Models trained on AEyeDB itself achieved 1.00 accuracy and 1.00 AUC on the held-out test set, reflecting its curated characteristics (Ruhland et al., 23 Nov 2025).
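The reported mean accuracies are easier to compare as differences from the baseline. The following sketch simply restates the figures quoted above and computes those differences; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Mean accuracy on external test sets, as quoted in the text above.
BASELINE_ACCURACY = 0.825
MEAN_ACCURACY = {
    "geometric": 0.843,
    "color_blur": 0.841,
    "histogram_equalization": 0.789,
    "laplacian": 0.815,
}


def accuracy_deltas(baseline: float, scores: dict) -> dict:
    """Return each regime's accuracy change relative to the baseline."""
    return {name: round(acc - baseline, 3) for name, acc in scores.items()}


deltas = accuracy_deltas(BASELINE_ACCURACY, MEAN_ACCURACY)
# Regimes that improved on the baseline, per the quoted figures.
helpful = [name for name, d in deltas.items() if d > 0]
```

Relative to the 0.825 baseline, the changes are +0.018 (geometric), +0.016 (color/blur), −0.036 (histogram equalization), and −0.010 (Laplacian enhancement), so only the first two regimes end up in `helpful`.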

4. Splits and Evaluation Protocols

A stratified 70 / 15 / 15 train/validation/test split (143, 31, and 30 images, respectively) maintains proportionate class balance across the four label categories. For classification, the full dataset is employed; for anomaly detection, only healthy cases are utilized for one-class training. Cross-dataset evaluations are supported by combining the AEyeDB test partition with counterparts from FIVES, Mendeley, Papila, and Messidor to assess domain generalization and robustness to dataset bias. The perfect classification and discrimination metrics (accuracy and AUC of 1.00) on the AEyeDB test set underscore low intra-dataset variability and high acquisition quality but should not be taken as an indicator of generalization capability in diverse clinical scenarios.
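The split sizes follow directly from rounding 70% and 15% of 204 images, and a stratified split can be sketched with the standard library alone. The code below is an illustration, not the authors' procedure: per-class counts are not given in the source, so any label list passed to `stratified_split` is a placeholder, and per-class rounding may produce totals that differ slightly from 143/31/30. All function names are hypothetical.

```python
import random
from collections import defaultdict


def split_sizes(n: int, train: float = 0.70, val: float = 0.15) -> tuple:
    """Overall partition sizes: round train and val, give the remainder to test."""
    n_train = round(train * n)
    n_val = round(val * n)
    return n_train, n_val, n - n_train - n_val


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split indices per class so each partition keeps similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test partition
    return train, val, test


def healthy_only(indices, labels, normal_class="healthy"):
    """One-class training subset: keep only indices labeled as the normal class."""
    return [i for i in indices if labels[i] == normal_class]
```

Calling `split_sizes(204)` reproduces the 143, 31, and 30 images stated above. For anomaly detection, `healthy_only` would be applied to the training indices so that the detector sees only the normal class during training.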

| Property | Details |
|---|---|
| Subjects | 103 (21 F, 82 M), mean age 26.93 ± 7.18 y |
| Images collected/used | 256 / 204 |
| Diagnostic categories | Healthy, DR, AMD, Glaucoma |
| Camera | Rodenstock FundusScope |
| Preprocessing | Crop to disc, resize 224×224, normalization |
| Recommended augmentations | Geometric, color/blur |
| Train/Val/Test split | 143/31/30 |
| Baseline ViT Accuracy | 1.00 |
| Baseline ViT AUC | 1.00 |

5. Access, Usage Terms, and Ethical Considerations

AEyeDB is not available for public download. Access requires submitting a research proposal to the data custodian ([email protected]) referencing trial DRKS00033094. Agreement is contingent upon compliance with EU GDPR and German data protection regulations; execution of a Data Use Agreement specifying confidentiality, non-redistribution, and research-only use is mandatory. No licensing fees are imposed for academic research, but commercial exploitation necessitates additional negotiation. A plausible implication is that researchers should anticipate procedural lead time and ensure ethical approval aligns with local legal requirements.

6. Recommendations, Limitations, and Best Practices

AEyeDB’s controlled acquisition protocol and high image quality make it suitable for initial algorithm prototyping, hyperparameter tuning, and ablation studies, particularly on attention mechanisms and anomaly detection frameworks. Recommended best practices include:

  • Employing geometric and mild color augmentations to maximize generalizability.
  • Adhering to the 70 / 15 / 15 stratified split to facilitate reproducibility.
  • Augmenting the healthy subset with age-matched public controls when training anomaly detectors, given that stable GUESS score calibration benefits from ≥200 healthy samples.
  • Combining AEyeDB with glaucoma-focused datasets (e.g., Papila) for optic-nerve-head analyses due to the limited number and subtlety of glaucoma cases in AEyeDB.

Key limitations include the narrow age distribution (predominantly young adults), a pronounced male predominance (≈80%), and the absence of lesion-level or severity-grade annotations. Overinterpretation of absolute performance metrics is discouraged. If severity stratification is a study aim, supplemental labeling is recommended.

7. Significance and Application to Ophthalmic Deep Learning

AEyeDB provides a foundational benchmark for discriminative and anomaly-detection methods in retinal imaging. Vision Transformers (ViTs) trained on AEyeDB demonstrate perfect separation of principal disease classes, supporting rapid architecture prototyping and ablation studies under high-fidelity conditions. When integrated with heterogeneous external datasets, AEyeDB contributes to the development of robust and generalizable models suitable for clinical translation (Ruhland et al., 23 Nov 2025). Its design aligns with the needs of reproducible research and systematic comparison, while highlighting ongoing challenges in domain adaptation and representative population sampling.
