Retinal Image Segmentation with Small Datasets

Published 9 Mar 2023 in eess.IV and cs.CV | (2303.05110v1)

Abstract: Many eye diseases like Diabetic Macular Edema (DME), Age-related Macular Degeneration (AMD), and Glaucoma manifest in the retina, can cause irreversible blindness or severely impair the central version. The Optical Coherence Tomography (OCT), a 3D scan of the retina with high qualitative information about the retinal morphology, can be used to diagnose and monitor changes in the retinal anatomy. Many Deep Learning (DL) methods have shared the success of developing an automated tool to monitor pathological changes in the retina. However, the success of these methods depend mainly on large datasets. To address the challenge from very small and limited datasets, we proposed a DL architecture termed CoNet (Coherent Network) for joint segmentation of layers and fluids in retinal OCT images on very small datasets (less than a hundred training samples). The proposed model was evaluated on the publicly available Duke DME dataset consisting of 110 B-Scans from 10 patients suffering from DME. Experimental results show that the proposed model outperformed both the human experts' annotation and the current state-of-the-art architectures by a clear margin with a mean Dice Score of 88% when trained on 55 images without any data augmentation.

Abstract PDF Upgrade to Chat

Citations (4)

View on Semantic Scholar

Summary

The paper introduces CoNet, a U-Net adaptation that segments retinal images accurately with only 55 training images.
It employs an ASPP block for multi-scale context integration while reducing convolutional depth to suit small datasets.
It achieved an 88% mean Dice Score on the Duke DME dataset, outperforming state-of-the-art models in fluid segmentation.

Retinal Image Segmentation with Small Datasets

Introduction

Retinal imaging, particularly through Optical Coherence Tomography (OCT), plays a critical role in diagnosing and monitoring various eye diseases like Diabetic Macular Edema (DME), Age-related Macular Degeneration (AMD), and glaucoma, which can lead to irreversible vision loss. Despite advancements in deep learning (DL) techniques for automating these analyses, they typically require extensive datasets to train models effectively. The focus of this study is presenting a robust methodology to perform retinal image segmentation using a minimal amount of training data—only 55 images—without data augmentation.

Background

Retinal OCT is a high-resolution, non-invasive imaging technique providing detailed cross-sections (B-scans) of the retina, facilitating the analysis of structural changes indicative of diseases. Historically, segmentation methods in this domain have evolved from traditional techniques, like graph-based methods and active contours, to more sophisticated approaches involving CNNs and U-Net architectures. Such DL models have shown promise, particularly in segmenting complex anatomical structures and fluid accumulations, which are critical in disease progression analysis.

Methodology

The proposed method, named CoNet, is an adaptation of the U-Net architecture designed specifically for small datasets. It comprises several key components: an encoding path, bottleneck structure, Atrous Spatial Pyramid Pooling (ASPP) block, and a decoding path. Notably, the CoNet architecture modifies conventional practices by incorporating multi-scale context information through the ASPP block, enhancing segmentation accuracy even with limited data.

Figure 1: The ASPP block in the proposed model captures multi-scale information by applying multiple parallel filters with different frequencies.

CoNet utilizes a shallower network architecture, reducing from 5 to 3 convolutional blocks, optimizing for the small dataset size while maintaining the complexity needed to capture nuanced retinal features. The encoding path emphasizes spatial localization through convolutional layers with specific kernel sizes tailored to the B-scan's geometry, whereas the decoding path focuses on precision in pixel-level classification.

Experiments and Results

The study employed the Duke DME dataset, which contains 110 B-scans from patients with severe DME, annotated by experts to ensure reliable ground truth segmentation. Training was conducted using 55 B-scans, adhering to a systematic K-fold cross-validation strategy to ensure robust evaluation.

Figure 2: Annotation and labelling of the 10 segments (7 retinal layers, 2 backgrounds, and 1 fluid) in the Duke DME dataset.

CoNet demonstrated superior performance metrics. The model achieved a mean Dice Score of 88%, surpassing both human expert annotations and the state-of-the-art ReLayNet, particularly excelling in the segmentation of challenging fluid regions, which often exhibit significant variability.

Figure 3: Examples to illustrate the visualization output of U-Net, ReLayNet, and the proposed CoNet, in order of the inputs, annotations, and outputs with orange arrows to demonstrate fine details picked up by the models.

Conclusion

The CoNet model exhibits significant potential in automated retinal image analysis, especially for applications constrained by limited data availability. Its architecture, featuring an ASPP block and a reduced convolutional depth, adapts well to the challenges of small datasets without compromising accuracy. This model sets a precedent for future exploration into similarly constrained medical imaging fields. Further advancements may involve extending CoNet to three-dimensional networks or applying it to diverse retinal pathologies to validate its applicability and resilience across varied diagnostic challenges.

Markdown