- The paper's main contribution is a modified U-Net architecture that combines segmentation maps produced at multiple scales and uses concatenation in its long skip connections for better feature integration.
- It addresses challenges such as class imbalance and limited labeled data with a loss function based on the Jaccard similarity index.
- Results on hand and brain MRI show that the multi-scale outputs speed up convergence and that combining imaging modalities, particularly Flair and T1C, improves segmentation accuracy.
Insights into CNN-based Segmentation of Medical Imaging Data
The paper, "CNN-based Segmentation of Medical Imaging Data," explores the efficacy of Convolutional Neural Networks (CNNs) in the domain of medical image segmentation, specifically focusing on brain and hand MRI images. This exploration builds on the recent advances in semantic segmentation and applies them to medical data, which poses unique challenges such as the scarcity of labeled data, class imbalance, and the substantial memory demand of 3D images.
Overview of CNN Architecture and Adaptations
The authors leverage a U-Net-like architecture, initially introduced by Ronneberger et al., with two specific modifications. First, they combine segmentation maps produced at multiple scales, a technique common in fully convolutional networks (FCNs) that speeds up convergence, although it does not necessarily improve final performance. Second, they compare element-wise summation with concatenation for forwarding feature maps through long skip connections, and find that concatenation yields better results, likely because it keeps local and global features distinct before they are integrated.
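The sketch below, written in PyTorch (an assumed framework, not specified by the paper), illustrates the difference between the two merge strategies at a single decoder stage; the module name `SkipBlock` and its channel handling are illustrative only.

```python
import torch
import torch.nn as nn

class SkipBlock(nn.Module):
    """Decoder step that merges an upsampled feature map with an encoder
    feature map via a long skip connection (illustrative sketch, not the
    authors' exact architecture)."""

    def __init__(self, channels, mode="concat"):
        super().__init__()
        self.mode = mode
        # Concatenation doubles the channel count; summation keeps it fixed.
        in_ch = 2 * channels if mode == "concat" else channels
        self.conv = nn.Conv3d(in_ch, channels, kernel_size=3, padding=1)

    def forward(self, decoder_feat, encoder_feat):
        if self.mode == "concat":
            # Keep local (encoder) and global (decoder) features as
            # separate channel streams and let the convolution mix them.
            merged = torch.cat([decoder_feat, encoder_feat], dim=1)
        else:
            # Element-wise summation collapses both streams immediately.
            merged = decoder_feat + encoder_feat
        return torch.relu(self.conv(merged))
```

With `mode="concat"`, encoder and decoder features occupy separate channels until the convolution mixes them, whereas summation forces an immediate element-wise merge of the two streams.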
The authors highlight how the framework handles class imbalance, a significant concern in medical imaging. Instead of relying on class-weighted loss functions, this work employs a loss function based on the Jaccard similarity index, which accommodates imbalanced datasets while reducing the number of hyperparameters that must be tuned.
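As a rough illustration, a soft Jaccard loss of the kind below (a generic sketch, not necessarily the authors' exact formulation) shows why no class weights are needed: the loss is a ratio of overlap to union, so it is largely insensitive to how small the foreground class is.

```python
import torch

def soft_jaccard_loss(pred, target, eps=1e-6):
    """Differentiable Jaccard (IoU) loss for binary segmentation.

    pred:   predicted foreground probabilities, shape (N, ...)
    target: binary ground-truth mask with the same shape as pred
    """
    pred = pred.reshape(pred.size(0), -1)
    target = target.reshape(target.size(0), -1).float()
    intersection = (pred * target).sum(dim=1)
    union = pred.sum(dim=1) + target.sum(dim=1) - intersection
    jaccard = (intersection + eps) / (union + eps)
    # Minimizing 1 - J pushes predictions toward high overlap with the
    # foreground regardless of its size, so no class weights are needed.
    return (1.0 - jaccard).mean()
```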
Application and Results
The paper validates the proposed methodology on two distinct medical imaging tasks: hand and brain MRI. For brain MRI, the method is evaluated on the BRATS dataset, which contains high- and low-grade gliomas imaged in multiple modalities (Flair, T1, T1C, and T2). The dataset's multi-modal nature makes it possible to study how different combinations of input modalities affect segmentation accuracy across tumor regions.
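A minimal sketch of how such multi-modal inputs are typically prepared is shown below, with co-registered modalities stacked as input channels; the use of nibabel for I/O and the per-modality normalization are assumptions for illustration, not details taken from the paper.

```python
import numpy as np
import nibabel as nib  # assumed choice of NIfTI I/O library

def load_multimodal_volume(paths):
    """Stack co-registered MRI modalities (e.g. Flair, T1, T1C, T2) into
    a single multi-channel volume of shape (C, D, H, W).
    `paths` maps modality name -> NIfTI file path (illustrative only)."""
    volumes = []
    for name, path in paths.items():
        vol = nib.load(path).get_fdata().astype(np.float32)
        # Per-modality intensity normalization (zero mean, unit variance).
        vol = (vol - vol.mean()) / (vol.std() + 1e-8)
        volumes.append(vol)
    return np.stack(volumes, axis=0)

# Example: feed any subset of modalities by changing the dictionary.
# x = load_multimodal_volume({"flair": "sub01_flair.nii.gz",
#                             "t1c": "sub01_t1c.nii.gz"})
```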
The experimentation delineates several key observations:
- The concatenation approach for skip connections outperforms summation due to its ability to preserve and utilize separate streams of local and global features.
- The use of multi-scale segmentation outputs facilitates faster convergence, an operational advantage even when final segmentation accuracy remains comparable (see the sketch after this list).
- In the segmentation of brain tumors, the inclusion of multiple imaging modalities significantly influences the segmentation performance, with Flair and T1C proving particularly beneficial.
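To make the multi-scale observation concrete, the sketch below combines per-scale predictions in a deep-supervision style; the class `MultiScaleHead` and its details are illustrative assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleHead(nn.Module):
    """Combine segmentation maps predicted at several decoder scales
    (a sketch of the multi-scale idea; layer choices are illustrative)."""

    def __init__(self, channels_per_scale, num_classes):
        super().__init__()
        # One 1x1x1 classifier per decoder resolution.
        self.heads = nn.ModuleList(
            nn.Conv3d(c, num_classes, kernel_size=1) for c in channels_per_scale
        )

    def forward(self, decoder_feats, out_shape):
        # Predict at each scale, upsample to full resolution, and sum.
        # The lower-resolution predictions give the gradient a shorter
        # path to early layers, which is what speeds up convergence.
        logits = [
            F.interpolate(head(f), size=out_shape, mode="trilinear",
                          align_corners=False)
            for head, f in zip(self.heads, decoder_feats)
        ]
        return torch.stack(logits, dim=0).sum(dim=0)
```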
Implications and Future Directions
The research illuminates several critical facets in the field of CNN-based medical image segmentation. The proposed methodology's scalability across various segmentation tasks signals its potential for broader application across different types of medical images and imaging technologies (e.g., CT or ultrasound).
Two avenues present themselves for future exploration: integrating healthy brain scans into the dataset to mitigate false positives observed in synthetic healthy brain images, and further optimizing loss functions tailored for medical image segmentation tasks. These steps could bolster segmentation performance, especially in datasets exhibiting severe class imbalance or sparsity in available labeled data.
The paper’s analysis underscores the necessity of strategic architectural enhancements and methodological refinements in tackling the inherent challenges in 3D medical imaging, paving the way for more advanced and precise segmentation solutions.