CE-Net: Context Encoder Network for 2D Medical Image Segmentation (1903.02740v1)

Published 7 Mar 2019 in cs.CV

Abstract: Medical image segmentation is an important step in medical image analysis. With the rapid development of convolutional neural networks in image processing, deep learning has been used for medical image segmentation, such as optic disc segmentation, blood vessel detection, lung segmentation, cell segmentation, etc. Previously, U-net based approaches have been proposed. However, the consecutive pooling and strided convolutional operations lead to the loss of some spatial information. In this paper, we propose a context encoder network (referred to as CE-Net) to capture more high-level information and preserve spatial information for 2D medical image segmentation. CE-Net mainly contains three major components: a feature encoder module, a context extractor and a feature decoder module. We use a pretrained ResNet block as the fixed feature extractor. The context extractor module is formed by a newly proposed dense atrous convolution (DAC) block and a residual multi-kernel pooling (RMP) block. We applied the proposed CE-Net to different 2D medical image segmentation tasks. Comprehensive results show that the proposed method outperforms the original U-Net method and other state-of-the-art methods for optic disc segmentation, vessel detection, lung segmentation, cell contour segmentation and retinal optical coherence tomography layer segmentation.

Authors (9)

Zaiwang Gu
Jun Cheng
Huazhu Fu
Kang Zhou
Huaying Hao
Yitian Zhao
Tianyang Zhang
Shenghua Gao
Jiang Liu

Citations (1,496)

Summary

The paper presents a novel architecture that enhances 2D medical image segmentation by integrating advanced context encoding modules.
It leverages a pretrained ResNet-34 with Dense Atrous Convolution and Residual Multi-kernel Pooling to capture multi-scale features.
Experimental results show improved accuracy and sensitivity across various tasks, demonstrating its potential for more precise medical analysis.

Overview of "CE-Net: Context Encoder Network for 2D Medical Image Segmentation"

The paper "CE-Net: Context Encoder Network for 2D Medical Image Segmentation" introduces a novel deep learning-based architecture, CE-Net, designed to enhance the performance of 2D medical image segmentation tasks. The proposed CE-Net aims to address the limitations of the traditional U-Net architecture by incorporating advanced feature extraction and contextual information retention mechanisms.

Key Components of CE-Net

The architecture of CE-Net is composed of three main components:

Feature Encoder Module: This module uses a pretrained ResNet-34 backbone to extract features from input images. ResNet's residual (shortcut) connections mitigate the vanishing-gradient problem, which makes a deeper encoder trainable and yields more robust feature representations.
Context Extractor Module: This module includes two newly proposed blocks:
- Dense Atrous Convolution (DAC) Block: The DAC block uses atrous convolutions with varying dilation rates to capture multi-scale features. This block efficiently combines multiple receptive fields to capture wider and deeper semantic contexts, enhancing feature representation.
- Residual Multi-kernel Pooling (RMP) Block: The RMP block employs multi-size pooling operations to further gather context information. This block integrates features collected at various scales, aiding in the retention of spatial information without increasing the computational burden significantly.
Feature Decoder Module: This module aims to restore the spatial resolution of feature maps. It uses transposed convolutions and skip connections from the encoder to the decoder to preserve spatial details and improve the accuracy of the final segmentation masks.
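
To make the "multiple receptive fields" of the DAC block concrete, the following minimal sketch computes the receptive field of a cascade of stride-1 convolutions. The branch layout and the dilation rates (1, 3, 5) are assumptions made for illustration; they are not stated in this summary and should be checked against the paper. The helper name `receptive_field` is hypothetical.

```python
def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs. Each layer widens
    the receptive field by (kernel_size - 1) * dilation pixels.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# Assumed DAC-style branches: cascades of 3x3 atrous convolutions,
# followed by a 1x1 convolution in the deeper branches.
DAC_BRANCHES = {
    "branch_1": [(3, 1)],
    "branch_2": [(3, 3), (1, 1)],
    "branch_3": [(3, 1), (3, 3), (1, 1)],
    "branch_4": [(3, 1), (3, 3), (3, 5), (1, 1)],
}

fields = {name: receptive_field(layers) for name, layers in DAC_BRANCHES.items()}
for name, rf in fields.items():
    print(f"{name}: {rf} x {rf}")
```

Under these assumed rates the four branches see 3, 7, 9, and 19 pixels per side, so summing their outputs mixes narrow and wide context without any downsampling.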

Experimental Setup and Evaluation

The proposed CE-Net was evaluated on multiple datasets, covering various segmentation tasks including optic disc segmentation, retinal vessel detection, lung segmentation, cell contour segmentation, and retinal optical coherence tomography (OCT) layer segmentation. The evaluation metrics included overlapping error, sensitivity, accuracy, and the Dice coefficient, depending on the dataset and task.
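
For reference, the metrics named above can be computed from a predicted and a ground-truth binary mask as in the sketch below. It uses the standard definitions (overlapping error as one minus intersection over union, sensitivity as TP / (TP + FN), and so on); the function names are hypothetical, and the paper's exact evaluation protocol per dataset may differ.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equally sized 2D binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return overlapping error, sensitivity, accuracy, and Dice coefficient."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    union = tp + fp + fn  # pixels in prediction OR ground truth
    total = tp + fp + fn + tn
    return {
        # 1 - |S intersect G| / |S union G|; two empty masks overlap perfectly
        "overlap_error": 1 - tp / union if union else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / total if total else 0.0,
        "dice": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
    }


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
metrics = segmentation_metrics(pred, truth)
print(metrics)
```

In this toy example two pixels are true positives, one is a false positive, one is a false negative, and two are true negatives, so the overlapping error is 0.5 and both sensitivity and Dice equal 2/3.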

Results and Implications

CE-Net outperformed the compared methods on every dataset and task reported:

Optic Disc Segmentation: On the ORIGA, Messidor, and RIM-ONE-R1 datasets, CE-Net achieved lower overlapping errors compared to established methods like U-Net and DeepDisc.
Retinal Vessel Detection: On the DRIVE dataset, CE-Net achieved higher sensitivity and accuracy compared to methods such as DeepVessel and HED.
Lung Segmentation: On the LUNA dataset, CE-Net outperformed U-Net in terms of overlapping error, sensitivity, and accuracy.
Cell Contour Segmentation: On the EM challenge dataset, CE-Net showed better performance in $V^{Rand}$ and $V^{Info}$ scores compared to U-Net and the residual-based backbone.
Retinal OCT Layer Segmentation: On the Topcon dataset, CE-Net achieved lower mean absolute errors in segmenting multiple retinal layers compared to SRR and U-Net.

The authors conducted ablation studies to validate the effectiveness of each component, revealing that both the DAC and RMP blocks significantly contributed to the enhanced performance of CE-Net. The inclusion of these blocks allowed for better contextual feature extraction and preservation of spatial details.
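
The multi-size pooling idea behind the RMP block can be illustrated with a small pure-Python sketch on a single-channel feature map. Several details here are assumptions rather than facts from this summary: the kernel sizes (2, 3, 5, 6), non-overlapping windows, and nearest-neighbour upsampling, which stands in for whatever interpolation the authors use. The per-branch channel-reducing convolution is omitted because the toy input has one channel. All function names are hypothetical.

```python
def max_pool2d(grid, k):
    """Non-overlapping k x k max pooling; partial border windows are dropped."""
    h, w = len(grid), len(grid[0])
    if k > h or k > w:
        raise ValueError("pooling kernel larger than feature map")
    return [
        [max(grid[r + dr][c + dc] for dr in range(k) for dc in range(k))
         for c in range(0, w - k + 1, k)]
        for r in range(0, h - k + 1, k)
    ]


def upsample_nearest(grid, out_h, out_w):
    """Resize a 2D grid to out_h x out_w by nearest-neighbour lookup."""
    h, w = len(grid), len(grid[0])
    return [
        [grid[r * h // out_h][c * w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def rmp_context_maps(feature, kernels=(2, 3, 5, 6)):
    """Return the input map plus one upsampled pooled map per kernel size."""
    h, w = len(feature), len(feature[0])
    maps = [feature]  # the original features are kept alongside the context maps
    for k in kernels:
        pooled = max_pool2d(feature, k)
        maps.append(upsample_nearest(pooled, h, w))
    return maps


feature = [[r * 6 + c for c in range(6)] for r in range(6)]
maps = rmp_context_maps(feature)
print(len(maps), "maps of size", len(maps[0]), "x", len(maps[0][0]))
```

Each pooled map summarizes the input at a coarser scale but is brought back to the input resolution before stacking, so the output keeps the original spatial size and gains only one extra map per kernel.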

Practical and Theoretical Implications

Practical Implications: The improved segmentation performance of CE-Net suggests its potential for more accurate and reliable medical image analysis, which can enhance clinical decision-making processes. By providing more precise delineation of anatomical structures, CE-Net can be instrumental in applications ranging from disease diagnosis to treatment planning.

Theoretical Implications: CE-Net's novel integration of atrous convolutions and multi-kernel pooling into a deep learning framework demonstrates significant progress in addressing the trade-off between feature abstraction and resolution retention. This architectural innovation may inspire further research into multi-scale feature extraction and efficient pooling strategies in neural network design.

Future Directions

Looking ahead, there are several avenues for future research and development:

3D Medical Image Segmentation: Extending CE-Net to handle 3D medical data could address the segmentation needs in volumetric imaging modalities like CT and MRI.
Real-time Segmentation: Optimizing CE-Net for real-time applications could broaden its utility in clinical settings where rapid image analysis is required.
Generalization to Other Domains: Exploring the application of CE-Net to non-medical domains could validate the versatility and robustness of its architectural innovations.

In conclusion, CE-Net advances medical image segmentation by capturing high-level contextual information while maintaining spatial resolution, which supports more accurate and reliable image analysis in medical and potentially other domains.
