- The paper introduces a novel PIPO-FAN architecture that leverages multi-scale feature abstraction to improve segmentation accuracy on partially labeled datasets.
- It employs an equal convolutional depth mechanism combined with deep supervision and adaptive fusion to effectively integrate features across multiple scales.
- Benchmark results demonstrate competitive Dice scores across various organs on public datasets, highlighting the clinical potential of the proposed approach.
Multi-organ Segmentation over Partially Labeled Datasets with Multi-scale Feature Abstraction
This paper addresses the challenge of automatically segmenting multiple organs in medical images using deep learning models, specifically when only partially labeled datasets are available. The authors propose a unified training strategy that leverages multi-scale feature abstraction through a novel deep neural network architecture, the Pyramid Input Pyramid Output Feature Abstraction Network (PIPO-FAN).
Key Contributions and Methodology
- Multi-scale Feature Abstraction: Motivated by the shortage of fully annotated datasets for training deep segmentation models, the authors propose a network that extracts and integrates features from multiple input scales. A U-shaped pyramid structure supports hierarchical feature extraction, capturing fine texture detail alongside broader anatomical context.
- Equal Convolutional Depth Mechanism: A key contribution is the equal convolutional depth mechanism, which narrows the semantic gap that arises when features from different scales are merged directly by ensuring that every scale passes through the same number of convolutional layers (see the first sketch after this list).
- Deep Supervision and Adaptive Fusion: To refine the outputs at each scale, the paper applies deep supervision, guiding learning at every scale level and strengthening the learned representations. This is complemented by an adaptive weighting mechanism that fuses the scale-wise outputs automatically, emphasizing the more informative scales for a given input (see the second sketch after this list).
- Unified Training over Partially Labeled Datasets: The authors devise a target adaptive loss that allows a single model to be trained across datasets with inconsistent annotations, pooling information from different sources into one architecture. This makes a much broader range of existing data usable and improves robustness and segmentation accuracy (see the loss sketch after this list).
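To make the pyramid-input and equal-depth ideas concrete, the following PyTorch sketch processes each downsampled copy of the input through a branch with the same convolutional depth before fusing the features. The module names, channel widths, scale count, and fusion by concatenation are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch, depth=2):
    """A stack of `depth` conv-BN-ReLU layers (the 'equal depth' unit)."""
    layers = []
    for i in range(depth):
        layers += [
            nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        ]
    return nn.Sequential(*layers)

class PyramidInputEncoder(nn.Module):
    """Illustrative encoder: each input scale is processed by a branch of
    identical convolutional depth before the features are merged."""
    def __init__(self, in_ch=1, base_ch=32, num_scales=3, depth=2):
        super().__init__()
        # One branch per scale; every branch has the same conv depth.
        self.branches = nn.ModuleList(
            [conv_block(in_ch, base_ch, depth) for _ in range(num_scales)]
        )
        self.fuse = conv_block(base_ch * num_scales, base_ch, depth)

    def forward(self, x):
        feats = []
        for s, branch in enumerate(self.branches):
            # Build the image pyramid by progressive downsampling.
            x_s = F.avg_pool2d(x, kernel_size=2 ** s) if s > 0 else x
            f_s = branch(x_s)
            # Upsample back to full resolution so features can be concatenated.
            if s > 0:
                f_s = F.interpolate(f_s, size=x.shape[-2:], mode='bilinear',
                                    align_corners=False)
            feats.append(f_s)
        return self.fuse(torch.cat(feats, dim=1))
```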
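The adaptive fusion step can be sketched similarly: per-pixel weights are predicted from the stacked scale-wise predictions and normalized with a softmax, so the most informative scale dominates at each location. The weight head and its inputs below are assumptions about one plausible realization; deep supervision would simply add a loss on each element of `preds` in addition to the fused output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFusion(nn.Module):
    """Illustrative adaptive fusion of multi-scale segmentation outputs."""
    def __init__(self, num_scales, num_classes):
        super().__init__()
        # 1x1 conv maps the concatenated predictions to one weight map per scale.
        self.weight_head = nn.Conv2d(num_scales * num_classes, num_scales, kernel_size=1)

    def forward(self, preds):
        # preds: list of (B, C, H, W) logits, one tensor per scale,
        # already upsampled to a common resolution.
        stacked = torch.cat(preds, dim=1)                       # (B, S*C, H, W)
        weights = F.softmax(self.weight_head(stacked), dim=1)   # (B, S, H, W)
        fused = sum(w.unsqueeze(1) * p                          # broadcast over classes
                    for w, p in zip(weights.unbind(dim=1), preds))
        return fused
```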
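One plausible reading of the target adaptive loss is that, for each partially labeled dataset, the predicted probabilities of organs not annotated in that dataset are folded into the background before a standard cross-entropy is computed. The sketch below implements that interpretation; `labeled_classes`, the channel ordering, and the cross-entropy choice are assumptions rather than the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def target_adaptive_loss(logits, target, labeled_classes):
    """Cross-entropy over a partially labeled dataset (illustrative).

    logits:          (B, C, H, W) predictions over all organ classes + background (class 0).
    target:          (B, H, W) ground truth indexed over this dataset's own label set.
    labeled_classes: classes actually annotated in this dataset, e.g. [0, 2]
                     for background + liver in a liver-only dataset.
    """
    probs = F.softmax(logits, dim=1)
    unlabeled = [c for c in range(logits.shape[1]) if c not in labeled_classes]
    merged = []
    for c in labeled_classes:
        if c == 0:
            # Fold every unlabeled organ into the background probability.
            merged.append(probs[:, [0] + unlabeled].sum(dim=1))
        else:
            merged.append(probs[:, c])
    merged = torch.stack(merged, dim=1)        # (B, len(labeled_classes), H, W)
    # `target` must index into the merged channel order (0..len(labeled_classes)-1).
    return F.nll_loss(torch.log(merged + 1e-8), target)
```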
Results and Implications
The proposed approach was benchmarked against state-of-the-art networks on four publicly available datasets (BTCV, LiTS, KiTS, and Spleen), showing improved segmentation accuracy. Notably, PIPO-FAN achieved competitive Dice scores across the evaluated organs, demonstrating its ability to leverage partially labeled datasets for multi-organ segmentation.
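For reference, the Dice score reported here measures the overlap 2|P∩G| / (|P| + |G|) between a predicted mask P and a ground-truth mask G. A generic per-organ implementation (not tied to the paper's evaluation code) looks like this:

```python
import torch

def dice_score(pred_mask, gt_mask, eps=1e-6):
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks of a single organ."""
    pred = pred_mask.float().flatten()
    gt = gt_mask.float().flatten()
    intersection = (pred * gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)
```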
Future Prospects
This work points toward more general models that can handle diverse medical imaging tasks without requiring fully annotated datasets. By making effective use of partial annotations, the proposed techniques could substantially reduce the annotation burden in clinical settings and speed the deployment of AI-driven diagnostic tools.
Further work could combine this multi-scale, multi-dataset training approach with semi-supervised or transfer learning, potentially improving its ability to generalize across imaging modalities and clinical applications. Research institutions could also collaborate to curate complementary partially labeled datasets, broadening the application scope and further validating the proposed methods under diverse clinical conditions.