- The paper proposes MSNet, a network that uses subtraction units for efficient, differential feature fusion in polyp segmentation.
- It employs a hierarchical multi-scale design supervised by a LossNet, achieving improvements of up to 14.1% in mean IoU on benchmark datasets.
- The model operates in real time (~70 fps), offering practical benefits for automated colonoscopy and early colorectal cancer detection.
Automatic Polyp Segmentation via Multi-scale Subtraction Network: An Expert Overview
The paper "Automatic Polyp Segmentation via Multi-scale Subtraction Network" addresses a critical challenge in medical imaging: the precise segmentation of polyps from colonoscopy images, which plays a pivotal role in the early detection and prevention of colorectal cancer. Colorectal cancer frequently originates from polyps, and early intervention is essential. Hence, developing reliable, automatic segmentation tools is a substantial clinical imperative.
Core Innovations and Methodology
The primary contribution of this study is the introduction of a Multi-scale Subtraction Network (MSNet) for polyp segmentation. The MSNet is built upon several novel components:
- Subtraction Unit (SU): Unlike traditional U-shaped architectures that rely heavily on element-wise addition or concatenation for feature fusion, MSNet fuses features by subtraction. The SU computes the difference between adjacent feature levels, highlighting their complementary information while avoiding the redundancy that additive fusion tends to accumulate.
- Multi-scale Feature Extraction: The authors construct a hierarchical pyramid of SUs that captures differential information across multiple scales, facilitating improved segmentation performance on varied polyp sizes and complexities.
- LossNet: An auxiliary supervisory network with no trainable parameters of its own, used to optimize the feature representation at each level, from fine detail to broader structure. Instead of heuristic supervision signals, LossNet applies a simple L2 loss between feature maps of the prediction and the ground truth, streamlining training while maintaining high segmentation accuracy.
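The fusion idea behind the first two components can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the authors' implementation: real feature maps are multi-channel tensors and the paper wraps each difference in a convolution, both of which are omitted here, and all function names are illustrative.

```python
def subtraction_unit(feat_a, feat_b):
    """Fuse two same-sized 2-D feature maps by the element-wise
    absolute difference |A - B| (the core of the SU, minus the conv)."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(feat_a, feat_b)]


def difference_pyramid(levels):
    """Build a hierarchy of pairwise differences between adjacent
    feature levels, mimicking the multi-scale pyramid of SUs
    (spatial resizing between levels is omitted for simplicity)."""
    pyramid = [levels]
    while len(pyramid[-1]) > 1:
        prev = pyramid[-1]
        pyramid.append([subtraction_unit(prev[i], prev[i + 1])
                        for i in range(len(prev) - 1)])
    return pyramid


# Toy 2x2 "feature maps" from adjacent encoder levels: responses that
# agree cancel out, while complementary responses survive the fusion.
high = [[0.9, 0.2], [0.4, 0.7]]
low = [[0.5, 0.2], [0.1, 0.9]]
diff = subtraction_unit(high, low)
```

Stacking `difference_pyramid` over the backbone's levels yields differences of differences, which is what lets the network attend to complementary detail at several scales rather than re-aggregating the same activations.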
The effectiveness of MSNet is underscored by its superior performance across five benchmark datasets, each posing distinct challenges in polyp segmentation. In particular, MSNet showed marked improvements on the CVC-ColonDB and ETIS datasets, with gains of up to 14.1% in mean IoU and similar gains in other evaluation metrics such as mean Dice score, weighted F-measure, S-measure, and E-measure. Impressively, the model runs at real-time speed (~70 fps for 352×352 images), making it a viable option for integration into clinical workflows.
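The LossNet supervision described above reduces, conceptually, to an L2 comparison of features extracted from the prediction and the ground truth. A minimal sketch in plain Python on flat feature vectors; the extractor stand-ins and function names are assumptions for illustration, not the paper's code (which uses a frozen pretrained backbone as the extractor):

```python
def l2_loss(feats_pred, feats_gt):
    """Mean squared difference between two equal-length feature vectors."""
    assert len(feats_pred) == len(feats_gt)
    return sum((p - g) ** 2 for p, g in zip(feats_pred, feats_gt)) / len(feats_pred)


def lossnet_loss(pred, gt, extractors):
    """Sum the L2 loss over several feature levels (detail -> structure).

    Each extractor maps an image/prediction to a feature vector; in the
    paper these come from a pretrained network with frozen weights, so
    LossNet adds supervision without adding trainable parameters.
    """
    return sum(l2_loss(fx(pred), fx(gt)) for fx in extractors)


# Two toy "extractors": identity (fine detail) and global mean (structure).
extractors = [lambda x: x, lambda x: [sum(x) / len(x)]]
loss = lossnet_loss([0.9, 0.1, 0.8], [1.0, 0.0, 1.0], extractors)
```

Because every term is a plain L2 distance, the loss is zero exactly when prediction and ground truth agree at all supervised levels, which is what makes this supervision "training-free" on the LossNet side.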
Implications and Future Directions
On a practical level, MSNet could significantly reduce the workload of healthcare professionals by automating polyp detection during colonoscopy, helping lower the incidence of colorectal cancer through timely intervention. From a theoretical perspective, subtraction-based feature fusion presents a compelling avenue for further exploration in convolutional network design, particularly where cross-level feature complementarity is crucial.
Future research could focus on several areas:
- Robustness Across Modalities: Extending MSNet to other imaging modalities might further establish its utility in medical image processing beyond colonoscopy.
- Enhanced Supervision Techniques: Exploring more sophisticated training-free supervisory networks that leverage contrastive or adversarial learning could provide even richer feature representations.
- Integration with Real-time Diagnostic Systems: Practical efforts towards embedding such networks into real-time diagnostic systems can provide end-to-end solutions for automated medical imaging analysis.
Overall, this work contributes a meaningful advance in medical image segmentation: its subtraction-based, multi-scale fusion is simple, fast, and accurate, and it sets a useful precedent for future architectures in which cross-level feature differences, rather than sums, drive the design.