HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network that Achieves over 0.9 Mean Dice and 86 FPS (2101.07172v2)

Published 18 Jan 2021 in cs.CV

Abstract: We propose a new convolution neural network called HarDNet-MSEG for polyp segmentation. It achieves SOTA in both accuracy and inference speed on five popular datasets. For Kvasir-SEG, HarDNet-MSEG delivers 0.904 mean Dice running at 86.7 FPS on a GeForce RTX 2080 Ti GPU. It consists of a backbone and a decoder. The backbone is a low memory traffic CNN called HarDNet68, which has been successfully applied to various CV tasks including image classification, object detection, multi-object tracking and semantic segmentation, etc. The decoder part is inspired by the Cascaded Partial Decoder, known for fast and accurate salient object detection. We have evaluated HarDNet-MSEG using those five popular datasets. The code and all experiment details are available at Github. https://github.com/james128333/HarDNet-MSEG

Citations (164)

Summary

  • The paper introduces HarDNet-MSEG, a novel encoder-decoder CNN achieving over 0.9 Mean Dice for polyp segmentation.
  • It leverages the efficient HarDNet68 backbone and selective convolutions to reduce inference time by around 30%.
  • Experiments demonstrate faster and more accurate segmentation compared to state-of-the-art models, enhancing clinical diagnostics.

Overview of HarDNet-MSEG: A Simple Encoder-Decoder Polyp Segmentation Neural Network

The paper presents HarDNet-MSEG, an encoder-decoder convolutional neural network (CNN) for fast and accurate segmentation of colorectal polyps. The model is evaluated on five widely used datasets: Kvasir-SEG, CVC-ColonDB, EndoScene, ETIS-Larib Polyp DB, and CVC-ClinicDB. With a mean Dice coefficient above 0.9 at 86 FPS on a GeForce RTX 2080 Ti GPU, HarDNet-MSEG outperforms earlier state-of-the-art models, including U-Net[ResNet34] (U-Net with a ResNet34 backbone) and PraNet.
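To make the throughput figure concrete, the snippet below is a minimal sketch of how frames per second could be measured and converted to per-frame latency. The function names `measure_fps` and `fps_to_latency_ms` and the dummy workload are hypothetical; the paper's actual benchmarking protocol (batch size, input resolution, warm-up) is not described in this summary.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return frames per second for `infer` over `frames` (illustrative helper)."""
    # Warm-up calls are excluded from timing so one-off setup costs do not skew the result.
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    # Note: for a GPU model, the device would also need to be synchronized
    # before reading the clock, since GPU work is typically queued asynchronously.
    return len(frames) / elapsed


def fps_to_latency_ms(fps):
    """Convert throughput in frames per second to average latency in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Stand-in for a real segmentation model: any callable taking one frame.
    dummy_infer = lambda frame: sum(range(1000))
    fps = measure_fps(dummy_infer, frames=list(range(200)))
    print(f"measured: {fps:.1f} FPS")
    print(f"86.7 FPS corresponds to {fps_to_latency_ms(86.7):.2f} ms per frame")
```

At the reported 86.7 FPS, each frame takes roughly 11.5 ms on average, well under the 33 ms budget of a 30 FPS video stream.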

Architectural Features

HarDNet-MSEG uses HarDNet68 as its backbone, chosen for its low memory traffic. HarDNet68 has proven effective across a range of computer vision tasks, and it improves computational efficiency by using fewer shortcut connections and by employing 1x1 convolutions. This reduces inference time by approximately 30% compared to architectures like DenseNet and ResNet without compromising accuracy. The decoder draws on the Cascaded Partial Decoder: it concentrates computation on deeper-layer features and discards shallow features. Feature maps at different scales are then aggregated through selective convolutions and skip connections.
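The reduction in shortcut connections can be illustrated with the harmonic connection rule from the original HarDNet paper, which is not spelled out in this summary: layer k takes input from layer k - 2^n whenever 2^n divides k. The sketch below compares that pattern with DenseNet-style all-to-all links. The function names are illustrative, and the real HarDNet also adjusts channel widths per layer, which this sketch omits.

```python
def harmonic_links(k):
    """Earlier layers feeding layer k under the harmonic rule (layer 0 is the block input)."""
    links = []
    step = 1
    while step <= k:
        # Connect to layer k - 2^n only when 2^n divides k.
        if k % step == 0:
            links.append(k - step)
        step *= 2
    return links


def dense_links(k):
    """Earlier layers feeding layer k in a DenseNet-style block: all of them."""
    return list(range(k - 1, -1, -1))


if __name__ == "__main__":
    depth = 8
    for k in range(1, depth + 1):
        print(f"layer {k}: harmonic {harmonic_links(k)}  dense {dense_links(k)}")
    harmonic_total = sum(len(harmonic_links(k)) for k in range(1, depth + 1))
    dense_total = sum(len(dense_links(k)) for k in range(1, depth + 1))
    print(f"total links: harmonic {harmonic_total}, dense {dense_total}")
```

For an 8-layer block this gives 15 links under the harmonic rule against 36 for dense connectivity, so far fewer feature maps need to be read back from memory and concatenated.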

Experiments and Comparative Analysis

The experiments span several datasets, and HarDNet-MSEG outperforms existing solutions in both mean Dice and mIoU across all tested scenarios. On the Kvasir-SEG dataset in particular, the model reaches a mean Dice score of 0.904, and it maintains superior accuracy on other datasets such as CVC-ColonDB and ETIS. Comparison with other state-of-the-art networks also highlights its efficiency: HarDNet-MSEG runs at nearly twice the speed of PraNet while achieving higher accuracy.
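For reference, both metrics measure overlap between a predicted mask and the ground truth: Dice is 2|A∩B| / (|A| + |B|) and IoU is |A∩B| / |A∪B|. The snippet below is a minimal sketch using plain Python lists of 0/1 values. It assumes per-image scores averaged over the dataset; the paper's exact averaging and thresholding protocol is not stated in this summary, and the function names are illustrative.

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Dice and IoU for two binary masks given as 2-D lists of 0/1 integers."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t      # pixel is foreground in both masks
            pred_sum += p
            target_sum += t
    union = pred_sum + target_sum - inter
    # eps avoids division by zero when both masks are empty.
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_dice_and_iou(pairs):
    """Average per-image Dice and IoU over (pred, target) pairs."""
    scores = [dice_and_iou(pred, target) for pred, target in pairs]
    mean_dice = sum(d for d, _ in scores) / len(scores)
    mean_iou = sum(i for _, i in scores) / len(scores)
    return mean_dice, mean_iou


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    dice, iou = dice_and_iou(pred, target)
    print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

In this toy case the masks share 2 foreground pixels out of 3 each, giving a Dice of about 0.667 and an IoU of 0.5. Dice is never lower than IoU for the same prediction, which is why the reported mean Dice values sit above the mIoU values.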

Key Results

  1. In one experimental setup on the Kvasir-SEG dataset, HarDNet-MSEG achieves 0.912 mean Dice and 0.857 mIoU. It also outperforms PraNet across multiple datasets while running notably faster.
  2. The model predicts polyp boundaries more accurately, which is vital for effective polyp segmentation and useful for clinical applications in early cancer detection through colonoscopy.

Implications and Future Directions

The results point to the potential of HarDNet-MSEG in medical imaging, particularly for automated, real-time analysis of colonoscopy frames, which is important for reducing the colorectal cancer burden. Its speed and efficient use of resources mark a step towards neural networks that are practical to deploy in clinical environments. Future work could explore integrating attention mechanisms or further improving backbone efficiency to raise both segmentation accuracy and computational efficiency.

The research provides a solid basis for further work on CNN-based segmentation in medical diagnostics, particularly as tools like HarDNet-MSEG move towards practical use. Better automated detection and analysis in medical imaging systems could in turn improve public health outcomes in colorectal cancer diagnosis and treatment.
