BetterNet: An Efficient CNN Architecture with Residual Learning and Attention for Precision Polyp Segmentation (2405.04288v1)
Abstract: Colorectal cancer contributes significantly to cancer-related mortality. Timely identification and removal of polyps through colonoscopy screening is crucial for reducing mortality rates. Accurately detecting polyps in colonoscopy images is difficult because polyps vary in size, shape, and texture and often resemble the surrounding tissue. Current deep-learning methods often struggle to capture the long-range dependencies necessary for segmentation. This research presents BetterNet, a convolutional neural network (CNN) architecture that combines residual learning and attention mechanisms to improve the accuracy of polyp segmentation. The main contributions are: (1) a residual decoder architecture that facilitates efficient gradient propagation and integration of multiscale features; (2) channel and spatial attention blocks within the decoder that focus learning on the relevant polyp regions; (3) state-of-the-art performance on polyp segmentation benchmarks with retained computational efficiency; (4) thorough ablation tests that confirm the influence of the architectural components; and (5) open-source model code for further contribution. Extensive evaluations on datasets such as Kvasir-SEG, CVC ClinicDB, Endoscene, EndoTect, and Kvasir-Sessile demonstrate that BetterNet outperforms current state-of-the-art (SOTA) models in segmentation accuracy by significant margins. The lightweight design enables real-time inference for various applications. BetterNet shows promise for integration into computer-assisted diagnosis to improve polyp detection and the early recognition of cancer. Link to the code: https://github.com/itsOwen/BetterNet
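The abstract names channel and spatial attention inside a residual decoder but does not give their exact form. As a rough illustration only, the NumPy sketch below shows one common way such a block can be built: a squeeze-and-excitation style channel gate, a simple per-pixel spatial gate, and a skip connection. Everything here is an assumption for illustration and is not taken from the BetterNet code: the function names, the reduction to `C // 4` hidden units, and the spatial gate that uses no learned convolution.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Hypothetical channel gate for a feature map x of shape (H, W, C).

    Global-average-pools each channel, passes the result through a small
    two-layer bottleneck (w1: (C, C//r), w2: (C//r, C)), and rescales
    every channel by a weight in (0, 1).
    """
    squeezed = x.mean(axis=(0, 1))           # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)  # ReLU bottleneck, (C//r,)
    gate = sigmoid(hidden @ w2)              # (C,)
    return x * gate                          # broadcast over H and W


def spatial_attention(x):
    """Hypothetical spatial gate: one weight in (0, 1) per pixel.

    Uses the channel-wise mean and max as the descriptor; a learned
    convolution would normally follow, omitted here for brevity.
    """
    avg = x.mean(axis=-1, keepdims=True)     # (H, W, 1)
    mx = x.max(axis=-1, keepdims=True)       # (H, W, 1)
    gate = sigmoid(avg + mx)                 # (H, W, 1)
    return x * gate                          # broadcast over C


def attention_residual_block(x, w1, w2):
    """Channel then spatial attention, added back to the input (skip path)."""
    refined = spatial_attention(channel_attention(x, w1, w2))
    return x + refined


# Example with random stand-in features and weights (not trained values).
rng = np.random.default_rng(0)
features = rng.random((8, 8, 16))            # H=8, W=8, C=16
w1 = rng.standard_normal((16, 4))            # reduction ratio r=4 (assumed)
w2 = rng.standard_normal((4, 16))
out = attention_residual_block(features, w1, w2)
print(out.shape)
```

The skip connection means the block can only reweight and add to the incoming features, never discard them, which is the property that residual designs rely on for stable gradient flow.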