RetSeg: Retention-based Colorectal Polyps Segmentation Network (2310.05446v5)
Abstract: Vision Transformers (ViTs) have transformed medical image analysis, outperforming conventional Convolutional Neural Networks (CNNs) in key tasks such as polyp classification, detection, and segmentation. By using attention mechanisms to focus on specific image regions, ViTs process visual data with contextual awareness, yielding robust and precise predictions even for intricate medical images. Moreover, the self-attention mechanism in Transformers accommodates varying input sizes and resolutions, a flexibility unavailable in traditional CNNs. However, self-attention burdens Transformers with excessive memory usage and costly, sequentially bottlenecked inference, rendering them impractical for real-time disease detection on resource-constrained devices. In this study, we address these hurdles by investigating the integration of the recently introduced retention mechanism into polyp segmentation, and introduce RetSeg, an encoder-decoder network built from multi-head retention blocks. Drawing inspiration from Retentive Networks (RetNet), RetSeg is designed to bridge the gap between precise polyp segmentation and efficient resource utilization, tailored specifically for colonoscopy images. We train and validate RetSeg on two publicly available datasets, Kvasir-SEG and CVC-ClinicDB, and further demonstrate its promising performance on diverse public datasets, including CVC-ColonDB, ETIS-LaribPolypDB, CVC-300, and BKAI-IGH NeoPolyp. As this work represents an early-stage exploration, further in-depth studies are needed to consolidate these promising findings.
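The key property of the retention mechanism that the abstract alludes to is its dual formulation: a parallel form for training and an exactly equivalent recurrent form with constant per-step state for inference. The following is a minimal single-head sketch of that duality (numpy; this is an illustration of the RetNet retention operator, not RetSeg's implementation — it omits the multi-head structure, xPos rotations, and normalization used in the actual networks):

```python
import numpy as np

def retention_parallel(X, Wq, Wk, Wv, gamma):
    """Parallel form: out = (Q K^T * D) V, where D is a causal decay mask
    with D[n, m] = gamma**(n - m) for n >= m and 0 otherwise."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    n = X.shape[0]
    idx = np.arange(n)
    # Lower-triangular decay matrix: exponential discount by distance.
    D = np.where(idx[:, None] >= idx[None, :],
                 gamma ** np.maximum(idx[:, None] - idx[None, :], 0), 0.0)
    return (Q @ K.T * D) @ V

def retention_recurrent(X, Wq, Wk, Wv, gamma):
    """Recurrent form, equivalent to the parallel form:
    S_n = gamma * S_{n-1} + K_n^T V_n,   out_n = Q_n S_n.
    The state S has fixed size, so per-step memory is O(1) in sequence length."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    S = np.zeros((Q.shape[1], V.shape[1]))
    out = np.zeros_like(V)
    for t in range(X.shape[0]):
        S = gamma * S + np.outer(K[t], V[t])
        out[t] = Q[t] @ S
    return out
```

The parallel form trains as fast as self-attention, while the recurrent form replaces the growing key-value cache of a Transformer with a fixed-size state — this is the trade-off that motivates applying retention to real-time segmentation on constrained hardware.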