
Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation (2403.17701v4)

Published 26 Mar 2024 in eess.IV, cs.CV, and cs.LG

Abstract: Image segmentation plays a vital role in medical diagnosis and treatment. Traditional convolutional neural networks (CNNs) and Transformer models have made significant advances in this area, but they still face challenges due to limited receptive fields or high computational complexity. Recently, State Space Models (SSMs), particularly Mamba and its variants, have shown notable performance in vision tasks. However, their feature extraction may not be sufficiently effective, and they retain redundant structures that leave room for parameter reduction. Motivated by previous spatial and channel attention methods, we propose Triplet Mamba-UNet (TM-UNet). The method leverages residual VSS blocks to extract rich contextual features, while a Triplet SSM module fuses features across the spatial and channel dimensions. Experiments on the ISIC17, ISIC18, CVC-300, CVC-ClinicDB, Kvasir-SEG, CVC-ColonDB, and Kvasir-Instrument datasets demonstrate the superior segmentation performance of the proposed TM-UNet. Moreover, compared with the previous VM-UNet, our model achieves a one-third reduction in parameters.
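
For intuition, below is a minimal PyTorch sketch of the "rotate to scan" idea behind a triplet-style SSM block: the same kind of 1-D sequence scan is run over three rotated views of a (B, C, H, W) feature map, along the width, along the height, and along the channel axis, and the three results are fused. This is an illustrative sketch rather than the paper's implementation: the selective state space (Mamba) scan is replaced by a plain GRU so the example runs with stock PyTorch, and every name here (TripletScanBlock, seq_hw, and so on) is hypothetical.

```python
import torch
import torch.nn as nn


class TripletScanBlock(nn.Module):
    """Toy triplet-scan block: one 1-D scan per rotated view (W, H, C)."""

    def __init__(self, channels: int, hidden: int = 32):
        super().__init__()
        # Stand-in for a selective SSM: any model mapping a sequence
        # (batch, length, features) -> (batch, length, features) fits here.
        self.seq_hw = nn.GRU(channels, hidden, batch_first=True)
        self.proj_hw = nn.Linear(hidden, channels)
        # The channel-axis scan sees scalars, so it gets its own scanner.
        self.seq_c = nn.GRU(1, hidden, batch_first=True)
        self.proj_c = nn.Linear(hidden, 1)

    def _scan(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (N, L, C) -> scanned features of the same shape.
        y, _ = self.seq_hw(seq)
        return self.proj_hw(y)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # View 1: scan each row left-to-right (sequence length W).
        xw = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        yw = self._scan(xw).reshape(b, h, w, c).permute(0, 3, 1, 2)
        # View 2: rotate so each column is scanned top-to-bottom (length H).
        xh = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        yh = self._scan(xh).reshape(b, w, h, c).permute(0, 3, 2, 1)
        # View 3: rotate again so the channel axis becomes the sequence,
        # letting the scan mix information across channels at each pixel.
        xc = x.permute(0, 2, 3, 1).reshape(b * h * w, c, 1)
        yc, _ = self.seq_c(xc)
        yc = self.proj_c(yc).reshape(b, h, w, c).permute(0, 3, 1, 2)
        # Average the three views and keep a residual connection.
        return x + (yw + yh + yc) / 3.0


if __name__ == "__main__":
    block = TripletScanBlock(channels=16)
    feats = torch.randn(2, 16, 32, 32)
    print(block(feats).shape)  # -> torch.Size([2, 16, 32, 32])
```

In the paper's Triplet SSM module the per-view scanner would be a selective SSM and the fusion need not be a simple average; the sketch only captures how rotating the tensor lets a single 1-D scanner cover both spatial and channel dimensions.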

References (30)
  1. J.-Z. Cheng, D. Ni, Y.-H. Chou, J. Qin, C.-M. Tiu, Y.-C. Chang, C.-S. Huang, D. Shen, and C.-M. Chen, “Computer-aided diagnosis with deep learning architecture: Applications to breast lesions in US images and pulmonary nodules in CT scans,” Scientific Reports, vol. 6, 2016.
  2. Cambridge, MA, USA: MIT Press, 1998.
  3. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, eds.), (Cham), pp. 234–241, Springer International Publishing, 2015.
  4. A. Vaswani, N. M. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Neural Information Processing Systems, 2017.
  5. X. Xiao, S. Lian, Z. Luo, and S. Li, “Weighted res-unet for high-quality retina vessel segmentation,” in 2018 9th International Conference on Information Technology in Medicine and Education (ITME), pp. 327–331, 2018.
  6. S. Guan, A. A. Khan, S. Sikdar, and P. V. Chitnis, “Fully dense unet for 2-d sparse photoacoustic tomography artifact removal,” IEEE Journal of Biomedical and Health Informatics, vol. 24, pp. 568–576, 2020.
  7. N. Ibtehaz and M. S. Rahman, “MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation,” Neural Networks, vol. 121, pp. 74–87, Jan. 2020.
  8. J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “Transunet: Transformers make strong encoders for medical image segmentation,” ArXiv, vol. abs/2102.04306, 2021.
  9. J. M. J. Valanarasu, P. Oza, I. Hacihaliloglu, and V. M. Patel, “Medical transformer: Gated axial-attention for medical image segmentation,” in Medical Image Computing and Computer Assisted Intervention – MICCAI 2021 (M. de Bruijne, P. C. Cattin, S. Cotin, N. Padoy, S. Speidel, Y. Zheng, and C. Essert, eds.), (Cham), pp. 36–46, Springer International Publishing, 2021.
  10. H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-unet: Unet-like pure transformer for medical image segmentation,” in Computer Vision – ECCV 2022 Workshops (L. Karlinsky, T. Michaeli, and K. Nishino, eds.), (Cham), pp. 205–218, Springer Nature Switzerland, 2023.
  11. A. Hatamizadeh, Y. Tang, V. Nath, D. Yang, A. Myronenko, B. Landman, H. R. Roth, and D. Xu, “Unetr: Transformers for 3d medical image segmentation,” in 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), (Los Alamitos, CA, USA), pp. 1748–1758, IEEE Computer Society, Jan. 2022.
  12. A. Lin, B. Chen, J. Xu, Z. Zhang, G. Lu, and D. Zhang, “Ds-transunet: Dual swin transformer u-net for medical image segmentation,” IEEE Transactions on Instrumentation and Measurement, vol. 71, pp. 1–15, 2022.
  13. H.-Y. Zhou, J. Guo, Y. Zhang, X. Han, L. Yu, L. Wang, and Y. Yu, “nnformer: Volumetric medical image segmentation via a 3d transformer,” IEEE Transactions on Image Processing, vol. 32, pp. 4036–4045, 2023.
  14. A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations, 2021.
  15. A. Gu, K. Goel, and C. Ré, “Efficiently modeling long sequences with structured state spaces,” ArXiv, vol. abs/2111.00396, 2021.
  16. A. Gu and T. Dao, “Mamba: Linear-time sequence modeling with selective state spaces,” 2024.
  17. L. Zhu, B. Liao, Q. Zhang, X. Wang, W. Liu, and X. Wang, “Vision mamba: Efficient visual representation learning with bidirectional state space model,” ArXiv, vol. abs/2401.09417, 2024.
  18. Y. Liu, Y. Tian, Y. Zhao, H. Yu, L. Xie, Y. Wang, Q. Ye, and Y. Liu, “Vmamba: Visual state space model,” ArXiv, vol. abs/2401.10166, 2024.
  19. Z. Wang, J.-Q. Zheng, Y. Zhang, G. Cui, and L. Li, “Mamba-unet: Unet-like pure visual mamba for medical image segmentation,” ArXiv, vol. abs/2402.05079, 2024.
  20. J. Ruan and S. Xiang, “Vm-unet: Vision mamba unet for medical image segmentation,” ArXiv, vol. abs/2402.02491, 2024.
  21. M. Zhang, Y. Yu, L. Gu, T. Lin, and X. Tao, “Vm-unet-v2: Rethinking vision mamba unet for medical image segmentation,” ArXiv, vol. abs/2403.09157, 2024.
  22. S. Woo, J. Park, J.-Y. Lee, and I.-S. Kweon, “Cbam: Convolutional block attention module,” ArXiv, vol. abs/1807.06521, 2018.
  23. J. Fu, J. Liu, H. Tian, Z. Fang, and H. Lu, “Dual attention network for scene segmentation,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3141–3149, 2019.
  24. Q. Hou, D. Zhou, and J. Feng, “Coordinate attention for efficient mobile network design,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13708–13717, 2021.
  25. J. Park, S. Woo, J.-Y. Lee, and I.-S. Kweon, “Bam: Bottleneck attention module,” ArXiv, vol. abs/1807.06514, 2018.
  26. D. Misra, T. Nalamada, A. U. Arasanipalai, and Q. Hou, “Rotate to attend: Convolutional triplet attention module,” in 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3138–3147, 2021.
  27. F. Chollet, “Xception: Deep learning with depthwise separable convolutions,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1800–1807, 2017.
  28. N. C. F. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler, and A. Halpern, “Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC),” in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 168–172, 2018.
  29. N. C. F. Codella, V. M. Rotemberg, P. Tschandl, M. E. Celebi, S. W. Dusza, D. Gutman, B. Helba, A. Kalloo, K. Liopyris, M. A. Marchetti, H. Kittler, and A. C. Halpern, “Skin lesion analysis toward melanoma detection 2018: A challenge hosted by the International Skin Imaging Collaboration (ISIC),” ArXiv, vol. abs/1902.03368, 2019.
  30. I. Loshchilov and F. Hutter, “Decoupled weight decay regularization,” in International Conference on Learning Representations, 2019.
Authors (6)
  1. Hao Tang (379 papers)
  2. Lianglun Cheng (4 papers)
  3. Guoheng Huang (12 papers)
  4. Zhengguang Tan (2 papers)
  5. Junhao Lu (7 papers)
  6. Kaihong Wu (1 paper)
Citations (9)
