Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining (2402.03302v2)

Published 5 Feb 2024 in cs.CV, cs.LG, and eess.IV

Abstract: Accurate medical image segmentation demands the integration of multi-scale information, spanning from local features to global dependencies. However, existing methods struggle to model long-range global information: convolutional neural networks (CNNs) are constrained by their local receptive fields, and vision transformers (ViTs) suffer from the quadratic complexity of their attention mechanism. Recently, Mamba-based models have attracted considerable attention for their impressive long-sequence modeling ability. Several studies have demonstrated that these models can outperform popular vision models on various tasks, offering higher accuracy, lower memory consumption, and a lighter computational burden. However, existing Mamba-based models are mostly trained from scratch and do not exploit the power of pretraining, which has proven highly effective for data-efficient medical image analysis. This paper introduces a novel Mamba-based model, Swin-UMamba, designed specifically for medical image segmentation and built to leverage ImageNet-based pretraining. Our experimental results reveal the vital role of ImageNet-based pretraining in enhancing the performance of Mamba-based models. Swin-UMamba outperforms CNNs, ViTs, and the latest Mamba-based models by a large margin. Notably, on the AbdomenMRI, Endoscopy, and Microscopy datasets, Swin-UMamba outperforms its closest counterpart, U-Mamba_Enc, by an average of 2.72%.

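For context on the linear-time sequence modeling that motivates Mamba-based segmentation backbones, the snippet below is a minimal NumPy sketch of a selective state-space (Mamba-style) recurrence with input-dependent step size, B, and C. It is not taken from the Swin-UMamba codebase; the function name, shapes, and parameterization are illustrative assumptions, and a real model would fuse this scan on GPU and embed it in a UNet-style encoder-decoder with a pretrained encoder.

```python
# Minimal sketch of a selective state-space (Mamba-style) scan, assuming
# simplified shapes and parameter names; not the authors' implementation.
import numpy as np

def selective_ssm(x, A, W_B, W_C, w_delta):
    """Simplified selective scan over a 1-D token sequence.

    x:        (L, D) input tokens (e.g. flattened image patch features)
    A:        (D, N) state-transition parameters (kept negative for stability)
    W_B, W_C: (D, N) projections that make B and C depend on the input
    w_delta:  (D,)   per-channel step-size parameters
    Returns y with shape (L, D); the cost grows linearly in L, in contrast
    to the quadratic cost of full self-attention.
    """
    L, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                     # one N-dimensional state per channel
    y = np.empty_like(x)
    for t in range(L):
        xt = x[t]                                        # (D,)
        delta = np.log1p(np.exp(xt * w_delta))           # softplus step size, (D,)
        B_t = xt @ W_B                                   # input-dependent B, (N,)
        C_t = xt @ W_C                                   # input-dependent C, (N,)
        A_bar = np.exp(delta[:, None] * A)               # discretized transition, (D, N)
        h = A_bar * h + (delta[:, None] * B_t[None, :]) * xt[:, None]
        y[t] = h @ C_t                                   # per-channel readout, (D,)
    return y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, D, N = 196, 8, 4                      # e.g. a 14x14 grid of patches, 8 channels
    x = rng.standard_normal((L, D))
    A = -np.abs(rng.standard_normal((D, N)))  # negative entries keep the state bounded
    y = selective_ssm(x, A,
                      0.1 * rng.standard_normal((D, N)),
                      0.1 * rng.standard_normal((D, N)),
                      0.1 * rng.standard_normal(D))
    print(y.shape)                           # (196, 8)
```

Because the hidden state is updated once per token, a forward pass costs O(L·D·N) rather than the O(L²·D) of full attention, which is the efficiency property the abstract contrasts with ViTs.
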
Authors (11)
  1. Jiarun Liu (17 papers)
  2. Hao Yang (328 papers)
  3. Hong-Yu Zhou (50 papers)
  4. Yan Xi (13 papers)
  5. Lequan Yu (89 papers)
  6. Yizhou Yu (148 papers)
  7. Yong Liang (32 papers)
  8. Guangming Shi (87 papers)
  9. Shaoting Zhang (133 papers)
  10. Hairong Zheng (71 papers)
  11. Shanshan Wang (166 papers)
Citations (89)
