LHU-Net: A Light Hybrid U-Net for Cost-Efficient, High-Performance Volumetric Medical Image Segmentation (2404.05102v2)
Abstract: The rise of Transformer architectures has revolutionized medical image segmentation, leading to hybrid models that combine Convolutional Neural Networks (CNNs) and Transformers for enhanced accuracy. However, these models often suffer from increased complexity and overlook the interplay between spatial and channel features, which is vital for segmentation precision. We introduce LHU-Net, a streamlined Hybrid U-Net for volumetric medical image segmentation, designed to analyze spatial features first and channel features second for effective feature extraction. Tested on five benchmark datasets (Synapse, LA, Pancreas, ACDC, BraTS 2018), LHU-Net demonstrated superior efficiency and accuracy, notably achieving a 92.66 Dice score on ACDC with 85% fewer parameters and a quarter of the computational cost of leading models. This performance, achieved without pre-training, extra data, or model ensembles and using under 11 million parameters, sets new benchmarks for computational efficiency and accuracy in segmentation. It shows that balancing computational efficiency with high accuracy in medical image segmentation is feasible. Our implementation of LHU-Net is freely accessible to the research community on GitHub (https://github.com/xmindflow/LHUNet).
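For readers unfamiliar with the metric quoted above, the Dice score measures the overlap between a predicted mask P and a ground-truth mask T as 2|P∩T| / (|P| + |T|), and is often reported on a 0 to 100 scale, as with the 92.66 figure. The following is a minimal sketch for a single binary mask flattened to a list of 0/1 voxels. The function name `dice_score` and the empty-mask convention are assumptions for illustration; the abstract does not specify how the authors aggregate the score across classes or cases.

```python
def dice_score(pred, target):
    """Dice = 2|P∩T| / (|P| + |T|) for two binary masks.

    `pred` and `target` are flat sequences of 0/1 voxel labels
    (a 3D volume would be flattened before calling this).
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks empty: treated as perfect agreement (an assumed convention).
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 foreground voxels in each mask, 2 of them shared.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, target)
print(f"Dice: {100 * score:.2f}")  # prints "Dice: 66.67"
```

For multi-class segmentation, a common choice is to compute this per class and average, though the exact protocol varies between benchmarks.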