ParaTransCNN: Parallelized TransCNN Encoder for Medical Image Segmentation (2401.15307v1)

Published 27 Jan 2024 in eess.IV and cs.CV

Abstract: Convolutional neural network (CNN)-based methods have become increasingly popular for medical image segmentation due to their outstanding performance. However, they struggle to capture long-range dependencies, which are essential for accurately modeling global contextual correlations. Thanks to their ability to model long-range dependencies by expanding the receptive field, Transformer-based methods have gained prominence. Inspired by this, we propose an advanced 2D feature extraction method that combines the convolutional neural network and Transformer architectures. More specifically, we introduce a parallelized encoder structure, where one branch uses ResNet to extract local information from images, while the other branch uses a Transformer to extract global information. Furthermore, we integrate pyramid structures into the Transformer to extract global information at varying resolutions, which matters especially for dense prediction tasks. To efficiently utilize the different information from the parallelized encoder at the decoder stage, we use a channel attention module to merge the encoder features and propagate them through the skip connections and the bottleneck. Extensive numerical experiments are performed on aortic vessel tree, cardiac, and multi-organ datasets. Compared with state-of-the-art medical image segmentation methods, our method achieves better segmentation accuracy, especially on small organs. The code is publicly available at https://github.com/HongkunSun/ParaTransCNN.
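
The fusion step described in the abstract, merging the features of the two encoder branches with channel attention, can be illustrated with a small sketch. The abstract does not specify the design of the module, so the code below assumes a squeeze-and-excitation style gate; the names `channel_attention_fuse`, `w_reduce`, and `w_expand` are hypothetical, and the weights are random rather than learned. It is not the authors' implementation, which lives in the repository linked above.

```python
import math
import random


def global_avg_pool(channel):
    """Mean of one H x W channel (the 'squeeze' step)."""
    total = sum(sum(row) for row in channel)
    count = sum(len(row) for row in channel)
    return total / count


def matvec(matrix, vec):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention_fuse(cnn_feats, trans_feats, w_reduce, w_expand):
    """Assumed SE-style fusion: concatenate both branches along the channel
    axis, then rescale every channel by a gate in (0, 1)."""
    fused = cnn_feats + trans_feats                  # channel-wise concatenation
    squeezed = [global_avg_pool(ch) for ch in fused]  # one scalar per channel
    hidden = [max(0.0, h) for h in matvec(w_reduce, squeezed)]  # ReLU
    weights = [sigmoid(e) for e in matvec(w_expand, hidden)]    # per-channel gate
    reweighted = [
        [[weights[c] * v for v in row] for row in ch]
        for c, ch in enumerate(fused)
    ]
    return reweighted, weights


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy inputs: 2 channels per branch, 4 x 4 spatial size (illustrative only).
rng = random.Random(0)
H, W = 4, 4
cnn_feats = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(2)]
trans_feats = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(2)]

C, reduced = 4, 2                        # total channels and bottleneck width
w_reduce = random_matrix(reduced, C, rng)  # stands in for a learned layer
w_expand = random_matrix(C, reduced, rng)  # stands in for a learned layer

fused, weights = channel_attention_fuse(cnn_feats, trans_feats, w_reduce, w_expand)
print(len(fused), "channels, gates:", [round(w, 3) for w in weights])
```

In this sketch the gate lets the network emphasize whichever branch, local (CNN) or global (Transformer), is more informative for a given channel before the fused features are passed on to the decoder.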

Authors (3)
  1. Hongkun Sun (2 papers)
  2. Jing Xu (244 papers)
  3. Yuping Duan (18 papers)
Citations (4)
