
Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation (2312.15182v1)

Published 23 Dec 2023 in eess.IV, cs.CV, and cs.LG

Abstract: Most state-of-the-art methods for medical image segmentation adopt the encoder-decoder architecture. However, this U-shaped framework, with its simple skip connections, still has limitations in capturing non-local multi-scale information. To address this problem, we first explore the potential weaknesses of skip connections in U-Net on multiple segmentation tasks and find that i) not all skip connections are useful, and each skip connection contributes differently; ii) the optimal combination of skip connections differs from one dataset to another. Based on these findings, we propose a new segmentation framework, named UDTransNet, to solve three semantic gaps in U-Net. Specifically, we propose a Dual Attention Transformer (DAT) module that captures channel-wise and spatial-wise relationships to better fuse the encoder features, and a Decoder-guided Recalibration Attention (DRA) module that connects the DAT tokens with the decoder features to eliminate the inconsistency between them. Together, the two modules establish a learnable connection that closes the semantic gaps between the encoder and the decoder, which leads to a high-performance segmentation model for medical images. Comprehensive experimental results indicate that UDTransNet achieves higher evaluation scores and finer segmentation results with relatively fewer parameters than state-of-the-art segmentation methods on different public datasets. Code: https://github.com/McGregorWwww/UDTransNet.
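
The idea that each skip connection contributes differently can be illustrated with a toy gated skip connection. The sketch below is not the DAT or DRA module from the paper; it only assumes a single learnable scalar gate per skip connection, and the names `gated_skip_fusion` and `gate_logit` are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gated_skip_fusion(decoder_feat, encoder_feat, gate_logit):
    """Add an encoder (skip) feature to a decoder feature, scaled by a gate.

    gate_logit stands in for a learnable parameter (an assumption of this
    sketch): a large negative value effectively switches the skip connection
    off, while a large positive value passes the encoder feature through.
    """
    gate = sigmoid(gate_logit)
    return [d + gate * e for d, e in zip(decoder_feat, encoder_feat)]


# Half-open gate: the encoder feature contributes with weight 0.5.
fused_half = gated_skip_fusion([1.0, 2.0], [2.0, 4.0], gate_logit=0.0)

# Nearly closed gate: the output is almost exactly the decoder feature.
fused_off = gated_skip_fusion([1.0, 2.0], [2.0, 4.0], gate_logit=-50.0)
```

In a trained network, one such gate per skip connection could settle at different values on different datasets, which is one simple way to picture the two findings stated in the abstract.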
