
WiTUnet: A U-Shaped Architecture Integrating CNN and Transformer for Improved Feature Alignment and Local Information Fusion (2404.09533v2)

Published 15 Apr 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Low-dose computed tomography (LDCT) has become the technology of choice for diagnostic medical imaging because it delivers a lower radiation dose than standard CT, although the reduced dose increases image noise and can affect diagnostic accuracy. To address this, advanced deep learning-based LDCT denoising algorithms have been developed, primarily using Convolutional Neural Networks (CNNs) or Transformer networks within the Unet architecture. This architecture enhances image detail by integrating feature maps from the encoder and decoder via skip connections. However, current methods often overlook enhancements to the Unet architecture itself, focusing instead on optimizing the encoder and decoder structures. This can be problematic because feature maps from the encoder and decoder differ significantly in character, so simple fusion strategies may not reconstruct images effectively. In this paper, we introduce WiTUnet, a novel LDCT image denoising method that uses nested, dense skip pathways instead of traditional skip connections to improve feature integration. WiTUnet also incorporates a windowed Transformer structure that processes images in smaller, non-overlapping segments, reducing computational load. Additionally, a Local Image Perception Enhancement (LiPe) module in both the encoder and decoder replaces the standard multi-layer perceptron (MLP) in Transformers, enhancing local feature capture and representation. In extensive experimental comparisons, WiTUnet demonstrates superior performance over existing methods on key metrics such as Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), and Root Mean Square Error (RMSE), significantly improving noise removal and image quality.
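
The abstract describes two of the core building blocks: window-based self-attention that restricts attention to small non-overlapping patches, and a LiPe module that replaces the Transformer's MLP with a locally aware feed-forward stage. The sketch below is not the authors' code; it only illustrates how such a block could be assembled in PyTorch. The depthwise-convolution form of LiPe, the use of nn.MultiheadAttention without relative position bias, and all layer sizes are assumptions made for illustration.

```python
# Illustrative sketch only: a windowed Transformer block whose feed-forward stage is a
# convolutional "local perception" module, approximating the LiPe idea from the abstract.
# The exact LiPe design, window size, and attention variant used by WiTUnet are assumptions.
import torch
import torch.nn as nn


def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows of shape (ws*ws, C)."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)


def window_reverse(windows, window_size, H, W):
    """Merge windows back into a (B, H, W, C) feature map."""
    B = windows.shape[0] // ((H // window_size) * (W // window_size))
    x = windows.reshape(B, H // window_size, W // window_size, window_size, window_size, -1)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(B, H, W, -1)


class LiPe(nn.Module):
    """Assumed LiPe variant: pointwise-depthwise-pointwise convolutions in place of the MLP."""

    def __init__(self, dim, hidden_ratio=4):
        super().__init__()
        hidden = dim * hidden_ratio
        self.net = nn.Sequential(
            nn.Conv2d(dim, hidden, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, padding=1, groups=hidden),  # depthwise: local context
            nn.GELU(),
            nn.Conv2d(hidden, dim, kernel_size=1),
        )

    def forward(self, x):                        # x: (B, H, W, C)
        x = x.permute(0, 3, 1, 2).contiguous()   # to (B, C, H, W) for convolution
        x = self.net(x)
        return x.permute(0, 2, 3, 1)             # back to (B, H, W, C)


class WindowedTransformerBlock(nn.Module):
    """Self-attention restricted to non-overlapping windows, followed by LiPe instead of an MLP."""

    def __init__(self, dim, num_heads=4, window_size=8):
        super().__init__()
        self.window_size = window_size
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.lipe = LiPe(dim)

    def forward(self, x):                        # x: (B, H, W, C); H, W divisible by window_size
        B, H, W, C = x.shape
        shortcut = x
        w = window_partition(self.norm1(x), self.window_size)
        w, _ = self.attn(w, w, w)                # attention sees only tokens in the same window
        x = shortcut + window_reverse(w, self.window_size, H, W)
        return x + self.lipe(self.norm2(x))      # residual LiPe stage replaces the usual MLP


if __name__ == "__main__":
    block = WindowedTransformerBlock(dim=32)
    out = block(torch.randn(1, 64, 64, 32))
    print(out.shape)                             # torch.Size([1, 64, 64, 32])
```

In the full model described in the abstract, blocks of this kind would sit in both the encoder and decoder, connected by nested, dense skip pathways rather than plain skip connections; that wiring is omitted here.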

Authors (6)
  1. Bin Wang (750 papers)
  2. Fei Deng (35 papers)
  3. Peifan Jiang (7 papers)
  4. Shuang Wang (159 papers)
  5. Xiao Han (127 papers)
  6. Zhixuan Zhang (8 papers)
Citations (3)