MedSegDiff-V2: Diffusion based Medical Image Segmentation with Transformer (2301.11798v2)
Abstract: The Diffusion Probabilistic Model (DPM) has recently gained popularity in computer vision thanks to image generation applications such as Imagen, Latent Diffusion Models, and Stable Diffusion, which have demonstrated impressive capabilities and sparked much discussion within the community. Recent studies have further shown the utility of DPM in medical image analysis, as underscored by the strong performance of diffusion-based medical image segmentation models across various tasks. Although these models were originally built on a UNet architecture, their performance could potentially be improved by integrating vision transformer mechanisms. However, we found that simply combining the two resulted in subpar performance. To effectively integrate these two techniques for medical image segmentation, we propose a novel Transformer-based diffusion framework, called MedSegDiff-V2. We verify its effectiveness on 20 medical image segmentation tasks with different image modalities. Through comprehensive evaluation, our approach outperforms prior state-of-the-art (SOTA) methods. Code is released at https://github.com/KidsWithTokens/MedSegDiff