
Medical Visual Prompting (MVP): A Unified Framework for Versatile and High-Quality Medical Image Segmentation (2404.01127v1)

Published 1 Apr 2024 in cs.CV and cs.AI

Abstract: Accurate segmentation of lesion regions is crucial for clinical diagnosis and treatment across various diseases. While deep convolutional networks have achieved satisfactory results in medical image segmentation, they face challenges such as loss of lesion shape information caused by repeated convolution and downsampling, as well as the high cost of manually labeling lesions of varying shapes and sizes. To address these issues, we propose a novel medical visual prompting (MVP) framework that leverages pre-training and prompting concepts from natural language processing (NLP). The framework utilizes three key components: Super-Pixel Guided Prompting (SPGP), which applies superpixel segmentation to the input image; Image Embedding Guided Prompting (IEGP), which freezes the patch embedding and merges it with the superpixels to provide visual prompts; and Adaptive Attention Mechanism Guided Prompting (AAGP), which pinpoints prompt content and efficiently adapts all layers. By integrating SPGP, IEGP, and AAGP, MVP enables the segmentation network to better learn shape-prompting information and facilitates mutual learning across different tasks. Extensive experiments on five datasets demonstrate the method's superior performance on various challenging medical image tasks while simplifying single-task medical segmentation models. The framework achieves improved performance with fewer parameters and holds significant potential for accurate segmentation of lesion regions across medical tasks, making it clinically valuable.
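The three components described in the abstract can be loosely sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the superpixel step (SPGP) is approximated by patch-aligned mean colours rather than a real SLIC segmentation, the frozen patch embedding (IEGP) uses a random projection in place of pre-trained ViT weights, and the adaptive gate (AAGP) is a simple sigmoid over token-prompt similarity. The function name `medical_visual_prompt` and all shapes are hypothetical.

```python
import numpy as np

def medical_visual_prompt(image, patch=16, dim=64, seed=0):
    """Sketch of MVP-style prompting: superpixel features (SPGP) are merged
    with a frozen patch embedding (IEGP) and gated by an attention-style
    weight (AAGP). Projections here are random stand-ins, not learned."""
    rng = np.random.default_rng(seed)
    H, W, C = image.shape
    n_h, n_w = H // patch, W // patch

    # IEGP: frozen (non-trainable) linear patch embedding, ViT-style.
    W_embed = rng.standard_normal((patch * patch * C, dim)) / np.sqrt(patch * patch * C)
    patches = (image.reshape(n_h, patch, n_w, patch, C)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(n_h * n_w, patch * patch * C))
    tokens = patches @ W_embed                        # (num_patches, dim)

    # SPGP: crude superpixel stand-in -- mean colour per patch-aligned cell
    # (a real implementation would use e.g. SLIC superpixels instead).
    sp = image.reshape(n_h, patch, n_w, patch, C).mean(axis=(1, 3))
    W_prompt = rng.standard_normal((C, dim)) / np.sqrt(C)
    prompts = sp.reshape(n_h * n_w, C) @ W_prompt     # (num_patches, dim)

    # AAGP: adaptive gate deciding how much prompt each token absorbs.
    gate = 1.0 / (1.0 + np.exp(-(tokens * prompts).sum(-1, keepdims=True)))
    return tokens + gate * prompts                    # prompted tokens
```

For a 64x64 RGB input with 16x16 patches, this yields 16 prompted tokens of dimension 64; in the actual framework these would feed the segmentation decoder while only the prompt-related parameters are tuned.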

