
When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out (2404.12295v1)

Published 18 Apr 2024 in cs.CV

Abstract: A substantial body of research has focused on developing systems that assist medical professionals during labor-intensive early screening processes, many based on convolutional deep-learning architectures. Recently, multiple studies explored the application of so-called self-attention mechanisms in the vision domain. These studies often report empirical improvements over fully convolutional approaches on various datasets and tasks. To evaluate this trend for medical imaging, we extend two widely adopted convolutional architectures with different self-attention variants on two different medical datasets. With this, we aim to specifically evaluate the possible advantages of additional self-attention. We compare our models with similarly sized convolutional and attention-based baselines and evaluate performance gains statistically. Additionally, we investigate how including such layers changes the features learned by these models during the training. Following a hyperparameter search, and contrary to our expectations, we observe no significant improvement in balanced accuracy over fully convolutional models. We also find that important features, such as dermoscopic structures in skin lesion images, are still not learned by employing self-attention. Finally, analyzing local explanations, we confirm biased feature usage. We conclude that merely incorporating attention is insufficient to surpass the performance of existing fully convolutional methods.
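The evaluation described in the abstract (balanced accuracy, compared statistically between models) can be illustrated with a minimal sketch. It assumes that each model is trained several times and that the resulting balanced accuracies are compared with Welch's unequal-variance t-test; the abstract does not name the test, so that choice is an assumption, as are the function names `balanced_accuracy` and `welch_t` and the scores in the usage example, which are made up and not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; unlike plain accuracy, it is not
    dominated by the majority class on imbalanced datasets."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return mean(recalls)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical balanced accuracies over five training runs each
    # (illustrative values only, NOT results from the paper).
    conv_runs = [0.81, 0.80, 0.82, 0.79, 0.81]
    attn_runs = [0.80, 0.82, 0.81, 0.80, 0.79]
    t, df = welch_t(attn_runs, conv_runs)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

A t statistic close to zero relative to the critical value for the computed degrees of freedom would correspond to the "no significant improvement" outcome the abstract reports; turning the statistic into a p-value additionally requires the Student t distribution, which is left out here to keep the sketch dependency-free.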

Authors (4)
  1. Tristan Piater
  2. Niklas Penzel
  3. Gideon Stein
  4. Joachim Denzler