DCT-HistoTransformer: Efficient Lightweight Vision Transformer with DCT Integration for histopathological image analysis

Published 24 Oct 2024 in cs.CV (arXiv:2410.19166v1)

Abstract: In recent years, the integration of advanced imaging techniques with deep learning methods has substantially improved computer-aided diagnosis (CAD) systems for breast cancer detection and classification. Transformers, which have shown great promise in computer vision, are now being applied to medical image analysis. Their application to histopathological images is challenging, however: these models need large amounts of data to work effectively, and the extensive manual annotation of whole-slide images (WSIs) this requires is costly and time-consuming. Furthermore, the quadratic computational cost of Vision Transformers (ViTs) is particularly prohibitive for large, high-resolution histopathological images, especially on edge devices with limited computational resources. In this study, we introduce a novel lightweight transformer-based approach to breast cancer classification that operates effectively without large datasets. By incorporating parallel processing pathways for Discrete Cosine Transform (DCT) Attention and MobileConv, we convert image data from the spatial domain to the frequency domain, exploiting benefits such as the filtering out of high frequencies, which reduces computational cost. Our proposed model achieves an accuracy of 96.00% $\pm$ 0.48% for binary classification and 87.85% $\pm$ 0.93% for multiclass classification, comparable to state-of-the-art models while significantly reducing computational costs. This demonstrates the potential of our approach to improve breast cancer classification on histopathological images, offering a more efficient solution with reduced reliance on extensive annotated datasets.
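The mechanism the abstract describes can be made concrete with a short sketch. The following is a minimal, hedged illustration of the idea, not the authors' released implementation: a global pathway that applies self-attention to the low-frequency block of a 2D DCT of the feature map (discarding high frequencies to shrink the token count), run in parallel with a MobileNet-style depthwise-separable convolution pathway. All specifics here (the `keep` parameter, the additive residual fusion, the layer sizes) are my own assumptions for illustration.

```python
# Illustrative sketch only (assumed design, not the paper's code): a DCT-based
# low-pass attention pathway in parallel with a depthwise-separable conv pathway.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of size n x n."""
    i = torch.arange(n).unsqueeze(0)          # spatial index
    k = torch.arange(n).unsqueeze(1)          # frequency index
    basis = torch.cos(math.pi * (2 * i + 1) * k / (2 * n)) * math.sqrt(2 / n)
    basis[0] /= math.sqrt(2)                  # DC row scaling for orthonormality
    return basis

class DCTAttention(nn.Module):
    """Self-attention over the low-frequency block of a 2D DCT.

    Keeping only a keep x keep block of coefficients shrinks the token count
    from grid^2 to keep^2, so the quadratic attention cost drops accordingly.
    """
    def __init__(self, dim: int, grid: int, keep: int, heads: int = 4):
        super().__init__()
        self.keep = keep
        self.register_buffer("D", dct_matrix(grid))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):                           # x: (B, C, grid, grid)
        b, c, g, _ = x.shape
        freq = self.D @ x @ self.D.T                # 2D DCT, per channel
        low = freq[..., :self.keep, :self.keep]     # low-pass: top-left block
        tokens = low.flatten(2).transpose(1, 2)     # (B, keep^2, C)
        out, _ = self.attn(tokens, tokens, tokens)  # attention on fewer tokens
        low = out.transpose(1, 2).reshape(b, c, self.keep, self.keep)
        # Zero-pad high frequencies back and invert the orthonormal DCT.
        pad = F.pad(low, (0, g - self.keep, 0, g - self.keep))
        return self.D.T @ pad @ self.D

class MobileConv(nn.Module):
    """Depthwise-separable convolution: the local, MobileNet-style pathway."""
    def __init__(self, dim: int):
        super().__init__()
        self.dw = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depthwise
        self.pw = nn.Conv2d(dim, dim, 1)                         # pointwise
    def forward(self, x):
        return self.pw(F.gelu(self.dw(x)))

class ParallelBlock(nn.Module):
    """Global (DCT attention) and local (MobileConv) paths run in parallel."""
    def __init__(self, dim: int, grid: int, keep: int):
        super().__init__()
        self.global_path = DCTAttention(dim, grid, keep)
        self.local_path = MobileConv(dim)
    def forward(self, x):
        return x + self.global_path(x) + self.local_path(x)  # residual fusion

# Example: 64-channel features on a 14x14 grid, low-pass kept to 7x7.
x = torch.randn(2, 64, 14, 14)
print(ParallelBlock(64, grid=14, keep=7)(x).shape)  # torch.Size([2, 64, 14, 14])
```

Under these assumptions, a 14x14 grid with keep=7 gives the attention pathway 49 tokens instead of 196, a 16x reduction in the quadratic attention cost; this is the kind of saving the abstract attributes to frequency-domain filtering.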
