Multi-modal Learning with Missing Modality in Predicting Axillary Lymph Node Metastasis (2401.01553v1)

Published 3 Jan 2024 in eess.IV and cs.CV

Abstract: Multi-modal learning has attracted widespread attention in medical image analysis. Using multi-modal data, such as whole slide images (WSIs) and clinical information, can improve the performance of deep learning models in diagnosing axillary lymph node metastasis. However, clinical information is difficult to collect in practice due to privacy concerns, limited resources, a lack of interoperability, and similar obstacles. Although patient selection can ensure that the training set contains multi-modal data for model development, the clinical-information modality may still be missing at test time. This typically degrades performance, which limits the use of multi-modal models in the clinic. To alleviate this problem, we propose a bidirectional distillation framework consisting of a multi-modal branch and a single-modal branch. The single-modal branch acquires complete multi-modal knowledge from the multi-modal branch, while the multi-modal branch learns robust WSI features from the single-modal branch. We conduct experiments on a public dataset of Lymph Node Metastasis in Early Breast Cancer to validate the method. Our approach not only achieves state-of-the-art performance with an AUC of 0.861 on the test set without missing data, but also yields an AUC of 0.842 when 80% of the clinical modality is missing. This demonstrates the effectiveness of the approach in handling multi-modal data with missing modalities. Such a model has the potential to improve treatment decision-making for early breast cancer patients based on their axillary lymph node metastatic status.
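The core idea of the framework, two branches trained jointly so that knowledge is distilled in both directions, can be illustrated with a short sketch. The following PyTorch code is a minimal, hypothetical rendering and not the authors' implementation: the module names, feature dimensions, additive fusion, logit-level KL distillation with temperature T, and the loss weight alpha are all assumptions made here; the paper's actual architecture and training losses may differ.

```python
# Minimal sketch of bidirectional distillation between a multi-modal branch
# (WSI + clinical) and a single-modal branch (WSI only). All names, shapes,
# and hyperparameters below are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiModalBranch(nn.Module):
    """Fuses slide-level WSI features with tabular clinical features."""

    def __init__(self, wsi_dim=512, clin_dim=16, hidden=256, n_classes=2):
        super().__init__()
        self.wsi_proj = nn.Linear(wsi_dim, hidden)
        self.clin_proj = nn.Linear(clin_dim, hidden)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, wsi, clin):
        # Simple additive fusion of the two projected modalities (an assumption).
        fused = F.relu(self.wsi_proj(wsi) + self.clin_proj(clin))
        return self.classifier(fused)


class SingleModalBranch(nn.Module):
    """Sees only WSI features; used at test time when clinical data is missing."""

    def __init__(self, wsi_dim=512, hidden=256, n_classes=2):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(wsi_dim, hidden), nn.ReLU())
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, wsi):
        return self.classifier(self.encoder(wsi))


def bidirectional_distillation_loss(logits_mm, logits_sm, labels, T=2.0, alpha=0.5):
    """Both branches are supervised by the labels; each also distills from a
    detached copy of the other's logits, so knowledge flows in both directions."""
    ce = F.cross_entropy(logits_mm, labels) + F.cross_entropy(logits_sm, labels)
    # Single-modal branch absorbs multi-modal knowledge.
    kd_sm = F.kl_div(F.log_softmax(logits_sm / T, dim=1),
                     F.softmax(logits_mm.detach() / T, dim=1),
                     reduction="batchmean") * T * T
    # Multi-modal branch is regularized toward the robust WSI-only view.
    kd_mm = F.kl_div(F.log_softmax(logits_mm / T, dim=1),
                     F.softmax(logits_sm.detach() / T, dim=1),
                     reduction="batchmean") * T * T
    return ce + alpha * (kd_sm + kd_mm)


# Toy training step on random tensors (shapes are placeholders).
wsi = torch.randn(8, 512)            # precomputed slide-level WSI features
clin = torch.randn(8, 16)            # tabular clinical features
labels = torch.randint(0, 2, (8,))   # node-metastasis labels

mm_branch, sm_branch = MultiModalBranch(), SingleModalBranch()
loss = bidirectional_distillation_loss(mm_branch(wsi, clin), sm_branch(wsi), labels)
loss.backward()
```

At deployment, a sample with missing clinical information would be routed through the single-modal branch alone; in this sketch that routing decision is what lets the model keep most of its performance under high missing-modality rates.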

Authors (5)
  1. Shichuan Zhang (24 papers)
  2. Sunyi Zheng (22 papers)
  3. Zhongyi Shui (22 papers)
  4. Honglin Li (32 papers)
  5. Lin Yang (212 papers)
Citations (4)