SurvMamba: State Space Model with Multi-grained Multi-modal Interaction for Survival Prediction (2404.08027v2)

Published 11 Apr 2024 in cs.CV, cs.AI, cs.LG, and q-bio.QM

Abstract: Multi-modal learning that combines pathological images with genomic data has significantly enhanced the accuracy of survival prediction. Nevertheless, existing methods have not fully utilized the inherent hierarchical structure within both whole slide images (WSIs) and transcriptomic data, from which better intra-modal representations and inter-modal integration could be derived. Moreover, many existing studies attempt to improve multi-modal representations through attention mechanisms, which inevitably incur high computational complexity when processing high-dimensional WSIs and transcriptomic data. Recently, a structured state space model named Mamba emerged as a promising approach for its superior performance in modeling long sequences with low complexity. In this study, we propose Mamba with multi-grained multi-modal interaction (SurvMamba) for survival prediction. SurvMamba is implemented with a Hierarchical Interaction Mamba (HIM) module that facilitates efficient intra-modal interactions at different granularities, thereby capturing more detailed local features as well as rich global representations. In addition, an Interaction Fusion Mamba (IFM) module is used for cascaded inter-modal interactive fusion, yielding more comprehensive features for survival prediction. Comprehensive evaluations on five TCGA datasets demonstrate that SurvMamba outperforms existing methods in both predictive performance and computational cost.
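
The complexity argument in the abstract can be illustrated with a toy example. The sketch below is not SurvMamba, HIM, IFM, or Mamba itself: it is a minimal scalar linear state space recurrence with fixed, hypothetical parameters `a`, `b`, and `c` (Mamba additionally makes its parameters depend on the input). It only shows that one pass over a sequence of length L takes L update steps, whereas full pairwise attention compares every token with every other token.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Toy scalar state space recurrence (illustrative, not the paper's model).

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Returns the outputs and the number of update steps performed.
    """
    h = 0.0          # hidden state, carried across the sequence
    ys = []
    steps = 0
    for x in xs:     # a single pass: one state update per input element
        h = a * h + b * x
        ys.append(c * h)
        steps += 1
    return ys, steps


def pairwise_interactions(seq_len):
    """Number of token pairs scored by full self-attention on seq_len tokens."""
    return seq_len * seq_len


if __name__ == "__main__":
    ys, steps = ssm_scan([1.0, 0.0, 0.0])
    print(ys, steps)
    # Hypothetical sequence length, chosen only to compare growth rates.
    print(pairwise_interactions(10_000))
```

With the impulse input `[1.0, 0.0, 0.0]`, the state simply decays by the factor `a` at each step, which makes the recurrence easy to check by hand.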

Authors (6)
  1. Ying Chen (333 papers)
  2. Jiajing Xie (1 paper)
  3. Yuxiang Lin (7 papers)
  4. Yuhang Song (36 papers)
  5. Wenxian Yang (10 papers)
  6. Rongshan Yu (6 papers)
Citations (5)