HyperMM: Robust Multimodal Learning with Varying-sized Inputs (2407.20768v1)

Published 30 Jul 2024 in cs.LG

Abstract: Combining multiple modalities that carry complementary information through multimodal learning (MML) has shown considerable benefits for diagnosing multiple pathologies. However, the robustness of multimodal models to missing modalities is often overlooked. Most works assume modality completeness in the input data, whereas in clinical practice it is common for modalities to be incomplete. Existing solutions to this problem rely on modality imputation strategies applied before a supervised learning model is used. These strategies, however, are complex and computationally costly, and they can strongly affect subsequent prediction models; they should therefore be used sparingly in sensitive applications such as healthcare. We propose HyperMM, an end-to-end framework designed for learning with varying-sized inputs. Specifically, we focus on supervised MML with missing imaging modalities, without using imputation before training. We introduce a novel strategy for training a universal feature extractor using a conditional hypernetwork, and propose a permutation-invariant neural network that can handle inputs of varying dimensions to process the extracted features, in a two-phase, task-agnostic framework. We experimentally demonstrate the advantages of our method on two tasks: Alzheimer's disease detection and breast cancer classification. We show that our strategy is robust to high rates of missing data and that its flexibility allows it to handle varying-sized datasets beyond the missing-modality scenario.
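The permutation-invariant, varying-size aggregation described in the abstract can be sketched in the Deep Sets style: a shared encoder is applied to each available modality, the resulting features are pooled with a symmetric operation, and a head processes the pooled representation. The sketch below is a minimal illustration with toy numpy networks; all dimensions, weights, and function names are hypothetical and are not taken from the paper (in particular, `phi` stands in for HyperMM's universal feature extractor, which the paper actually trains with a conditional hypernetwork).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (not from the paper): per-modality feature dim 8,
# shared embedding dim 16, 2 output classes.
D, H, C = 8, 16, 2
W_phi = rng.standard_normal((D, H))  # shared per-modality encoder weights
W_rho = rng.standard_normal((H, C))  # classifier head weights

def phi(x):
    # Shared encoder applied to every modality (stand-in for the
    # universal feature extractor).
    return np.tanh(x @ W_phi)

def rho(z):
    # Head applied to the aggregated, fixed-size representation.
    return z @ W_rho

def predict(modalities):
    # Symmetric pooling (mean) over however many modalities are present:
    # the output does not depend on input order or count, so missing
    # modalities are handled without any imputation.
    z = np.mean([phi(m) for m in modalities], axis=0)
    return rho(z)

# Three toy "modalities" of one subject; one may be absent at test time.
m1, m2, m3 = (rng.standard_normal(D) for _ in range(3))
full = predict([m1, m2, m3])
permuted = predict([m3, m1, m2])   # same set, different order
partial = predict([m1, m3])        # m2 missing: still a valid input
```

Because the pooling is a mean, `full` and `permuted` are identical, and `partial` has the same shape as `full` despite being computed from fewer inputs, which is the flexibility the abstract claims for varying-sized datasets.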

Authors (4)
  1. Hava Chaptoukaev
  2. Vincenzo Marcianó
  3. Francesco Galati
  4. Maria A. Zuluaga