
Dynamic Multimodal Information Bottleneck for Multimodality Classification (2311.01066v3)

Published 2 Nov 2023 in eess.IV and cs.CV

Abstract: Effectively leveraging multimodal data, such as various images, laboratory tests and clinical information, is gaining traction in a variety of AI-based medical diagnosis and prognosis tasks. Most existing multimodal techniques focus only on enhancing performance by exploiting the differences or shared features across modalities and fusing features from different modalities. These approaches are generally not optimal for clinical settings, which pose the additional challenges of limited training data and of being rife with redundant data or noisy modality channels, leading to subpar performance. To address this gap, we study the robustness of existing methods to data redundancy and noise and propose a generalized dynamic multimodal information bottleneck framework for attaining a robust fused feature representation. Specifically, our information bottleneck module filters out task-irrelevant information and noise in the fused feature, and we further introduce a sufficiency loss to prevent the dropping of task-relevant information, thus explicitly preserving the sufficiency of prediction information in the distilled feature. We validate our model on an in-house and a public COVID-19 dataset for mortality prediction, as well as on two public biomedical datasets for diagnostic tasks. Extensive experiments show that our method surpasses the state of the art and is significantly more robust, being the only method to maintain performance when large-scale noisy channels are present. Our code is publicly available at https://github.com/ayanglab/DMIB.
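
To make the abstract's core idea concrete, the sketch below shows one common way an information-bottleneck head with an explicit sufficiency term can be set up on top of a fused multimodal feature: a KL term penalizes information the stochastic code retains about the input, while a cross-entropy term keeps the code predictive of the label. This is an illustrative approximation only, not the authors' released implementation (see the linked repository for that); the names VariationalIBHead and ib_loss, the dimensions, and the variational KL bound are assumptions for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalIBHead(nn.Module):
    """Hypothetical variational IB head applied to a fused multimodal feature h.

    The fused feature is compressed into a stochastic code z; a KL term bounds
    the information z keeps about h, and a cross-entropy term (the sufficiency
    objective) keeps z predictive of the task label.
    """

    def __init__(self, feat_dim: int, code_dim: int, num_classes: int):
        super().__init__()
        self.mu = nn.Linear(feat_dim, code_dim)
        self.logvar = nn.Linear(feat_dim, code_dim)
        self.classifier = nn.Linear(code_dim, num_classes)

    def forward(self, h: torch.Tensor):
        mu, logvar = self.mu(h), self.logvar(h)
        # reparameterization trick: sample z ~ N(mu, sigma^2)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        logits = self.classifier(z)
        # KL(q(z|h) || N(0, I)) upper-bounds the compression term I(z; h)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - logvar - 1).sum(dim=1).mean()
        return logits, kl

def ib_loss(logits, kl, labels, beta=1e-3):
    # sufficiency term keeps the distilled code predictive of the label;
    # beta trades compression against sufficiency
    return F.cross_entropy(logits, labels) + beta * kl

# usage with a fused feature from any multimodal encoder (shapes are arbitrary)
head = VariationalIBHead(feat_dim=256, code_dim=64, num_classes=2)
h = torch.randn(8, 256)              # batch of 8 fused feature vectors
labels = torch.randint(0, 2, (8,))
logits, kl = head(h)
loss = ib_loss(logits, kl, labels)
```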

Authors (8)
  1. Yingying Fang (20 papers)
  2. Shuang Wu (99 papers)
  3. Sheng Zhang (212 papers)
  4. Chaoyan Huang (7 papers)
  5. Tieyong Zeng (71 papers)
  6. Xiaodan Xing (35 papers)
  7. Simon Walsh (16 papers)
  8. Guang Yang (422 papers)
Citations (4)
