Complementary Information Mutual Learning for Multimodality Medical Image Segmentation (2401.02717v2)

Published 5 Jan 2024 in cs.CV and cs.AI

Abstract: Radiologists must combine multiple imaging modalities for tumor segmentation and diagnosis because of the limitations of individual medical imaging techniques and the diversity of tumor signals, which has driven the development of multimodal learning for segmentation. However, redundancy among modalities creates challenges for existing subtraction-based joint learning methods, such as misjudging the importance of modalities, ignoring modality-specific information, and increasing cognitive load. These issues ultimately decrease segmentation accuracy and increase the risk of overfitting. This paper presents the complementary information mutual learning (CIML) framework, which mathematically models and addresses the negative impact of inter-modal redundant information. CIML adopts an additive perspective and removes inter-modal redundancy through inductive bias-driven task decomposition and message passing-based redundancy filtering. CIML first decomposes the multimodal segmentation task into multiple subtasks based on expert prior knowledge, minimizing the information dependence between modalities. Furthermore, CIML introduces a scheme in which each modality extracts information from the other modalities additively through message passing. To ensure that the extracted information is non-redundant, redundancy filtering is reformulated as complementary information learning, inspired by the variational information bottleneck. The complementary information learning procedure can be solved efficiently by variational inference and cross-modal spatial attention. Numerical results on the verification task and standard benchmarks show that CIML efficiently removes redundant information between modalities and outperforms state-of-the-art methods in validation accuracy and segmentation quality.
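To make the message-passing idea concrete, the sketch below shows one way a cross-modal spatial attention block could additively inject a spatially gated message from a source modality into a target modality's feature map. This is a minimal illustration under assumed shapes and module names (CrossModalSpatialAttention, project, gate), not the paper's actual implementation; in CIML the passed message is additionally constrained by a variational information bottleneck-style objective so that only complementary, non-redundant information is transferred.

```python
# Hypothetical sketch of message passing with cross-modal spatial attention
# (illustrative only; not the authors' published code).
import torch
import torch.nn as nn


class CrossModalSpatialAttention(nn.Module):
    """Gate the message from a source modality with a spatial attention map
    computed jointly from the source and target feature maps."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1x1 convolutions project the concatenated modalities into a shared
        # space before predicting a single-channel spatial gate.
        self.project = nn.Conv3d(2 * channels, channels, kernel_size=1)
        self.gate = nn.Conv3d(channels, 1, kernel_size=1)

    def forward(self, target_feat: torch.Tensor, source_feat: torch.Tensor) -> torch.Tensor:
        # target_feat, source_feat: (B, C, D, H, W) feature maps of one decoder stage.
        joint = torch.relu(self.project(torch.cat([target_feat, source_feat], dim=1)))
        attn = torch.sigmoid(self.gate(joint))   # (B, 1, D, H, W) spatial gate
        message = attn * source_feat             # keep only spatially relevant regions
        return target_feat + message             # additive message passing


if __name__ == "__main__":
    # Example with illustrative shapes: pass a FLAIR-branch message into a T1ce branch.
    block = CrossModalSpatialAttention(channels=32)
    t1ce = torch.randn(1, 32, 16, 32, 32)
    flair = torch.randn(1, 32, 16, 32, 32)
    fused = block(t1ce, flair)
    print(fused.shape)  # torch.Size([1, 32, 16, 32, 32])
```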
