I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling (2405.14022v5)

Published 22 May 2024 in eess.IV and cs.CV

Abstract: Multi-modal medical image synthesis involves nonlinear transformation of tissue signals between source and target modalities, where tissues exhibit contextual interactions across diverse spatial distances. As such, the utility of a network architecture in synthesis depends on its ability to express these contextual features. Convolutional neural networks (CNNs) offer high local precision at the expense of poor sensitivity to long-range context. While transformers promise to alleviate this issue, they suffer from an unfavorable trade-off between sensitivity to long- versus short-range context due to the intrinsic complexity of attention filters. To effectively capture contextual features while avoiding the complexity-driven trade-offs, here we introduce a novel multi-modal synthesis method, I2I-Mamba, based on the state space modeling (SSM) framework. Focusing on semantic representations across a hybrid residual architecture, I2I-Mamba leverages novel dual-domain Mamba (ddMamba) blocks for complementary contextual modeling in image and Fourier domains, while maintaining spatial precision with convolutional layers. Diverting from conventional raster-scan trajectories, ddMamba leverages novel SSM operators based on a spiral-scan trajectory to learn context with enhanced radial coverage and angular isotropy, and a channel-mixing layer to aggregate context across the channel dimension. Comprehensive demonstrations on multi-contrast MRI and MRI-CT protocols indicate that I2I-Mamba offers superior performance against state-of-the-art CNNs, transformers and SSMs.
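
The spiral-scan trajectory is the most concrete architectural detail given in the abstract. Below is a minimal, hypothetical sketch (not the authors' released code) of how tokens from an H x W feature map could be reordered along a spiral path before being fed to a sequence model such as an SSM block. The function names, the (B, C, H, W) tensor layout, and the inward, border-to-center traversal are assumptions made for illustration; the paper's exact trajectory, scan directions, Fourier-domain branch, and channel-mixing layer of ddMamba are not reproduced here.

```python
# Hypothetical sketch: spiral-scan reordering of feature-map tokens for an SSM.
# Assumes a (B, C, H, W) feature map; names and traversal direction are illustrative.
import torch

def spiral_indices(h: int, w: int) -> torch.Tensor:
    """Return flat indices (length h*w) visiting an h x w grid in an inward spiral."""
    top, bottom, left, right = 0, h - 1, 0, w - 1
    order = []
    while top <= bottom and left <= right:
        for c in range(left, right + 1):              # left -> right along top row
            order.append(top * w + c)
        for r in range(top + 1, bottom + 1):          # top -> bottom along right column
            order.append(r * w + right)
        if top < bottom:
            for c in range(right - 1, left - 1, -1):  # right -> left along bottom row
                order.append(bottom * w + c)
        if left < right:
            for r in range(bottom - 1, top, -1):      # bottom -> top along left column
                order.append(r * w + left)
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return torch.tensor(order, dtype=torch.long)

def spiral_scan(x: torch.Tensor) -> torch.Tensor:
    """Flatten a (B, C, H, W) map into a (B, H*W, C) token sequence in spiral order."""
    b, c, h, w = x.shape
    idx = spiral_indices(h, w).to(x.device)
    tokens = x.flatten(2).transpose(1, 2)             # (B, H*W, C) in raster order
    return tokens[:, idx, :]                          # re-index into spiral order

# Example: a 1x1x4x4 map with values 0..15 reveals the visiting order.
demo = torch.arange(16, dtype=torch.float32).view(1, 1, 4, 4)
print(spiral_scan(demo).squeeze())
# tensor([ 0.,  1.,  2.,  3.,  7., 11., 15., 14., 13., 12.,  8.,  4.,  5.,  6., 10.,  9.])
```

In contrast to a raster scan, which sweeps rows sequentially, this ordering covers the full border before converging toward the center, which is one way to obtain the broader radial coverage and angular isotropy the abstract attributes to the spiral trajectory; whether the paper's scan runs inward or outward, and how many spiral directions it combines, is not specified in the abstract.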
