
Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans (2404.11889v2)

Published 18 Apr 2024 in eess.IV and cs.CV

Abstract: X-ray images play a vital role in intraoperative procedures because of their high resolution and fast imaging speed, and they greatly benefit subsequent segmentation, registration and reconstruction. However, excessive X-ray exposure poses potential risks to human health. Data-driven algorithms that map volume scans to X-ray images are restricted by the scarcity of paired X-ray and volume data, and existing methods mainly work by modelling the whole X-ray imaging procedure. In this study, we propose a learning-based approach termed CT2X-GAN that synthesizes X-ray images in an end-to-end manner using content and style disentanglement across three different image domains. Our method decouples the anatomical structure information from CT scans and the style information from unpaired real X-ray images and digitally reconstructed radiography (DRR) images via a series of decoupling encoders. Additionally, we introduce a novel consistency regularization term to improve the stylistic resemblance between synthesized and real X-ray images. We also impose a supervised process by computing the similarity between computed real DRR images and synthesized DRR images. We further develop a pose attention module to strengthen the information in the decoupled content code from CT scans, facilitating high-quality multi-view image synthesis in the lower-dimensional 2D space. Extensive experiments on the publicly available CTSpine1K dataset achieved 97.8350, 0.0842 and 3.0938 in terms of FID, KID and a defined user-scored X-ray similarity, respectively. Compared with 3D-aware methods ($\pi$-GAN, EG3D), CT2X-GAN achieves higher synthesis quality and greater realism relative to real X-ray images.
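
The abstract refers to "computed real DRR" images derived from CT scans but does not say how they are rendered. As a rough illustration only, the sketch below computes a simplified parallel-beam DRR with the Beer-Lambert law, using a different projection axis for each view. It is not the CT2X-GAN pipeline or the authors' renderer. The names `hu_to_mu` and `parallel_drr`, the `[z][y][x]` volume layout, and the water attenuation value `MU_WATER` are assumptions made for this example.

```python
import math

# Assumed linear attenuation coefficient of water in 1/mm (illustrative value,
# not taken from the paper).
MU_WATER = 0.02


def hu_to_mu(hu, mu_water=MU_WATER):
    """Convert a Hounsfield unit to a linear attenuation coefficient (>= 0)."""
    # HU = 1000 * (mu - mu_water) / mu_water, solved for mu and clamped at 0.
    return max(0.0, mu_water * (1.0 + hu / 1000.0))


def parallel_drr(volume_hu, axis=0, spacing_mm=1.0):
    """Project a CT volume (nested lists indexed [z][y][x]) along one axis.

    Returns a 2D list of transmitted intensities in (0, 1], computed with the
    Beer-Lambert law: I = exp(-sum(mu) * spacing_mm).
    """
    shape = (len(volume_hu), len(volume_hu[0]), len(volume_hu[0][0]))
    # The two axes that are not projected become the image rows and columns.
    rows, cols = [size for i, size in enumerate(shape) if i != axis]
    image = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            line_integral = 0.0
            for k in range(shape[axis]):
                index = [r, c]
                index.insert(axis, k)  # place the ray coordinate on `axis`
                z, y, x = index
                line_integral += hu_to_mu(volume_hu[z][y][x]) * spacing_mm
            image[r][c] = math.exp(-line_integral)
    return image


# Usage: two orthogonal views of a tiny volume filled with water (0 HU).
water = [[[0.0] * 3 for _ in range(2)] for _ in range(4)]  # shape (4, 2, 3)
view_along_z = parallel_drr(water, axis=0)  # 2 x 3 image
view_along_x = parallel_drr(water, axis=2)  # 4 x 2 image
```

A real DRR renderer would typically model cone-beam geometry, arbitrary poses and energy-dependent attenuation; the axis-aligned projection here only shows the basic line-integral idea.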

Authors (9)
  1. Lixing Tan
  2. Shuang Song
  3. Kangneng Zhou
  4. Chengbo Duan
  5. Lanying Wang
  6. Huayang Ren
  7. Linlin Liu
  8. Wei Zhang
  9. Ruoxiu Xiao