Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering (2312.06358v2)

Published 11 Dec 2023 in cs.CV

Abstract: Surgical decisions are informed by aligning rapid portable 2D intraoperative images (e.g., X-rays) to a high-fidelity 3D preoperative reference scan (e.g., CT). 2D/3D image registration often fails in practice: conventional optimization methods are prohibitively slow and susceptible to local minima, while neural networks trained on small datasets fail on new patients or require impractical landmark supervision. We present DiffPose, a self-supervised approach that leverages patient-specific simulation and differentiable physics-based rendering to achieve accurate 2D/3D registration without relying on manually labeled data. Preoperatively, a CNN is trained to regress the pose of a randomly oriented synthetic X-ray rendered from the preoperative CT. The CNN then initializes rapid intraoperative test-time optimization that uses the differentiable X-ray renderer to refine the solution. Our work further proposes several geometrically principled methods for sampling camera poses from $\mathbf{SE}(3)$, for sparse differentiable rendering, and for driving registration in the tangent space $\mathfrak{se}(3)$ with geodesic and multiscale locality-sensitive losses. DiffPose achieves sub-millimeter accuracy across surgical datasets at intraoperative speeds, improving upon existing unsupervised methods by an order of magnitude and even outperforming supervised baselines. Our code is available at https://github.com/eigenvivek/DiffPose.


Summary

  • The paper presents DiffPose, a novel self-supervised framework achieving sub-millimeter intraoperative 2D/3D image registration.
  • It applies differentiable X-ray rendering combined with Lie algebraic transformations to enhance accuracy and operational speed.
  • Evaluations on clinical datasets show DiffPose outperforming traditional methods, indicating strong potential for real-time surgical guidance.

Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering: An Overview

The paper presents DiffPose, a framework for intraoperative 2D/3D image registration built on differentiable X-ray rendering. DiffPose combines self-supervision with physics-based rendering to achieve landmark-free, sub-millimeter registration accuracy, optimizing the camera pose through a differentiable renderer that produces synthetic X-rays from the preoperative CT. Geometric considerations, notably the Lie algebra $\mathfrak{se}(3)$ of rigid transformations, shape the design, which targets fast, precise, and clinically viable registration.

Summary of Approach

The authors address the shortcomings of conventional 2D/3D registration with a self-supervised approach that draws its training data from the preoperative CT scan. Whereas prior learning-based methods depend on annotated landmarks or on small datasets that generalize poorly to new patients, DiffPose trains a patient-specific convolutional neural network (CNN) on an effectively unlimited supply of synthetic X-rays rendered from that patient's CT. The CNN learns to regress the pose of each synthetic X-ray without manual labels, and at test time its prediction initializes an optimization that refines the camera pose using the differentiable renderer.
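The initialize-then-refine pattern can be illustrated with a deliberately small toy, which is not the authors' implementation. Everything in it is an assumption made for illustration: the "renderer" draws a Gaussian blob, the "pose" is a 2D translation rather than a 6-degree-of-freedom camera pose, central finite differences stand in for automatic differentiation, and a hard-coded starting point plays the role of the CNN's estimate.

```python
import numpy as np

def render(center, size=32, sigma=4.0):
    """Toy 'renderer': a Gaussian blob whose position stands in for the camera pose."""
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - center[0]) ** 2 + (ys - center[1]) ** 2) / (2.0 * sigma**2))

def loss(pose, target):
    """Mean squared intensity difference between the rendered and target images."""
    return float(np.mean((render(pose) - target) ** 2))

def refine(pose0, target, n_iters=60, step=0.5, decay=0.9, h=1e-3):
    """Refine an initial pose with normalized gradient descent on the image loss."""
    pose = np.asarray(pose0, dtype=float).copy()
    for _ in range(n_iters):
        # Central finite differences stand in for automatic differentiation.
        grad = np.array([
            (loss(pose + h * e, target) - loss(pose - h * e, target)) / (2.0 * h)
            for e in np.eye(pose.size)
        ])
        norm = np.linalg.norm(grad)
        if norm < 1e-12:  # already at a stationary point
            break
        pose -= step * grad / norm  # fixed-length step along the descent direction
        step *= decay               # shrink the step so the iterate settles
    return pose

# Hypothetical usage: the "CNN" guess (16, 16) is refined toward the true pose (18, 14).
target = render((18.0, 14.0))
estimate = refine((16.0, 16.0), target)
```

The toy only converges because the initial guess already lies in the basin of the correct solution, which is the role the summary assigns to the CNN.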

Crucially, DiffPose performs registration in the tangent space $\mathfrak{se}(3)$, the Lie algebra of $\mathbf{SE}(3)$, so that pose updates and losses respect the geometry of rigid transformations. Image similarity is measured with a multiscale local normalized cross-correlation (mNCC), and a sparse rendering technique keeps the computation efficient while remaining robust to local minima.
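To make the role of $\mathfrak{se}(3)$ concrete, the following is a minimal NumPy sketch of the standard exponential map from a 6-vector twist to a 4x4 rigid transform, together with the usual geodesic distance between two rotations. The function names (`hat`, `se3_exp`, `rotation_geodesic`) and the ordering of the twist as rotation part first, translation part second, are assumptions for this example; the paper's own parameterization and loss may differ in detail.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix, so hat(w) @ v == cross(w, v)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def se3_exp(xi):
    """Exponential map se(3) -> SE(3) for a twist xi = (omega, v), as a 4x4 matrix."""
    xi = np.asarray(xi, dtype=float)
    omega, v = xi[:3], xi[3:]
    theta = np.linalg.norm(omega)
    K = hat(omega)
    if theta < 1e-8:
        # Small-angle limits of the three coefficients below.
        A, B, C = 1.0, 0.5, 1.0 / 6.0
    else:
        A = np.sin(theta) / theta
        B = (1.0 - np.cos(theta)) / theta**2
        C = (theta - np.sin(theta)) / theta**3
    R = np.eye(3) + A * K + B * (K @ K)  # Rodrigues' rotation formula
    V = np.eye(3) + B * K + C * (K @ K)  # couples the rotation into the translation
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T

def rotation_geodesic(R1, R2):
    """Angle in radians of the relative rotation between R1 and R2."""
    cos_angle = (np.trace(R1.T @ R2) - 1.0) / 2.0
    return float(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Hypothetical usage: a quarter turn about z combined with a unit x-translation twist.
T = se3_exp([0.0, 0.0, np.pi / 2, 1.0, 0.0, 0.0])
angle = rotation_geodesic(np.eye(3), T[:3, :3])
```

Optimizing over the six unconstrained twist coordinates avoids having to re-project onto the set of valid rotations after each update, which is one common motivation for working in the tangent space.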

Key Findings

DiffPose is evaluated on the DeepFluoro and Ljubljana datasets, which represent different clinical scenarios. It outperforms conventional methods and even some supervised techniques, registering a substantially higher share of cases to sub-millimeter accuracy than the alternatives, as reflected in both quantitative success metrics and qualitative assessments.

Methodological highlights include:

  • A self-supervised method that surpasses supervised baselines in registration accuracy while running at speeds compatible with surgical procedures.
  • Use of Lie theory to parameterize transformations in $\mathfrak{se}(3)$, in contrast to the Euler-angle or quaternion parameterizations common in earlier methods.
  • A sparse mNCC that reduces computation without sacrificing accuracy.
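As a rough illustration of the similarity metric named in the last highlight, the sketch below computes normalized cross-correlation globally and over small tiles, averages the two scales, and also evaluates the local term on a random subset of patches. This is one plausible reading of a multiscale, sparse NCC and not the paper's definition: the function names, the choice of scales, and the patch-sampling scheme are all assumptions. Here both images are fully available, whereas a sparse renderer would only synthesize the sampled pixels.

```python
import numpy as np

def ncc(a, b, eps=1e-8):
    """Zero-mean normalized cross-correlation between two equally shaped arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + eps))

def local_ncc(img_a, img_b, patch):
    """Mean NCC over non-overlapping patch x patch tiles."""
    h, w = img_a.shape
    scores = [ncc(img_a[i:i + patch, j:j + patch], img_b[i:i + patch, j:j + patch])
              for i in range(0, h - patch + 1, patch)
              for j in range(0, w - patch + 1, patch)]
    return float(np.mean(scores))

def multiscale_ncc(img_a, img_b, patch_sizes=(None, 8)):
    """Average NCC over several scales; None means the whole image (global NCC)."""
    scores = [ncc(img_a, img_b) if p is None else local_ncc(img_a, img_b, p)
              for p in patch_sizes]
    return float(np.mean(scores))

def sparse_local_ncc(img_a, img_b, patch, n_patches, rng):
    """Mean NCC over randomly placed patches, so only a subset of pixels is compared."""
    h, w = img_a.shape
    rows = rng.integers(0, h - patch + 1, size=n_patches)
    cols = rng.integers(0, w - patch + 1, size=n_patches)
    scores = [ncc(img_a[i:i + patch, j:j + patch], img_b[i:i + patch, j:j + patch])
              for i, j in zip(rows, cols)]
    return float(np.mean(scores))

# Hypothetical usage on random test images.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
score = multiscale_ncc(image, 2.0 * image + 3.0)  # invariant to gain and offset
```

Because NCC subtracts the mean and divides by the standard deviation, it is insensitive to affine intensity changes, which is useful when a simulated X-ray and a real one differ in overall brightness and contrast.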

Implications and Future Directions

Fast, accurate registration of this kind could improve surgical precision and support augmented reality and robotic surgery systems. Robustness to patient-specific anatomical variability suggests that the approach may extend to other clinical imaging modalities and applications.

Future work could extend DiffPose to deformable registration, or shorten its lengthy per-patient pretraining so that it can be used in emergency settings. Handling piecewise rigid transformations is another prospective avenue.

Integrating DiffPose into broader clinical workflows will require careful attention to real-time constraints and to the variety of surgical environments. Transfer learning for initializing the pose regressor could also broaden applicability by reducing or removing the need for per-patient training.

In conclusion, the paper presents a methodologically sound approach to intraoperative 2D/3D image registration and a notable step toward real-time image guidance as a routine part of surgery.
