Transformer-Based Local Feature Matching for Multimodal Image Registration (2404.16802v1)
Abstract: Ultrasound imaging is a cost-effective and radiation-free modality for visualizing anatomical structures in real-time, making it ideal for guiding surgical interventions. However, its limited field-of-view, speckle noise, and imaging artifacts make the images difficult for inexperienced users to interpret. In this paper, we propose a new 2D ultrasound to 3D CT registration method to improve surgical guidance during ultrasound-guided interventions. Our approach adapts a dense feature matching method called LoFTR to our multimodal registration problem. We learn to predict dense coarse-to-fine correspondences using a Transformer-based architecture to estimate a robust rigid transformation between a 2D ultrasound frame and a CT scan. Additionally, a fully differentiable pose estimation method is introduced, optimizing LoFTR on pose estimation error during training. Experiments conducted on a multimodal dataset of ex vivo porcine kidneys demonstrate the method's promising results for intraoperative, trackerless ultrasound pose estimation. By mapping 2D ultrasound frames into the 3D CT volume space, the method provides intraoperative guidance, potentially improving surgical workflows and image interpretation.
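The abstract describes estimating a rigid transformation from dense predicted correspondences. As a minimal illustration of that final step (not the paper's actual pipeline, which is learned and fully differentiable), the sketch below recovers a rotation and translation from matched 3D point pairs with the weighted Kabsch algorithm; the function name and the use of per-match confidence weights are assumptions for illustration:

```python
import numpy as np

def estimate_rigid_transform(src, dst, weights=None):
    """Estimate R, t such that R @ src_i + t ~= dst_i for matched points.

    src, dst: (N, 3) arrays of corresponding points.
    weights:  optional (N,) match confidences (e.g. matching scores).
    Uses the (weighted) Kabsch algorithm via SVD.
    """
    if weights is None:
        weights = np.ones(len(src))
    w = weights / weights.sum()
    # Weighted centroids of both point sets.
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)
    # Weighted cross-covariance between the centered point sets.
    S = (dst - mu_d).T @ (w[:, None] * (src - mu_s))
    U, _, Vt = np.linalg.svd(S)
    # Flip the last singular direction if needed to avoid a reflection.
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt
    t = mu_d - R @ mu_s
    return R, t
```

In practice such a closed-form solver is typically wrapped in an outlier-robust loop (e.g. RANSAC) or, as in this paper, replaced by a differentiable variant so pose error can be backpropagated through the matcher during training.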
- Wein, W., Ladikos, A., Fuerst, B., Shah, A., Sharma, K., and Navab, N., “Global registration of ultrasound to MRI using the LC2 metric for enabling neurosurgical guidance,” in [Medical Image Computing and Computer-Assisted Intervention–MICCAI 2013: 16th International Conference, Nagoya, Japan, September 22–26, 2013, Proceedings, Part I], 34–41, Springer (2013).
- San José Estépar, R., Westin, C.-F., and Vosburgh, K. G., “Towards real time 2D to 3D registration for ultrasound-guided endoscopic and laparoscopic procedures,” International Journal of Computer Assisted Radiology and Surgery 4, 549–560 (2009).
- Hering, A., Hansen, L., Mok, T. C., Chung, A. C., Siebert, H., Häger, S., Lange, A., Kuckertz, S., Heldmann, S., Shao, W., et al., “Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning,” IEEE Transactions on Medical Imaging 42(3), 697–712 (2022).
- Shapey, J., Dowrick, T., Delaunay, R., Mackle, E. C., Thompson, S., Janatka, M., Guichard, R., Georgoulas, A., Pérez-Suárez, D., Bradford, R., et al., “Integrated multi-modality image-guided navigation for neurosurgery: open-source software platform using state-of-the-art clinical hardware,” International Journal of Computer Assisted Radiology and Surgery 16, 1347–1356 (2021).
- Montaña-Brown, N., Ramalhinho, J., Allam, M., Davidson, B., Hu, Y., and Clarkson, M. J., “Vessel segmentation for automatic registration of untracked laparoscopic ultrasound to CT of the liver,” International Journal of Computer Assisted Radiology and Surgery 16(7), 1151–1160 (2021).
- DeTone, D., Malisiewicz, T., and Rabinovich, A., “SuperPoint: Self-supervised interest point detection and description,” in [Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops], 224–236 (2018).
- Sarlin, P.-E., DeTone, D., Malisiewicz, T., and Rabinovich, A., “SuperGlue: Learning feature matching with graph neural networks,” in [Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition], 4938–4947 (2020).
- Lowe, D. G., “Object recognition from local scale-invariant features,” in [Proceedings of the Seventh IEEE International Conference on Computer Vision], 2, 1150–1157, IEEE (1999).
- Bay, H., Ess, A., Tuytelaars, T., and Van Gool, L., “Speeded-up robust features (SURF),” Computer Vision and Image Understanding 110(3), 346–359 (2008).
- Sun, J., Shen, Z., Wang, Y., Bao, H., and Zhou, X., “LoFTR: Detector-free local feature matching with transformers,” in [Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition], 8922–8931 (2021).
- Markova, V., Ronchetti, M., Wein, W., Zettinig, O., and Prevost, R., “Global multi-modal 2D/3D registration via local descriptors learning,” in [Medical Image Computing and Computer Assisted Intervention–MICCAI 2022: 25th International Conference, Singapore, September 18–22, 2022, Proceedings, Part VI], 269–279, Springer (2022).
- Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S., “Feature pyramid networks for object detection,” in [Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition], 2117–2125 (2017).
- Katharopoulos, A., Vyas, A., Pappas, N., and Fleuret, F., “Transformers are RNNs: Fast autoregressive transformers with linear attention,” in [International Conference on Machine Learning], 5156–5165, PMLR (2020).
- Jang, E., Gu, S., and Poole, B., “Categorical reparameterization with Gumbel-Softmax,” arXiv preprint arXiv:1611.01144 (2016).
- Choy, C., Dong, W., and Koltun, V., “Deep global registration,” in [Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition], 2514–2523 (2020).
- Johnson, H., Harris, G., Williams, K., et al., “BRAINSFit: mutual information rigid registrations of whole-brain 3D images, using the Insight Toolkit,” Insight J 57(1), 1–10 (2007).
Authors:
- Ruisi Zhang
- Filipe C. Pedrosa
- Navid Feizi
- Dianne Sacco
- Rajni Patel
- Jayender Jagadeesan
- Remi Delaunay