In recent years, the field of Deformable Medical Image Registration (DMIR) has seen substantial advances, particularly through the integration of deep learning (DL) methodologies. The paper "XMorpher: Full Transformer for Deformable Medical Image Registration via Cross Attention" introduces a transformer-based backbone network designed to enhance feature extraction and matching in DMIR tasks, addressing limitations that existing single-image networks (SINs) face when processing image pairs.
Key Innovations
XMorpher introduces a full transformer architecture built on dual parallel feature extraction networks that leverage cross attention to process paired images effectively. The central contributions of the paper are as follows:
- Full Transformer Backbone: XMorpher is designed around a full transformer architecture, departing from traditional convolutional neural networks. The core innovation lies in its ability to simultaneously extract and process feature representations from paired images using dual parallel networks. These networks communicate continuously through cross-attention-based modules, ensuring effective semantic correspondence across different levels of features for precise registration.
- Cross Attention Transformer (CAT) Blocks: The paper introduces CAT blocks, which enable efficient inter-image correspondence determination by computing attention weights between paired images. This mechanism allows the network to focus on relevant features across image boundaries, facilitating more coherent and precise registration outcomes.
- Window-Based Local Feature Matching: XMorpher incorporates multi-size window partitioning techniques that constrain feature matching processes to localized areas, thereby improving computational efficiency and precision. This approach limits the search range to local transformations necessary for deformable registration, enhancing both accuracy and efficiency.
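To make the combination of window partitioning and cross attention concrete, here is a minimal NumPy sketch. It partitions both volumes into non-overlapping local windows and computes attention with queries from the moving image and keys/values from the fixed image. This is a simplification: the paper pairs a base window with a larger searching window on the other image (multi-size windows), whereas this sketch uses equal-size windows; all function names are illustrative, not from the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def partition_windows(feat, win):
    # feat: (D, H, W, C) volumetric feature map; win: window edge length.
    # Returns (num_windows, win**3, C): each window becomes a token sequence.
    D, H, W, C = feat.shape
    feat = feat.reshape(D // win, win, H // win, win, W // win, win, C)
    feat = feat.transpose(0, 2, 4, 1, 3, 5, 6)
    return feat.reshape(-1, win ** 3, C)

def windowed_cross_attention(moving, fixed, win=2):
    # Queries come from the moving image, keys/values from the fixed image.
    # Attention is restricted to spatially corresponding windows, so each
    # token only attends within a local neighborhood of the paired volume.
    q = partition_windows(moving, win)
    kv = partition_windows(fixed, win)
    scale = q.shape[-1] ** -0.5
    attn = softmax((q @ kv.transpose(0, 2, 1)) * scale)
    return attn @ kv  # same shape as q
```

Restricting attention to local windows is what keeps the cost manageable: full attention over a D×H×W volume is quadratic in the number of voxels, while windowed attention is quadratic only in the (small) window size.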
Experimental Validation
The paper evaluates XMorpher within two frameworks: unsupervised VoxelMorph and semi-supervised PC-Reg, demonstrating significant improvements in both. Experiments are conducted on datasets from the MM-WHS 2017 Challenge and ASOCA, focusing on whole-heart registration tasks. Results indicate:
- Performance Metrics: XMorpher improved Dice Similarity Coefficient (DSC) scores by up to 2.8% over VoxelMorph, a significant gain in registration accuracy. It also remained competitive on Jacobian determinant evaluations, indicating smooth deformation fields and strong preservation of anatomical structures.
- Visual Superiority: XMorpher consistently produced more accurate visual results, with smoother boundaries and less registration-grid distortion, outperforming benchmarks such as TransMorph and PC-Reg.
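The two metrics above can be sketched in a few lines of NumPy. The Dice coefficient measures overlap between warped and target label masks, and the folding ratio counts voxels where the Jacobian determinant of the deformation is non-positive (i.e., where the deformation folds). This is a generic illustration of the metrics, not the paper's evaluation code; function names are my own.

```python
import numpy as np

def dice_coefficient(seg_a, seg_b):
    # DSC = 2|A ∩ B| / (|A| + |B|) for binary label masks.
    a, b = seg_a.astype(bool), seg_b.astype(bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def folding_ratio(disp):
    # disp: (D, H, W, 3) displacement field u; deformation phi(x) = x + u(x).
    # Returns the fraction of voxels where det(d phi / dx) <= 0 (folding).
    grads = np.stack(np.gradient(disp, axis=(0, 1, 2)), axis=-1)  # (D,H,W,3,3)
    jac = grads + np.eye(3)            # Jacobian of phi is I + du/dx
    det = np.linalg.det(jac)
    return float((det <= 0).mean())
```

A perfect overlap gives DSC = 1.0, and an identity (zero-displacement) deformation gives a folding ratio of 0.0; lower folding ratios indicate better topology preservation.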
Implications and Future Directions
The introduction of XMorpher marks a pivotal shift towards using transformers in DMIR, particularly due to its ability to identify and leverage cross-image features efficiently. Practically, XMorpher has the potential to improve diagnostic precision significantly and streamline image analysis workflows in clinical settings. Theoretically, the paper fosters deeper exploration into cross-attention mechanisms within medical imaging contexts, suggesting broader applications in handling diverse paired image tasks.
Future research may explore refinements to the attention mechanism that further improve accuracy and speed up registration. Moreover, extending XMorpher's architecture to additional medical imaging protocols or modalities could broaden its applicability across medical image analysis tasks.
Overall, XMorpher represents a progressive step in transformer-based models for medical imaging, setting a promising trajectory for future innovations in image registration and related fields.