
Multi-branch Cascaded Swin Transformers with Attention to k-space Sampling Pattern for Accelerated MRI Reconstruction (2207.08412v2)

Published 18 Jul 2022 in eess.IV, cs.AI, cs.CV, and cs.LG

Abstract: Global correlations are widely seen in human anatomical structures due to similarity across tissues and bones. These correlations are reflected in magnetic resonance imaging (MRI) scans as a result of close-range proton density and T1/T2 parameters. Furthermore, to achieve accelerated MRI, k-space data are undersampled which causes global aliasing artifacts. Convolutional neural network (CNN) models are widely utilized for accelerated MRI reconstruction, but those models are limited in capturing global correlations due to the intrinsic locality of the convolution operation. The self-attention-based transformer models are capable of capturing global correlations among image features, however, the current contributions of transformer models for MRI reconstruction are minute. The existing contributions mostly provide CNN-transformer hybrid solutions and rarely leverage the physics of MRI. In this paper, we propose a physics-based stand-alone (convolution free) transformer model titled, the Multi-head Cascaded Swin Transformers (McSTRA) for accelerated MRI reconstruction. McSTRA combines several interconnected MRI physics-related concepts with the transformer networks: it exploits global MR features via the shifted window self-attention mechanism; it extracts MR features belonging to different spectral components separately using a multi-head setup; it iterates between intermediate de-aliasing and k-space correction via a cascaded network with data consistency in k-space and intermediate loss computations; furthermore, we propose a novel positional embedding generation mechanism to guide self-attention utilizing the point spread function corresponding to the undersampling mask. Our model significantly outperforms state-of-the-art MRI reconstruction methods both visually and quantitatively while depicting improved resolution and removal of aliasing artifacts.

Citations (5)

Summary

  • The paper proposes McSTRA, a convolution-free transformer model for accelerated MRI reconstruction that uses shifted window self-attention to capture global MR image features.
  • It employs a dual-branch architecture that separately processes low- and high-frequency components to preserve structural and edge details.
  • The cascaded network with data consistency blocks and PSF-guided positional embeddings demonstrates superior performance under high acceleration and adverse conditions.

The paper "Multi-branch Cascaded Swin Transformers with Attention to k-space Sampling Pattern for Accelerated MRI Reconstruction" proposes a novel transformer-based architecture, named Multi-branch Cascaded Swin Transformers (McSTRA), aimed at enhancing MRI reconstruction, particularly in scenarios of accelerated data acquisition facilitated by k-space undersampling. Traditional convolutional neural network (CNN)-based models fall short in capturing global correlations due to the intrinsic locality of convolution operations, which limits their effectiveness in MRI reconstruction. This paper addresses these limitations by leveraging the power of transformers to capture global correlations effectively.

Key Contributions and Methodology:

  1. Shifted Window Self-Attention: The authors adopt shifted window multi-head self-attention (SW-MSA) from the Swin transformer framework, which captures global MR features through window partitioning and shifting. This global view matters because low- and high-frequency k-space components correspond to anatomical structure and edge detail, respectively.
  2. Multi-branch Architecture: McSTRA employs a dual-branch design that processes these spectral components separately: low-frequency components, which carry structural and contrast information, go to one branch, while high-frequency components go to the other, letting the network preserve the edge and resolution details vital for diagnostic quality (a spectral-split sketch follows this list).
  3. Cascaded Network Structure: McSTRA iterates between intermediate de-aliasing and k-space correction through a cascaded network interleaved with data consistency (DC) blocks, which re-impose the measured k-space samples at every stage (see the second sketch after this list). This iterative framework keeps the reconstruction aligned with the physics of the MRI acquisition, reducing artifacts and improving image quality.
  4. Novel Positional Embedding: A positional embedding generation mechanism based on the point spread function (PSF) of the undersampling mask guides the self-attention layers, exploiting the relationship between the PSF and the resulting aliasing-artifact pattern (also illustrated in the second sketch).
  5. Performance and Robustness: Quantitative assessments use NMSE, PSNR, and SSIM, with McSTRA outperforming existing reconstruction techniques both visually and quantitatively. The improvements hold across a range of adverse conditions, including high acceleration factors, noisy inputs, varying undersampling patterns, out-of-distribution datasets, and anatomical abnormalities.
  6. Robustness to Unseen Conditions: The paper emphasizes McSTRA's ability to maintain performance in scenarios not seen during training, such as inference on brain MRI data after training on knee MRI data, demonstrating its adaptability across anatomical and modality variations. The model also performed well on MRI volumes containing lesions, where traditional methods often struggle due to the subtlety and variability of abnormal tissue.
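
To make the multi-branch idea concrete, the following is a minimal NumPy sketch of splitting zero-filled k-space into a low-frequency (structural) band and a high-frequency (edge) band, one per branch. The 8% centre fraction and the rectangular low-pass window are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def split_kspace_bands(kspace, center_fraction=0.08):
    """Split a centred 2D k-space array into low- and high-frequency bands.

    kspace: complex array with the DC component at the centre (fftshifted).
    center_fraction: width of the central low-pass window (assumed value).
    Returns the zero-filled image of each band, one per branch.
    """
    h, w = kspace.shape
    ch, cw = max(int(h * center_fraction), 2), max(int(w * center_fraction), 2)
    low_mask = np.zeros((h, w), dtype=bool)
    low_mask[h // 2 - ch // 2 : h // 2 + ch // 2,
             w // 2 - cw // 2 : w // 2 + cw // 2] = True  # central low-pass window

    k_low = np.where(low_mask, kspace, 0)    # structural / contrast content
    k_high = np.where(low_mask, 0, kspace)   # edges and fine detail

    img_low = np.fft.ifft2(np.fft.ifftshift(k_low))
    img_high = np.fft.ifft2(np.fft.ifftshift(k_high))
    return img_low, img_high
```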

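Both the data-consistency step and the PSF-guided positional embedding derive from the undersampling mask. The sketch below shows the standard hard data-consistency operation (re-imposing acquired samples on an intermediate estimate) and the PSF as the inverse Fourier transform of the mask; how McSTRA maps the PSF to per-token embeddings is not reproduced here, and returning the centred magnitude is only an illustrative simplification.

```python
import numpy as np

def data_consistency(image_estimate, measured_kspace, mask):
    """Hard data consistency: keep acquired k-space samples, trust the
    network only where no data were measured. All arrays share one shape;
    mask is boolean with True at sampled locations; k-space is centred."""
    k_est = np.fft.fftshift(np.fft.fft2(image_estimate))
    k_dc = np.where(mask, measured_kspace, k_est)
    return np.fft.ifft2(np.fft.ifftshift(k_dc))

def psf_magnitude(mask):
    """Point spread function of the undersampling mask: its inverse Fourier
    transform describes how a point source is smeared into aliasing
    replicas. McSTRA derives positional embeddings from this PSF."""
    psf = np.fft.ifft2(np.fft.ifftshift(mask.astype(np.complex64)))
    return np.abs(np.fft.fftshift(psf))
```

In a cascaded setup, each stage's output would pass through `data_consistency` before entering the next stage, mirroring the paper's interleaved DC blocks.
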
Ablation Studies and Component Efficacy:

A series of ablation studies underlines the importance of each component in McSTRA: the multi-branch setup, the intermediate loss computations within the cascade, and the PSF-guided positional embeddings are each shown to contribute significantly to the improved reconstruction accuracy and robustness.

Limitations and Future Work:

While the model introduces promising advances, a key limitation is its current focus on single-channel MR acquisitions. Extending McSTRA to multi-channel (multi-coil) MRI data, which is standard in clinical practice, is a natural next step. Future work may also explore moving from fully supervised to unsupervised training to reduce dependence on large fully sampled (labeled) datasets, improving generalizability and robustness.

In conclusion, McSTRA provides a robust, convolution-free architecture that leverages transformers for MRI reconstruction. By combining global attention mechanisms with MRI physics, it offers a compelling improvement over conventional deep learning methods and lays the groundwork for future transformer-centric imaging solutions in MRI diagnostics.