Scalable Deep Unrolled Model for MRI Reconstruction
- Scalable Deep Unrolled Model (SDUM) is a universal, physics-informed deep learning framework that unrolls MRI reconstruction into modular cascades integrating physics-based data consistency and transformer priors.
- The architecture employs a Restormer-based reconstructor, a learned coil sensitivity map estimator, and a sampling-aware weighted data consistency module to achieve state-of-the-art reconstruction quality.
- Empirical evaluations reveal log-linear PSNR scaling with model depth and parameter count, supporting SDUM’s practical application as a foundation model for universal MRI reconstruction.
The Scalable Deep Unrolled Model (SDUM) is a universal, physics-informed deep neural network framework for magnetic resonance imaging (MRI) reconstruction that generalizes across a broad spectrum of acquisitions, anatomical targets, and clinical protocols without requiring per-protocol retraining. SDUM integrates a Restormer-based reconstructor, a learned coil sensitivity map estimator (CSME), a sampling-aware weighted data consistency module (SWDC), universal conditioning on cascade index and protocol metadata, and a progressive cascade expansion training schedule. The architecture demonstrates foundation-model-like log-linear scaling in reconstruction quality with respect to model depth and parameter count, enabling predictable performance improvements as compute resources are invested. Empirical validation shows state-of-the-art results on multiple cardiac and brain MRI reconstruction tasks, establishing SDUM as a practical path toward universal and scalable MRI reconstruction (Wang et al., 19 Dec 2025).
1. Architectural Components
SDUM employs a modular, unrolled framework composed of alternating cascades, each integrating three principal blocks: the coil-sensitivity map estimator (CSME), the sampling-aware weighted data consistency (SWDC), and the Restormer-based prior (the reconstructor). Across all cascades, universal conditioning (UC) is injected based on the cascade index and protocol metadata.
Key components (a schematic sketch follows this list):
- Restormer-Based Reconstructor: Each cascade's proximal operator is implemented via a two-level Restormer block, employing Multi-Dconv Head Transposed Attention (MDTA) and Gated-Dconv Feed-forward Networks (GDFN), structured as a shallow pyramid to preserve high-frequency details and model long-range aliasing patterns.
- Learned Coil Sensitivity Map Estimator (CSME): A U-Net predicts per-coil sensitivity maps $S_c^{(t)}$ at each cascade from the current image estimate $x^{(t)}$, with normalization ensuring $\sum_c |S_c^{(t)}(r)|^2 = 1$ for each pixel $r$.
- Sampling-Aware Weighted Data Consistency (SWDC): SWDC predicts a k-space spatial weight map $W^{(t)}$, conditioned on the sampling mask, emphasizing well-sampled and low-frequency regions.
- Universal Conditioning (UC): Sinusoidal embeddings of each cascade index and discrete protocol label are mapped through MLPs and injected as additive biases into every MDTA/GDFN block, enabling adaptive behavior of the network across different acquisition settings.
- Progressive Cascade Expansion: The cascade depth is progressively increased during training; endpoints remain fixed while interior cascades are duplicated and fine-tuned at each expansion, stabilizing deep network optimization for unroll depths up to 18 cascades.
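The PyTorch sketch below illustrates how these blocks compose into a single cascade. It is a minimal sketch under stated assumptions: the module interfaces (`csme`, `swdc`, `prior`), tensor shapes, and the identity stand-ins in the usage example are illustrative, not the released implementation.

```python
import torch


def sens_expand(x, smaps):
    # Image -> per-coil k-space: y_c = F(S_c * x)
    return torch.fft.fft2(smaps * x.unsqueeze(1), norm="ortho")


def sens_reduce(k, smaps):
    # Per-coil k-space -> coil-combined image: sum_c conj(S_c) * F^-1(k_c)
    return (smaps.conj() * torch.fft.ifft2(k, norm="ortho")).sum(dim=1)


class Cascade(torch.nn.Module):
    """One SDUM-style cascade: CSME -> weighted data consistency -> prior."""

    def __init__(self, csme, swdc, prior):
        super().__init__()
        self.csme = csme    # sensitivity estimator (a U-Net in the paper)
        self.swdc = swdc    # predicts a k-space weight map W from the mask
        self.prior = prior  # Restormer-based proximal operator
        self.lam = torch.nn.Parameter(torch.tensor(1.0))  # DC step size

    def forward(self, x, y, mask, cond):
        # 1) Re-estimate coil sensitivities, normalized per pixel.
        smaps = self.csme(x)
        smaps = smaps / (smaps.abs().pow(2).sum(1, keepdim=True).sqrt() + 1e-8)

        # 2) Weighted data consistency: x_dc = x - lam * A^H W (A x - y).
        w = self.swdc(mask)
        resid = mask * (sens_expand(x, smaps) - y)
        x_dc = x - self.lam * sens_reduce(w * resid, smaps)

        # 3) Prior/proximal step with universal conditioning vector.
        return self.prior(x_dc, cond)


# Toy usage with identity stand-ins for the learned modules:
B, C, H, W = 1, 4, 32, 32
cascade = Cascade(
    csme=lambda x: torch.ones(B, C, H, W, dtype=torch.complex64),
    swdc=lambda mask: mask,
    prior=lambda x, cond: x,
)
x1 = cascade(
    torch.zeros(B, H, W, dtype=torch.complex64),     # current image estimate
    torch.randn(B, C, H, W, dtype=torch.complex64),  # acquired k-space y
    torch.ones(B, C, H, W),                          # sampling mask
    cond=None,
)
```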
2. Mathematical Framework
SDUM is formulated as a proximal-gradient unrolled network, leveraging MRI physics and data consistency. Given the masked multi-coil Fourier forward operator $A$, cascade index $t$, and acquired multi-coil k-space data $y$:
- CSME Update: $S^{(t)} = \mathrm{CSME}\big(x^{(t)}\big)$, normalized so that $\sum_c |S_c^{(t)}(r)|^2 = 1$ at every pixel $r$.
- Data Consistency (SWDC): $x_{\mathrm{dc}}^{(t)} = x^{(t)} - \lambda^{(t)} A^{H} W^{(t)} \big(A x^{(t)} - y\big)$, where the learned weight map $W^{(t)}$ is predicted from the sampling mask and $\lambda^{(t)}$ is a learned step size.
- Reconstructor Update: $x^{(t+1)} = \mathcal{R}\big(x_{\mathrm{dc}}^{(t)};\, c^{(t)}\big)$, with $\mathcal{R}$ the Restormer-based proximal operator.
- Universal Conditioning: $c^{(t)} = \mathrm{MLP}\big(\big[\mathrm{PE}(t),\, \mathrm{Emb}(p)\big]\big)$, combining a sinusoidal embedding of the cascade index $t$ with a learned embedding of the protocol label $p$. Within each Restormer block, the conditioning vector $c^{(t)}$ is projected and added as a channel-wise bias.
- Progressive Cascade Expansion: Given the previous cascade count $T$, the new depth is $T' = 2T - 2$ (consistent with the 6 → 10 → 18 training schedule), with endpoints mapped identically and an interior-doubling index mapping used for weight initialization; see the sketch after this list.
- Loss Function: Training is performed with a differentiable SSIM loss (optionally augmented with $\ell_1$ or $\ell_2$ terms): $\mathcal{L}_{\mathrm{SSIM}} = 1 - \mathrm{SSIM}(\hat{x}, x^{\ast})$, where $\hat{x}$ is the reconstruction and $x^{\ast}$ the fully-sampled reference.
- Scaling Law: Empirical measurements show $\mathrm{PSNR} \approx a \log N + b$ (where $N$ is the number of parameters), with strong linear correlation in both unrolled depth and parameter count, highlighting foundation-model-like predictability.
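A minimal Python sketch of the endpoint-fixed, interior-doubling initialization; the mapping $T' = 2T - 2$ is inferred from the 6 → 10 → 18 schedule, and the exact duplication order used in the paper may differ:

```python
def expand_cascades(old_weights):
    """Grow an unrolled model from T to 2T - 2 cascades.

    Endpoints keep their weights; each interior cascade is duplicated,
    so e.g. [c0, c1, c2, c3, c4, c5] (T=6) becomes
    [c0, c1, c1, c2, c2, c3, c3, c4, c4, c5] (T'=10).
    """
    first, *interior, last = old_weights
    doubled = [w for w in interior for _ in range(2)]
    return [first, *doubled, last]


stages = [list(range(6))]                    # stage 1: 6 cascades (toy "weights")
stages.append(expand_cascades(stages[-1]))   # stage 2: 10 cascades
stages.append(expand_cascades(stages[-1]))   # stage 3: 18 cascades
assert [len(s) for s in stages] == [6, 10, 18]
```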
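The log-linear scaling claim can be checked with a simple least-squares fit of PSNR against $\log_{10} N$; the data points below are placeholders for illustration, not measurements from the paper:

```python
import numpy as np

# Hypothetical (parameter count, PSNR) pairs at increasing unroll depths.
n_params = np.array([2e6, 8e6, 3e7, 1.2e8])
psnr = np.array([33.1, 33.9, 34.6, 35.4])

# Fit PSNR ~ a * log10(N) + b and report goodness of fit.
a, b = np.polyfit(np.log10(n_params), psnr, deg=1)
pred = a * np.log10(n_params) + b
r2 = 1 - np.sum((psnr - pred) ** 2) / np.sum((psnr - psnr.mean()) ** 2)
print(f"PSNR ≈ {a:.2f}·log10(N) + {b:.2f}, R² = {r2:.3f}")
```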
3. Training Strategy and Dataset Heterogeneity
SDUM is explicitly trained on pooled, heterogeneous datasets, encompassing diverse anatomical regions, contrasts, sampling patterns, and acceleration factors:
- CMRxRecon2024: 330 volunteers, six cardiac contrasts (Cine, T1/T2 mapping, LGE, Tagging, Aorta), uniform and random sampling across 4×–24× accelerations.
- CMRxRecon2025: 600 subjects across five centers, multi-disease, multiple field strengths (1.5T, 3T, 5T), pediatric, Gauss/Cartesian/radial masks, multiple contrasts (Cine/T1/T2/LGE/B1-mapping), spanning four challenge tracks.
- fastMRI Brain: ~7,000 multi-coil volumes (T1, T2, FLAIR), sampled at 4× and 6× accelerations.
Optimization protocol:
- Progressive Cascade Expansion: Trained in three stages (6, 10, and 18 cascades), ~20k steps per stage.
- Optimizer: Muon, base learning rate 2.4×10⁻⁴, cosine decay, weight decay 1×10⁻³, mixed precision (BF16).
- Data Augmentation: Random flips, k-space shifts/phase variation, gamma contrast perturbations, and random masks drawn from multiple families (a toy sketch follows this list).
- Batch Size: Dynamically chosen per GPU memory constraints (gradient checkpointing used).
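A toy NumPy sketch of the augmentation family above; the probabilities, parameter ranges, and function name are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np


def augment(kspace, rng):
    """Toy augmentations on multi-coil k-space of shape (C, H, W)."""
    img = np.fft.ifft2(kspace, norm="ortho")

    # Random horizontal/vertical flips (applied in image space).
    if rng.random() < 0.5:
        img = img[..., ::-1]
    if rng.random() < 0.5:
        img = img[..., ::-1, :]

    # Global phase variation (a spatial shift would be a linear phase).
    img = img * np.exp(1j * rng.uniform(-np.pi, np.pi))

    # Gamma contrast perturbation on magnitude, phase preserved.
    gamma = rng.uniform(0.8, 1.2)
    img = np.abs(img) ** gamma * np.exp(1j * np.angle(img))

    return np.fft.fft2(img, norm="ortho")


rng = np.random.default_rng(0)
k_aug = augment(np.ones((8, 64, 64), dtype=complex), rng)
```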
4. Empirical Evaluation
SDUM’s performance was benchmarked on standard and challenge datasets, consistently exceeding prior state-of-the-art baselines:
| Dataset/Task | Baseline | Baseline PSNR (dB) | SDUM PSNR (dB) | Δ PSNR (dB) |
|---|---|---|---|---|
| CMRxRecon2024 Task 1 | PromptMR+ (32×) | 35.15 | 35.70 | +0.55 |
| CMRxRecon2024 Task 2 | PromptMR+ (32×) | 33.81 | 33.95 | +0.14 |
| CMRxRecon2025 Multi-center | PromptMR+ | 32.92 | 33.18 | +0.26 |
| CMRxRecon2025 Multi-disease | PromptMR+ | 33.42 | 33.54 | +0.12 |
| CMRxRecon2025 5T | PromptMR+ | 33.82 | 34.23 | +0.41 |
| CMRxRecon2025 Pediatric | GENRE-CMR | 31.66 | 33.48 | +1.82 |
| fastMRI Brain Accel 4× | PC-RNN | 40.8 | 42.6 | +1.8 |
| fastMRI Brain Accel 6× | PC-RNN | 38.9 | 40.8 | +1.9 |
Ablation studies confirm the utility of each component: SWDC contributes +0.43 dB over standard data consistency, per-cascade CSME yields +0.51 dB, and UC provides +0.38 dB.
5. Scientific and Clinical Implications
SDUM demonstrates that a single deep unrolled model can generalize MRI reconstruction across protocols, field strengths, acquisition geometries, and clinical subgroups without retraining or protocol-specific tuning. The log-linear PSNR scaling law with respect to parameter count justifies investment in additional compute, mirroring trends observed in NLP and computer vision. Each architectural component (CSME, SWDC, UC, Restormer) is modular and amenable to replacement or enhancement (e.g., with 3D priors or self-supervised objectives) while retaining the core unrolled formulation. Fast inference (0.3–4 seconds per slice on H100 GPUs) and robust zero-shot generalization to unseen acquisition types (e.g., CEST) support readiness for prospective clinical deployment.
6. Outlook and Extensibility
SDUM establishes a universal, scalable foundation model paradigm for MRI reconstruction, integrating physics-based unrolling, transformer architectures, learned coil sensitivity estimation, sampling-adaptive data consistency, and metadata conditioning. Its empirically proven modularity, extensibility, and scaling properties position it as a potent platform for further research in MRI foundation models, advanced regularization schemes, and generalization across multi-institutional clinical infrastructures (Wang et al., 19 Dec 2025).