Multi-Grained Radiomics Attention (MGRA)
- MGRA is a fusion framework that synergizes handcrafted radiomics with deep learning using multi-grained attention for robust clinical predictions.
- It employs hierarchical modeling, deep feature mapping, and optimal transport–based co-attention to derive detailed and aggregated radiomics embeddings.
- MGRA improves outcome prediction and segmentation accuracy in clinical tasks like TIPS prognosis and lung cancer diagnosis, demonstrating enhanced model generalizability.
Multi-Grained Radiomics Attention (MGRA) is a feature interaction and aggregation paradigm designed to synergize handcrafted radiomics features with deep learning–derived representations in medical imaging. By leveraging multiple granularities of radiomics descriptors and advanced attention mechanisms, MGRA produces hierarchical, interpretable, and complementary multimodal embeddings. This approach is increasingly used to bolster outcome prediction, segmentation accuracy, and model generalizability in complex clinical tasks such as prognosis after Transjugular Intrahepatic Portosystemic Shunt (TIPS) (Dong et al., 12 Oct 2025), lung cancer diagnosis (Chen et al., 2021), and fusion-based subtype discrimination for lung adenocarcinoma (Zhou et al., 2023).
1. Hierarchical Modeling of Radiomics Features
MGRA exploits the natural hierarchy present in handcrafted radiomics attributes derived from medical images. Radiomics features are grouped according to filter type, spatial statistics, or anatomical region—e.g., intensity (first-order), texture (Haralick, GLCM), and morphological descriptors. This multilevel grouping enables MGRA to process information at:
- Fine granularity (Level I): Each feature (e.g., mean intensity, shape sphericity) serves as a distinct token.
- Grouped granularity (Level II): Features are partitioned into joint "filter–feature" groups; a typical implementation uses 103 groups, each encoded into a low-dimensional embedding via self-normalizing neural networks.
- Coarse granularity: Global aggregation of group embeddings via attention mechanisms yields summary tokens reflecting higher-order radiomics information.
This hierarchical architecture ensures that both detailed and broad patterns in the image data are systematically captured.
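The three granularities above can be sketched numerically. The following is a minimal NumPy illustration, not the published implementation: the group count (103) matches the text, but the per-group feature count, embedding width, and random weights are placeholders, and the self-normalizing network is reduced to a single shared SELU layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def selu(x):
    # Self-normalizing activation; constants are the standard SELU values.
    alpha, scale = 1.6732632423543772, 1.0507009873554805
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1))

# Hypothetical sizes: 103 filter-feature groups, 8 features per group, 16-dim embeddings.
n_groups, feats_per_group, d = 103, 8, 16
radiomics = rng.standard_normal((n_groups, feats_per_group))  # Level I: raw feature tokens

# Level II: each group encoded by a shared self-normalizing layer (illustrative weights).
W = rng.standard_normal((feats_per_group, d)) / np.sqrt(feats_per_group)
group_tokens = selu(radiomics @ W)                      # (103, 16)

# Coarse level: attention-weighted pooling of group tokens into one summary token.
V = rng.standard_normal((d, d)) / np.sqrt(d)
w = rng.standard_normal(d)
scores = np.tanh(group_tokens @ V) @ w                  # (103,) unnormalized attention
a = np.exp(scores - scores.max())
a /= a.sum()                                            # softmax attention weights
summary_token = a @ group_tokens                        # (16,) coarse summary

print(group_tokens.shape, summary_token.shape)
```

In a trained model the fine tokens, group tokens, and summary token would all be exposed to the downstream attention stages rather than only the final summary.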
2. Deep Learning Feature Alignment and Fusion
MGRA fuses radiomics features with deep learning–derived voxel representations through a series of mapping and co-attention operations:
- Deep Feature Mapping: DL features (e.g., extracted from MedicalNet) are projected via shared fully connected layers into the same embedding space as radiomics groups.
- Optimal Transport–Based Co-Attention: Fine-grained co-attention is achieved by solving a discrete Kantorovich optimal transport problem. With DL tokens $\{d_i\}_{i=1}^{m}$ and radiomics tokens $\{r_j\}_{j=1}^{n}$, the cost matrix is the ℓ₂ distance $C_{ij} = \lVert d_i - r_j \rVert_2$; the matching flow is computed using the generalized Sinkhorn–Knopp algorithm. Mathematically,

$$T^* = \arg\min_{T \in \Pi(\mu, \nu)} \sum_{i,j} T_{ij} C_{ij} - \varepsilon H(T),$$

where $\Pi(\mu, \nu)$ is the set of couplings with marginals $\mu$ and $\nu$, and $H(T)$ is the entropic regularizer.
- Global Attention Pooling: Coarse-grained aggregation uses attention-weighted sums:

$$z = \sum_{g} a_g h_g, \qquad a_g = \frac{\exp\!\left(w^\top \tanh(V h_g)\right)}{\sum_{g'} \exp\!\left(w^\top \tanh(V h_{g'})\right)},$$

where the attention weights $a_g$ are computed by a parameterized softmax over nonlinear transformations of the group embeddings $h_g$.
This integration produces concatenated multimodal embeddings that combine complementary details and statistical summaries applicable to downstream prognostic or diagnostic tasks.
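The OT-based co-attention step can be sketched with a plain Sinkhorn–Knopp iteration. This is a simplified illustration under assumed settings (uniform marginals, regularization strength `eps=0.5`, random placeholder tokens), not the framework's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def sinkhorn(C, eps=0.5, iters=500):
    """Entropy-regularized OT plan between uniform marginals via Sinkhorn-Knopp."""
    m, n = C.shape
    mu, nu = np.full(m, 1.0 / m), np.full(n, 1.0 / n)
    K = np.exp(-C / eps)            # Gibbs kernel
    u = np.ones(m)
    for _ in range(iters):
        v = nu / (K.T @ u)          # scale columns to match nu
        u = mu / (K @ v)            # scale rows to match mu
    return u[:, None] * K * v[None, :]   # transport plan T

# Hypothetical token sets: 20 DL tokens and 103 radiomics group tokens, 16-dim each.
dl = rng.standard_normal((20, 16))
rad = rng.standard_normal((103, 16))

# Cost matrix: pairwise l2 distances between DL and radiomics tokens.
C = np.linalg.norm(dl[:, None, :] - rad[None, :, :], axis=-1)

T = sinkhorn(C)
# Co-attended radiomics context per DL token: transport-weighted average of radiomics tokens.
attended = (T / T.sum(axis=1, keepdims=True)) @ rad      # (20, 16)

print(T.shape, attended.shape)
```

Each row of the transport plan plays the role of a co-attention distribution: it tells a DL token which radiomics tokens to read from, while the marginal constraints force every radiomics token to be used.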
3. Multi-Stage Framework Integration
MGRA is integrated into a broader multimodal prognosis pipeline, particularly as described in the MultiTIPS framework for post-TIPS outcome prediction (Dong et al., 12 Oct 2025). Key stages include:
- Progressive Orthogonal Disentanglement (POD): After MGRA aggregation, the cosine similarity between DL and radiomics embeddings is tracked; the most redundant pairs are driven toward orthogonality under dynamic thresholds, suppressing shared information while preserving complementary content.
- Clinically Guided Prognostic Enhancement (CGPE): Clinical variables are encoded by dedicated neural networks, and transformer encoders process unimodal features. OT-based co-attention is then applied between multimodal imaging and clinical embeddings, refining the unified representation with clinical guidance. The final embedding for prediction concatenates guided features and original aggregated representations.
This modular architecture balances imaging and clinical information, enhancing robustness and discrimination in multi-task predictions such as survival, portal pressure gradients (PPG), and overt hepatic encephalopathy (OHE).
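The POD stage's redundancy test can be sketched as follows. This is a hedged illustration, not the published loss: the batch size, embedding width, and the 90th-percentile dynamic threshold are assumptions, and the squared-cosine penalty stands in for whatever orthogonality objective the framework actually optimizes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical paired embeddings: 32 samples, one DL and one radiomics vector each.
dl = rng.standard_normal((32, 16))
rad = rng.standard_normal((32, 16))

def cosine(a, b):
    # Row-wise cosine similarity between two batches of vectors.
    return np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

sim = cosine(dl, rad)

# Dynamic threshold (illustrative): only the most redundant pairs are penalized this step.
tau = np.quantile(np.abs(sim), 0.9)
mask = np.abs(sim) >= tau

# Orthogonality penalty: drives the selected cosine similarities toward zero in training.
pod_loss = float(np.mean(sim[mask] ** 2))
print(pod_loss >= 0.0)
```

Lowering the quantile over training steps would progressively widen the set of pairs being disentangled, which is the "progressive" aspect of POD.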
4. Experimental Validation and Quantitative Outcomes
MGRA demonstrates consistent performance improvements over unimodal and simpler fusion methods. Quantitative results in post-TIPS prognosis (Dong et al., 12 Oct 2025) and other domains show:
| Metric | MGRA-based Model | Competing Approach | Context/Task |
|---|---|---|---|
| C-index (Survival) | +2 to +3 points (absolute) | Lower | TIPS survival prediction |
| Mean Brier Score (mBS) | Decreased | Higher | Survival calibration |
| Dice Score | 0.774 ± 0.05 | Lower | MS lesion segmentation (Alsahanova et al., 17 Jun 2025) |
| Validation SDD | 0.18 ± 0.09 | 0.21 ± 0.06 | Model stability in segmentation |
Ablation studies confirm that multi-grained fusion (using both fine and coarse tokens) yields more discriminative and generalizable models. External validation demonstrates robust cross-domain stratification (e.g., improved Kaplan–Meier separation).
5. Interpretability and Clinical Utility
MGRA’s hierarchical attention and co-attention mechanisms yield models whose predictions can be attributed to specific radiomics groups or anatomical regions. Integrated gradient analyses and attention heatmaps reveal the contribution of texture features (e.g., GLCM), spatial morphology, and even clinical indicators (MELD, Child-Pugh scores) to outcome predictions. This transparency supports clinical acceptance, facilitates biomarker discovery, and enables the adaptation of MGRA to new diseases or anatomical sites.
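To make the integrated-gradients attribution concrete, here is a minimal sketch on a toy scorer. Everything here is a stand-in assumption: the model is a 10-feature sigmoid scorer rather than the actual MGRA network, the baseline is an all-zeros vector, and the path integral uses a 64-step midpoint rule.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical scorer: sigmoid over a linear combination of 10 radiomics features.
w = rng.standard_normal(10)

def f(x):
    return 1.0 / (1.0 + np.exp(-(x @ w)))

def grad_f(X):
    # Analytic gradient of the sigmoid scorer, evaluated at each row of X.
    s = f(X)
    return (s * (1.0 - s))[:, None] * w

def integrated_gradients(x, baseline, steps=64):
    # Midpoint Riemann sum along the straight-line path from baseline to x.
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    avg_grad = grad_f(path).mean(axis=0)
    return (x - baseline) * avg_grad

x = rng.standard_normal(10)
baseline = np.zeros(10)
attr = integrated_gradients(x, baseline)

# Completeness axiom: attributions sum to the change in model output.
print(abs(attr.sum() - (f(x) - f(baseline))) < 1e-3)
```

In the radiomics setting, per-feature attributions like `attr` can be re-aggregated over the filter–feature groups of Section 1, which is what allows predictions to be attributed to specific radiomics groups rather than individual descriptors.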
MGRA enables enhanced, non-invasive prognostic modeling for TIPS and similar interventions, with strengths in both interpretability and generalization. Limitations arise due to dependence on the quality of feature extraction, grouping strategies, and center-specific imaging protocols; future work may refine these aspects via more robust LLMs, self-supervised pretraining, extended anatomical coverage, and harmonization strategies.
6. Future Directions and Challenges
Further refinement of MGRA may involve:
- Incorporation of additional imaging modalities (e.g., MRI, PET) or anatomical structures for expanded applicability.
- Exploration of more advanced, possibly transformer-based, attention pooling and grouping algorithms to improve the granularity and robustness of the fusion.
- Scaling to larger, more heterogeneous, or longitudinal datasets to validate transferability and clinical generalization.
- Integration with weakly-supervised or unsupervised pretraining frameworks to alleviate annotation burdens.
Challenges remain in harmonizing radiomics extraction across sites and ensuring the reliability of optimal transport–based attention as data scale and complexity increase. However, early evidence suggests that multi-grained attention architectures such as MGRA significantly advance multimodal integration in medical AI systems.