SDR-Former: A Siamese Dual-Resolution Transformer for Liver Lesion Classification Using 3D Multi-Phase Imaging
Abstract: Automated classification of liver lesions in multi-phase CT and MR scans is clinically significant but challenging. This study proposes a novel Siamese Dual-Resolution Transformer (SDR-Former) framework, specifically designed for liver lesion classification in 3D multi-phase CT and MR imaging with varying phase counts. The proposed SDR-Former uses a streamlined Siamese Neural Network (SNN) to process multi-phase imaging inputs, yielding robust feature representations while maintaining computational efficiency. The weight-sharing design of the SNN is further enriched by a hybrid Dual-Resolution Transformer (DR-Former), comprising a 3D Convolutional Neural Network (CNN) and a tailored 3D Transformer for processing high- and low-resolution images, respectively. This hybrid sub-architecture captures detailed local features and global contextual information, thereby boosting the SNN's feature extraction capabilities. Additionally, a novel Adaptive Phase Selection Module (APSM) is introduced, promoting phase-specific intercommunication and dynamically adjusting each phase's influence on the diagnostic outcome. The proposed SDR-Former framework has been validated through comprehensive experiments on two clinical datasets: a three-phase CT dataset and an eight-phase MR dataset. The experimental results affirm the efficacy of the proposed framework. To support the scientific community, we are releasing our extensive multi-phase MR dataset for liver lesion analysis to the public. This dataset, the first publicly available multi-phase MR dataset in this field, also underpins the MICCAI LLD-MMRI Challenge. The dataset is accessible at: https://bit.ly/3IyYlgN.
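As a rough illustration of two ideas in the abstract, the sketch below applies one shared-weight encoder to every phase (the Siamese principle) and then fuses the per-phase features with softmax weights, so each phase's influence depends on the input. It is a toy under stated assumptions, not the authors' DR-Former or APSM: the encoder is a single random linear layer with tanh, the scoring vector is fixed rather than learned, and every name (`make_shared_encoder`, `fuse_phases`, `score_vector`) is hypothetical.

```python
import math
import random


def make_shared_encoder(in_dim, out_dim, seed=0):
    """Build one encoder whose weights are reused for every phase."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(x):
        # Linear projection plus tanh; a stand-in for a real 3D backbone.
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]

    return encode


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_phases(phases, encode, score_vector):
    """Encode each phase with the same weights, then take a weighted sum."""
    features = [encode(p) for p in phases]
    # One scalar score per phase (assumed scoring rule: a dot product).
    scores = [sum(a * b for a, b in zip(score_vector, f)) for f in features]
    phase_weights = softmax(scores)
    dim = len(features[0])
    fused = [sum(w * f[i] for w, f in zip(phase_weights, features)) for i in range(dim)]
    return fused, phase_weights


# The same encoder handles a 3-phase and an 8-phase input without changes.
encoder = make_shared_encoder(in_dim=4, out_dim=3)
data_rng = random.Random(1)
ct_phases = [[data_rng.random() for _ in range(4)] for _ in range(3)]
mr_phases = [[data_rng.random() for _ in range(4)] for _ in range(8)]
score_vector = [0.5, -0.2, 0.1]  # hypothetical; would be learned in practice

fused_ct, weights_ct = fuse_phases(ct_phases, encoder, score_vector)
fused_mr, weights_mr = fuse_phases(mr_phases, encoder, score_vector)
print(len(weights_ct), len(weights_mr), len(fused_ct), len(fused_mr))
```

Because the encoder's parameters do not depend on the number of phases, the parameter count stays fixed whether three or eight phases are supplied; only the length of the weight list changes.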