Swin UNETR++: Advancing Transformer-Based Dense Dose Prediction Towards Fully Automated Radiation Oncology Treatments (2311.06572v3)

Published 11 Nov 2023 in eess.IV and cs.CV

Abstract: The field of Radiation Oncology is uniquely positioned to benefit from the use of artificial intelligence to fully automate the creation of radiation treatment plans for cancer therapy. This time-consuming and specialized task combines patient imaging with organ and tumor segmentation to generate a 3D radiation dose distribution to meet clinical treatment goals, similar to voxel-level dense prediction. In this work, we propose Swin UNETR++, that contains a lightweight 3D Dual Cross-Attention (DCA) module to capture the intra and inter-volume relationships of each patient's unique anatomy, which fully convolutional neural networks lack. Our model was trained, validated, and tested on the Open Knowledge-Based Planning dataset. In addition to metrics of Dose Score $\overline{S_{\text{Dose}}}$ and DVH Score $\overline{S_{\text{DVH}}}$ that quantitatively measure the difference between the predicted and ground-truth 3D radiation dose distribution, we propose the qualitative metrics of average volume-wise acceptance rate $\overline{R_{\text{VA}}}$ and average patient-wise clinical acceptance rate $\overline{R_{\text{PA}}}$ to assess the clinical reliability of the predictions. Swin UNETR++ demonstrates near-state-of-the-art performance on validation and test dataset (validation: $\overline{S_{\text{DVH}}}$=1.492 Gy, $\overline{S_{\text{Dose}}}$=2.649 Gy, $\overline{R_{\text{VA}}}$=88.58%, $\overline{R_{\text{PA}}}$=100.0%; test: $\overline{S_{\text{DVH}}}$=1.634 Gy, $\overline{S_{\text{Dose}}}$=2.757 Gy, $\overline{R_{\text{VA}}}$=90.50%, $\overline{R_{\text{PA}}}$=98.0%), establishing a basis for future studies to translate 3D dose predictions into a deliverable treatment plan, facilitating full automation.

Summary

  • The paper introduces Swin UNETR++ with a lightweight 3D Dual Cross-Attention module that enhances dense dose prediction in radiation oncology.
  • The model achieved a validation DVH Score of 1.492 Gy and a Dose Score of 2.649 Gy, with high clinical acceptance rates on both the validation and test sets.
  • It integrates patient imaging and segmentation for voxel-level prediction, supporting the automation of complex radiation treatment plan generation.

The paper entitled "Swin UNETR++: Advancing Transformer-Based Dense Dose Prediction Towards Fully Automated Radiation Oncology Treatments" addresses a critical challenge in the field of Radiation Oncology—automating the generation of radiation treatment plans for cancer therapy. This task involves integrating patient imaging with organ and tumor segmentation to develop a 3D radiation dose distribution that aligns with clinical treatment goals, which is akin to voxel-level dense prediction.

The authors introduce Swin UNETR++, an advanced model based on the transformer architecture that incorporates a lightweight 3D Dual Cross-Attention (DCA) module. This module is designed to capture both intra- and inter-volume relationships specific to each patient's anatomy, a capability that traditional fully convolutional neural networks generally lack.
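
To make the attention mechanism concrete, the sketch below shows a channel-plus-spatial attention gate applied to a single 3D feature map in PyTorch. This is a deliberately simplified, CBAM-style stand-in rather than the authors' actual DCA module; the class name, channel count, reduction ratio, and kernel size are illustrative assumptions.

```python
# A simplified, CBAM-style channel + spatial attention gate for 3D feature maps.
# NOTE: an illustrative stand-in, not the Swin UNETR++ DCA module itself.
import torch
import torch.nn as nn


class DualAttentionGate3D(nn.Module):
    """Re-weights a 3D feature map along the channel axis, then along the spatial axes."""

    def __init__(self, channels: int, reduction: int = 4, spatial_kernel: int = 7):
        super().__init__()
        # Channel branch: global-average-pool the volume, then score each channel.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: collapse channels into a single per-voxel attention map.
        self.spatial_gate = nn.Sequential(
            nn.Conv3d(channels, 1, kernel_size=spatial_kernel, padding=spatial_kernel // 2),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)   # which feature channels matter for this volume
        x = x * self.spatial_gate(x)   # where in the volume those features matter
        return x


if __name__ == "__main__":
    feats = torch.randn(2, 48, 32, 32, 32)                 # (batch, channels, depth, height, width)
    print(DualAttentionGate3D(channels=48)(feats).shape)   # torch.Size([2, 48, 32, 32, 32])
```

In an encoder-decoder network, a gate of this kind would typically be applied to skip-connection features before they are merged with the decoder path.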

The model was trained, validated, and tested on the Open Knowledge-Based Planning (OpenKBP) dataset. The performance of Swin UNETR++ was assessed using both quantitative and qualitative metrics:

  1. Quantitative Metrics:
    • Dose Score ($\overline{S_{\text{Dose}}}$) and DVH Score ($\overline{S_{\text{DVH}}}$): These metrics measure the difference between the predicted and the actual (ground-truth) 3D radiation dose distributions (see the first sketch after this list).
    • On the validation dataset, the model achieved:
      • $\overline{S_{\text{DVH}}} = 1.492 \, \text{Gy}$
      • $\overline{S_{\text{Dose}}} = 2.649 \, \text{Gy}$
    • On the test dataset, it obtained:
      • $\overline{S_{\text{DVH}}} = 1.634 \, \text{Gy}$
      • $\overline{S_{\text{Dose}}} = 2.757 \, \text{Gy}$
  2. Qualitative Metrics:
    • The paper introduces two new qualitative metrics to evaluate the clinical reliability of the model (see the second sketch after this list):
      • Average Volume-Wise Acceptance Rate ($\overline{R_{\text{VA}}}$)
      • Average Patient-Wise Clinical Acceptance Rate ($\overline{R_{\text{PA}}}$)
    • In the validation set, the model demonstrated:
      • $\overline{R_{\text{VA}}} = 88.58\%$
      • $\overline{R_{\text{PA}}} = 100.0\%$
    • In the test set, these rates were:
      • $\overline{R_{\text{VA}}} = 90.50\%$
      • $\overline{R_{\text{PA}}} = 98.0\%$
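
For concreteness, the first sketch below computes a Dose Score as a masked voxel-wise mean absolute error (in Gy) and builds a DVH-style score from two example per-structure statistics (mean dose and D_99). This is an illustrative approximation, not the official OpenKBP evaluation code; the choice of DVH statistics and the mask convention are assumptions.

```python
# Illustrative sketch only, not the official OpenKBP scoring code.
import numpy as np


def dose_score(pred: np.ndarray, truth: np.ndarray, eval_mask: np.ndarray) -> float:
    """Mean absolute dose difference (Gy) over voxels inside the evaluation mask."""
    return float(np.abs(pred[eval_mask] - truth[eval_mask]).mean())


def example_dvh_stats(dose: np.ndarray, structure_mask: np.ndarray) -> dict:
    """Two example DVH statistics for one structure: mean dose and D_99
    (the dose received by at least 99% of the structure's volume)."""
    voxels = dose[structure_mask]
    return {"D_mean": float(voxels.mean()), "D_99": float(np.percentile(voxels, 1))}


def dvh_style_score(pred: np.ndarray, truth: np.ndarray, structure_masks: dict) -> float:
    """Average absolute difference of the example DVH statistics across structures."""
    diffs = []
    for mask in structure_masks.values():
        p, t = example_dvh_stats(pred, mask), example_dvh_stats(truth, mask)
        diffs.extend(abs(p[k] - t[k]) for k in p)
    return float(np.mean(diffs))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    shape = (64, 64, 64)
    pred, truth = rng.uniform(0, 70, shape), rng.uniform(0, 70, shape)  # toy dose grids in Gy
    body = np.ones(shape, dtype=bool)                                   # toy evaluation mask
    ptv = np.zeros(shape, dtype=bool)
    ptv[20:40, 20:40, 20:40] = True                                     # toy target structure
    print(dose_score(pred, truth, body), dvh_style_score(pred, truth, {"PTV": ptv}))
```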

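The paper's exact clinical criteria are not reproduced in this summary, so the second sketch is only one plausible reading of the two acceptance rates: each structure ("volume") receives a binary flag for whether its predicted dose meets its clinical criteria, the volume-wise rate averages those flags over all structures and patients, and the patient-wise rate (as an assumption here) counts a patient as accepted when every one of their volumes passes. The `meets_criteria` callable is hypothetical.

```python
# Hypothetical sketch of volume-wise and patient-wise clinical acceptance rates.
# `meets_criteria` is a stand-in for the clinical criteria, which are not
# reproduced here; the "all volumes pass" rule for patient acceptance is an
# assumption, not necessarily the paper's definition.
from typing import Callable, Dict, List, Tuple

DVHStats = Dict[str, float]  # e.g. {"D_mean": 45.2, "D_99": 62.1} for one structure


def acceptance_rates(
    patients: List[Dict[str, DVHStats]],
    meets_criteria: Callable[[str, DVHStats], bool],
) -> Tuple[float, float]:
    """Return (average volume-wise acceptance rate, average patient-wise acceptance rate)."""
    volume_flags: List[bool] = []
    patient_flags: List[bool] = []
    for structures in patients:
        flags = [meets_criteria(name, stats) for name, stats in structures.items()]
        volume_flags.extend(flags)
        patient_flags.append(all(flags))  # assumption: patient accepted iff every volume passes
    return sum(volume_flags) / len(volume_flags), sum(patient_flags) / len(patient_flags)


if __name__ == "__main__":
    patients = [{"PTV": {"D_99": 62.0}, "Brainstem": {"D_mean": 20.0}}]
    toy_criterion = lambda name, stats: all(v < 70.0 for v in stats.values())  # purely illustrative
    print(acceptance_rates(patients, toy_criterion))  # (1.0, 1.0)
```
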
These results highlight Swin UNETR++ as nearly state-of-the-art in automating radiation dose prediction, underscoring its potential to facilitate full automation of radiation treatment plan creation. The combination of high quantitative performance and robust clinical acceptance rates provides a solid foundation for future research aimed at translating 3D dose predictions into practical, deliverable treatment plans.