Deep Learning Evaluation of Radiotherapy Plan Quality

Determine whether deep learning models can be reliably used to quantitatively evaluate radiotherapy treatment plan quality by developing calibrated and unbiased evaluation approaches and by selecting appropriate evaluation metrics tailored to specific clinical objectives.

Background

Quantitative evaluation of radiotherapy treatment plan quality is central to clinical decision-making and to the comparison of planning strategies. In the source paper, the authors assess GPT-4V’s ability to compare and rank plans, but they note that systematic deep learning-based evaluation remains difficult: the models can be biased and uncalibrated, and the appropriate evaluation metrics depend on the clinical objectives.

The authors explicitly state that using deep learning for evaluation is still an open problem. This reflects broader concerns about the reliability and trustworthiness of deep learning models when used as evaluators, as well as the need to define and align evaluation metrics with diverse clinical goals across disease sites and institutions.
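
To make the calibration concern concrete, the following minimal Python sketch computes an expected calibration error (ECE) for a hypothetical model-based plan evaluator. It is not taken from the paper. The function name, the equal-width binning scheme, and the setup itself (an evaluator that reports a confidence for each plan comparison, which is then checked against a reference judgement such as a clinician's preference) are assumptions made for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Expected calibration error (ECE) with equal-width confidence bins.

    confidences: the evaluator's stated probability, in [0, 1], that its
                 verdict on a plan comparison is right (hypothetical input).
    correct:     1 if the verdict matched the reference judgement, else 0.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one sample")

    # Group samples by confidence; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, y))

    # Weighted average of |mean confidence - observed accuracy| over non-empty bins.
    ece = 0.0
    for b in bins:
        if not b:
            continue
        mean_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(y for _, y in b) / len(b)
        ece += (len(b) / n) * abs(mean_conf - accuracy)
    return ece


# Illustrative, made-up inputs: four comparisons at 90% stated confidence,
# three of which agree with the reference judgement.
example_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

An ECE near zero means the stated confidence tracks how often the evaluator agrees with the reference, while a large value signals over- or under-confidence. The measure says nothing about bias, or about whether the reference judgement itself reflects the clinical objective; those are separate parts of the open problem.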

References

One important issue is quantitatively evaluating the plan quality. Generally, using deep learning for evaluation is still an open problem as deep learning models can be biased and uncalibrated, and different evaluation metrics can be proposed depending on the clinical objectives.

— Automated radiotherapy treatment planning guided by GPT-4Vision (2406.15609 - Liu et al., 21 Jun 2024) in Subsection 3.2, Assessment of GPT-4V on evaluating treatment plans
