Explaining why ChatGPT achieves partial accuracy in REF scoring
Determine how ChatGPT-4 attains partial accuracy when estimating Research Excellence Framework (REF) 2021 quality scores; specifically, establish whether its performance arises primarily from inferring quality from the authors' own claims in the article text rather than from applying external knowledge.
References
It is not clear why it can score articles with some degree of accuracy, but it might typically deduce them from author claims inside an article rather than by primarily applying external information.
— Can ChatGPT evaluate research quality?
(2402.05519 - Thelwall, 8 Feb 2024) in Section 6 (Conclusion)