Disentangle ChatGPT bias from genuine international quality differences in country-associated score gaps
Determine whether the observed differences in ChatGPT 4o-mini REF-style research quality scores across first-author countries are caused by bias within ChatGPT, by underlying international differences in research quality, or by both. Quantify the relative contribution of each factor within each Scopus broad field so that international comparisons can be made fairly.
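One way to frame this quantification is a regression decomposition: estimate the raw country score gap, then re-estimate it after controlling for an independent quality proxy; the residual gap approximates the bias component. The sketch below is purely illustrative and uses synthetic data with made-up effect sizes (the two-country setup, the quality proxy, and the bias term are all assumptions, not results from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical synthetic data: two countries, a latent quality signal,
# and a rater-bias term. All magnitudes are invented for illustration.
country = rng.integers(0, 2, n)                # 0 = country A, 1 = country B
quality = rng.normal(0, 1, n) + 0.5 * country  # B has genuinely higher quality
bias = 0.3 * country                           # hypothetical scorer bias toward B
score = 2.0 + quality + bias + rng.normal(0, 0.5, n)

def country_coef(X, y):
    """OLS via least squares; returns the coefficient on the country dummy."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]

# Raw gap: country dummy only (captures quality difference + bias).
gap_raw = country_coef(np.column_stack([np.ones(n), country]), score)

# Gap after controlling for the quality proxy (remaining gap ~ bias share).
gap_ctl = country_coef(np.column_stack([np.ones(n), country, quality]), score)

print(f"raw country gap:           {gap_raw:.2f}")
print(f"gap controlling quality:   {gap_ctl:.2f}")
```

In practice the quality proxy would itself be contested (e.g. citation indicators carry their own country effects), which is why the decomposition would need to be run separately within each Scopus broad field and with multiple proxies.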
References
The first-author country differences found could indicate ChatGPT bias and/or underlying international differences in the quality of research, with the latter widely believed by policy makers to occur. Further research is needed to identify whether both are contributors and, if so, the relative balance between them within each field.
— Research evaluation with ChatGPT: Is it age, country, length, or field biased?
(2411.09768 - Thelwall et al., 2024) in Conclusions