Extent of citation-related bias within UoAs in ChatGPT research quality scores

Ascertain the magnitude and characteristics of within-Unit-of-Assessment citation-related biases in ChatGPT-generated research quality scores for REF2021 journal articles, beyond the known between-UoA biases, to establish their impact on indicator validation.

Background

The paper uses ChatGPT scores as a primary gold standard for article-level research quality due to the absence of public individual REF scores. Prior work suggests ChatGPT exhibits field-level (between-UoA) biases that correlate with citation intensity, raising concerns about fairness in validation.

While these between-UoA biases are acknowledged, the authors note that within-UoA citation biases may also exist and could inflate or deflate the reported correlations with citation-based indicators. Quantifying these within-UoA biases is therefore necessary to interpret the validation results and refine the methodology.

References

The ChatGPT quality scores also have field biases relative to the REF, tending to give higher scores to some UoAs than others, with the differences seeming to favour more highly cited fields (Thelwall & Kurt, 2024). Thus, whilst the known between-UoA biases do not directly influence the within-UoA correlations reported here, their existence suggests that there may well also be within-UoA citation biases in the ChatGPT scores, although their extent is unknown.

Is OpenAlex Suitable for Research Quality Evaluation and Which Citation Indicator is Best? (2502.18427 - Thelwall et al., 25 Feb 2025) in Discussion, Limitations