Can LLMs benefit from research evaluation training data?

Investigate whether fine-tuning large language models with domain-specific research evaluation training data (e.g., article–score pairs or review judgements) improves their accuracy for article quality assessment, and quantify any gains across fields.
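
A minimal sketch of how such an experiment could be set up, under assumptions not stated in the question: article–score pairs with expert quality ratings on a 1–4 scale, a chat-style fine-tuning record format, and per-field correlation with expert scores as the accuracy measure. The finetune.jsonl path, the field names, and the toy data are all illustrative.

```python
"""Hedged sketch: turn article-score pairs into fine-tuning records and
quantify per-field agreement with expert scores (assumed setup, not a
definitive protocol)."""
import json
from collections import defaultdict
from statistics import correlation  # Pearson's r (Python 3.10+)


def to_finetune_record(article: dict) -> dict:
    """Convert one article-score pair into a chat-style training example.
    The message format is an assumption; adapt it to whichever
    fine-tuning API is actually used."""
    return {
        "messages": [
            {"role": "system",
             "content": "You are a research assessor. Rate the article's quality "
                        "on a 1-4 scale and reply with the number only."},
            {"role": "user",
             "content": f"Title: {article['title']}\n\nAbstract: {article['abstract']}"},
            {"role": "assistant", "content": str(article["expert_score"])},
        ]
    }


def per_field_correlation(records: list[dict]) -> dict[str, float]:
    """Correlation between model scores and expert scores, per field, so that
    any gain from fine-tuning can be compared field by field."""
    by_field: dict[str, tuple[list, list]] = defaultdict(lambda: ([], []))
    for r in records:
        expert, predicted = by_field[r["field"]]
        expert.append(r["expert_score"])
        predicted.append(r["model_score"])
    return {field: correlation(e, p)
            for field, (e, p) in by_field.items() if len(e) > 1}


if __name__ == "__main__":
    # Toy article-score pairs; entirely hypothetical data for illustration.
    articles = [
        {"field": "Chemistry", "title": "A", "abstract": "...", "expert_score": 3, "model_score": 2.8},
        {"field": "Chemistry", "title": "B", "abstract": "...", "expert_score": 2, "model_score": 2.1},
        {"field": "Sociology", "title": "C", "abstract": "...", "expert_score": 4, "model_score": 3.0},
        {"field": "Sociology", "title": "D", "abstract": "...", "expert_score": 1, "model_score": 2.2},
    ]

    # Write the fine-tuning data as JSONL.
    with open("finetune.jsonl", "w") as f:
        for a in articles:
            f.write(json.dumps(to_finetune_record(a)) + "\n")

    # Quantify agreement with expert scores per field; the gain from
    # fine-tuning would be the change in these values before vs. after.
    print(per_field_correlation(articles))
```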

Background

While LLMs are pre-trained broadly on massive corpora, it is not yet known whether targeted fine-tuning on evaluation-specific data improves their performance on research quality assessment tasks.

Answering this question would indicate whether collecting and curating such datasets is worthwhile and how they should be structured.

References

One unknown at the time of writing (early 2024) is whether LLMs can benefit from specific research evaluation training data.