Accountability for Errors When Using LLMs in Peer Review and Evaluation

Establish responsibility and accountability frameworks specifying who should be held liable for errors made by large language models when they are used in tasks such as peer review, manuscript assessment, or proposal evaluation.

Background

The paper cautions against delegating research assessment tasks (e.g., peer review, proposal evaluation) to LLMs, citing ethical and practical concerns around bias, fairness, and public trust. A key unresolved issue is accountability for mistakes made by these systems.

Clarifying responsibility is essential for maintaining integrity and fairness in research evaluation processes if AI tools are used in any capacity.

References

While we can hold people responsible for misinterpreting a proposal or an article, it is unclear who should be held responsible if the machine makes an error.

What is the Role of Large Language Models in the Evolution of Astronomy Research? (2409.20252 - Fouesneau et al., 30 Sep 2024) in Section: Ethical and Legal Concerns — Research-specific Concerns