Existence of Loopholes in PaperBench Evaluation
Ascertain whether the PaperBench evaluation, including its rubrics and LLM-based judging process, contains exploitable loopholes that can lead to false negatives or false positives, given the large number of rubric nodes and the complexity of paper replication.
References
PaperBench rubrics have been carefully designed to avoid false negatives and false positives, but given the large number of nodes and the complexity of paper replication, we cannot yet rule out loopholes in our evaluation.
— PaperBench: Evaluating AI's Ability to Replicate AI Research
(2504.01848 - Starace et al., 2 Apr 2025) in Appendix A.3, Specification gaming and adversarial agents