
Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment (2109.09774v1)

Published 20 Sep 2021 in cs.DL and cs.LG

Abstract: In this paper we revisit the 2014 NeurIPS experiment that examined inconsistency in conference peer review. We determine that 50% of the variation in reviewer quality scores was subjective in origin. Further, with seven years passing since the experiment we find that for accepted papers, there is no correlation between quality scores and impact of the paper as measured as a function of citation count. We trace the fate of rejected papers, recovering where these papers were eventually published. For these papers we find a correlation between quality scores and impact. We conclude that the reviewing process for the 2014 conference was good for identifying poor papers, but poor for identifying good papers. We give some suggestions for improving the reviewing process but also warn against removing the subjective element. Finally, we suggest that the real conclusion of the experiment is that the community should place less onus on the notion of 'top-tier conference publications' when assessing the quality of individual researchers. For NeurIPS 2021, the PCs are repeating the experiment, as well as conducting new ones.

Citations (35)

Summary

  • The paper reanalyzes the 2014 NeurIPS experiment to reveal that about 50% of reviewer score variance is driven by subjectivity.
  • The paper finds that calibrated review scores for accepted papers do not correlate with subsequent citation impact, questioning review efficacy.
  • The paper shows that many rejected submissions with higher quality scores later achieve significant citation impact, indicating missed opportunities.

Re-evaluating the 2014 NeurIPS Experiment in Peer Review Inconsistencies

The paper "Inconsistency in Conference Peer Review: Revisiting the 2014 NeurIPS Experiment" by Corinna Cortes and Neil D. Lawrence presents an insightful reanalysis of the 2014 NeurIPS experiment, which scrutinizes the inconsistencies present in the peer review process of scientific conferences, specifically focusing on NeurIPS. By exploring the subjectivity of reviewer assessments and the correlation between reviewer scores and subsequent citation impact, the research sheds light on the efficacy of peer review processes in identifying high-impact scientific contributions.

Overview of the Study

In 2014, the NeurIPS conference ran an experiment in which roughly 10% of submitted papers were reviewed independently by two separate committees, to gauge the consistency of their acceptance decisions. The initial findings revealed substantial inconsistency: the committees reached divergent decisions on approximately 25% of the duplicated papers. The present paper revisits that data to identify the origin of these inconsistencies and to trace their downstream impact.
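To make the headline numbers concrete, here is a minimal sketch of how such agreement statistics can be computed from a two-committee contingency table. The counts below are illustrative values chosen to be consistent with the roughly 25% disagreement quoted above, not the experiment's published table:

```python
# Illustrative two-committee contingency table (hypothetical counts chosen
# to be consistent with the ~25% disagreement above, not the published data).
# Rows: committee 1's decision; columns: committee 2's decision.
import numpy as np

table = np.array([
    [22,  22],   # committee 1 accepts: [both accept, committee 2 rejects]
    [21, 101],   # committee 1 rejects: [committee 2 accepts, both reject]
])

n = table.sum()
disagree = (table[0, 1] + table[1, 0]) / n                    # raw disagreement
accept_rate = (table[0].sum() + table[:, 0].sum()) / (2 * n)  # pooled accept rate

# Of the papers committee 1 accepted, how many did committee 2 reject?
flip_given_accept = table[0, 1] / table[0].sum()

# Baseline: disagreement if both committees accepted at random at this rate.
random_disagree = 2 * accept_rate * (1 - accept_rate)

print(f"disagreement: {disagree:.1%}  accept rate: {accept_rate:.1%}")
print(f"P(other committee rejects | accepted): {flip_given_accept:.1%}")
print(f"random-baseline disagreement: {random_disagree:.1%}")
```

The conditional view is the striking one: with counts like these, about half of the papers accepted by one committee would have been rejected by the other, even though the raw disagreement rate is only about a quarter.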

Key Findings and Methodology

  1. Subjectivity in Reviewer Scores: The analysis attributes about 50% of the variance in reviewer scores to subjective elements inherent in the reviewing process. Using a Gaussian process model to calibrate reviewer scores, the authors show that variation in subjective opinion is a major driver of inconsistency; this is the crux of the paper's findings.
  2. Impact Correlation with Reviewer Scores: For accepted papers, the paper finds no significant correlation between calibrated reviewer quality scores and the eventual citation impact of those papers, suggesting that high review scores do not reliably predict a paper's subsequent influence in the academic community.
  3. Fate of Rejected Papers: For rejected papers, by contrast, the paper does find a correlation between quality scores and eventual citation impact, indicating that reviewers are more adept at recognizing weaker submissions than at ranking strong ones. Many rejected papers were later published in prestigious venues, highlighting contributions the initial review process missed.
  4. Simulation Studies: A simulation model demonstrates the effect of subjectivity on decision consistency, supporting the hypothesis that a larger subjective component leads to lower agreement between review committees; a toy version appears in the sketch after this list.
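The following is a minimal toy version of such a simulation, not the paper's exact model: each paper has a shared "true" quality, each committee adds its own independent subjective noise, and both committees accept the top slice of their own scores. The 23% accept rate and the variance splits are assumptions for illustration:

```python
# Toy two-committee simulation (a sketch under stated assumptions, not the
# paper's exact model). Each committee scores paper i as quality_i plus its
# own subjective noise, then accepts the top ~23% of its own scores.
import numpy as np

rng = np.random.default_rng(0)
n_papers, accept_rate = 10_000, 0.23   # accept rate assumed for illustration

for subjective_frac in [0.0, 0.25, 0.5, 0.75]:
    # Split unit score variance between shared quality and subjective noise.
    quality = rng.normal(0.0, np.sqrt(1 - subjective_frac), n_papers)
    scores = [quality + rng.normal(0.0, np.sqrt(subjective_frac), n_papers)
              for _ in range(2)]        # two independent committees
    accept = [s >= np.quantile(s, 1 - accept_rate) for s in scores]
    disagree = np.mean(accept[0] != accept[1])
    print(f"subjective variance {subjective_frac:.0%}: "
          f"disagreement {disagree:.1%}")
```

In this sketch, a 50% subjective share of score variance yields decision disagreement in the same rough range as the ~25% observed in the experiment, which is the qualitative point the simulation studies make.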

Implications and Future Directions

The findings reaffirm the challenges of the peer review process, particularly the difficulty of predicting a paper's long-term impact from reviewer quality scores alone. The paper advocates reforming scoring methodologies toward clearer, multi-dimensional criteria for assessing submissions. Separating scores into distinct categories (e.g., originality, clarity, rigour, and significance) could capture broader aspects of a contribution and improve decision consistency.
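As a purely hypothetical illustration of that suggestion (the paper does not prescribe a concrete form, scale, or weighting), a multi-dimensional review record might look like this:

```python
# Hypothetical multi-dimensional review form (a sketch of the paper's
# suggestion, not anything the paper specifies): scores are recorded per
# criterion rather than as one overall number, and the aggregation rule
# is made explicit instead of being left to each reviewer's intuition.
from dataclasses import dataclass

@dataclass
class Review:
    originality: int   # each on an assumed 1-5 scale
    clarity: int
    rigour: int
    significance: int

    def overall(self, weights=(0.3, 0.2, 0.25, 0.25)) -> float:
        """Weighted aggregate; these weights are illustrative only."""
        parts = (self.originality, self.clarity, self.rigour, self.significance)
        return sum(w * s for w, s in zip(weights, parts))

r = Review(originality=4, clarity=3, rigour=5, significance=4)
print(f"overall: {r.overall():.2f}")
```

Making the per-criterion scores and the aggregation rule explicit is the design point: disagreements can then be localized to a dimension rather than hidden inside a single opaque number.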

Additionally, the paper raises important considerations about the role of top-tier conference publications in evaluating researcher quality, cautioning against over-reliance on these metrics due to potential inconsistencies in peer review.

This reassessment of the NeurIPS experiment underscores the ongoing need for innovation in conference reviewing processes so that they better align with the diverse aims of scientific inquiry. As the machine learning community continues to expand, robust review methodologies can help foster both equitable and impactful dissemination of research. The NeurIPS 2021 program chairs' repetition of the experiment, alongside new experiments, marks a promising step toward refining peer review in the field.
