
Estimating the Causal Effect of Early ArXiving on Paper Acceptance (2306.13891v2)

Published 24 Jun 2023 in cs.CL

Abstract: What is the effect of releasing a preprint of a paper before it is submitted for peer review? No randomized controlled trial has been conducted, so we turn to observational data to answer this question. We use data from the ICLR conference (2018–2022) and apply methods from causal inference to estimate the effect of arXiving a paper before the reviewing period (early arXiving) on its acceptance to the conference. In principle, the causal effect can be estimated by adjusting for confounders such as topic, authors, and quality. However, since quality is a challenging construct to estimate, we use the negative outcome control method, with paper citation count as a control variable to debias the quality confounding effect. Our results suggest that early arXiving may have a small effect on a paper's chances of acceptance. However, this effect, where it exists, does not differ significantly across groups of authors, as grouped by author citation count and institute rank. This suggests that early arXiving does not provide an advantage to any particular group.
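To make the debiasing idea concrete, here is a minimal sketch of negative outcome control on an additive scale, using synthetic data only. It is not the paper's implementation. The sketch assumes an unobserved confounder (standing in for quality) that shifts the outcome and the control variable by the same amount, a control variable that the treatment does not affect, and a continuous outcome rather than a binary accept/reject decision. All names (`simulate`, `diff_in_means`, `noc_estimate`) and all numeric constants are hypothetical.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=0.05, seed=0):
    """Generate synthetic (treated, outcome, control) rows.

    Assumption: the hidden confounder u enters the outcome and the
    negative control with the same coefficient (additive equi-confounding).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = rng.gauss(0, 1)  # unobserved confounder, e.g. paper quality
        # Units with higher u are more likely to be treated.
        p_treat = 0.7 if u > 0 else 0.3
        treated = 1 if rng.random() < p_treat else 0
        # Outcome depends on the confounder AND on the treatment.
        outcome = 0.3 * u + true_effect * treated + rng.gauss(0, 0.1)
        # Negative control depends on the confounder but NOT on the treatment.
        control = 0.3 * u + rng.gauss(0, 0.1)
        rows.append((treated, outcome, control))
    return rows


def diff_in_means(rows, column):
    """Mean of `column` among treated rows minus mean among untreated rows."""
    treated = [r[column] for r in rows if r[0] == 1]
    untreated = [r[column] for r in rows if r[0] == 0]
    return mean(treated) - mean(untreated)


def noc_estimate(rows):
    """Return (naive, bias, corrected) effect estimates.

    naive:     raw treated-vs-untreated gap in the outcome (confounded).
    bias:      the same gap in the negative control, which under the
               assumptions above is driven by confounding alone.
    corrected: naive minus bias.
    """
    naive = diff_in_means(rows, 1)
    bias = diff_in_means(rows, 2)
    return naive, bias, naive - bias


if __name__ == "__main__":
    naive, bias, corrected = noc_estimate(simulate())
    print(f"naive={naive:.3f} bias={bias:.3f} corrected={corrected:.3f}")
```

In this toy setup the naive gap overstates the effect because treated units have higher hidden quality on average; subtracting the gap observed in the control variable removes that shared component. The correction is only as good as the two assumptions it rests on, so a real analysis would need to argue for them explicitly.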
