
An analysis of the effects of sharing research data, code, and preprints on citations (2404.16171v2)

Published 24 Apr 2024 in cs.DL

Abstract: Calls to make scientific research more open have gained traction with a range of societal stakeholders. Open Science practices include but are not limited to the early sharing of results via preprints and openly sharing outputs such as data and code to make research more reproducible and extensible. Existing evidence shows that adopting Open Science practices has effects in several domains. In this study, we investigate whether adopting one or more Open Science practices leads to significantly higher citations for an associated publication, which is one form of academic impact. We use a novel dataset known as Open Science Indicators, produced by PLOS and DataSeer, which includes all PLOS publications from 2018 to 2023 as well as a comparison group sampled from the PMC Open Access Subset. In total, we analyze circa 122'000 publications. We calculate publication and author-level citation indicators and use a broad set of control variables to isolate the effect of Open Science Indicators on received citations. We show that Open Science practices are adopted to different degrees across scientific disciplines. We find that the early release of a publication as a preprint correlates with a significant positive citation advantage of about 20.2% on average. We also find that sharing data in an online repository correlates with a smaller yet still positive citation advantage of 4.3% on average. However, we do not find a significant citation advantage for sharing code. Further research is needed on additional or alternative measures of impact beyond citations. Our results are likely to be of interest to researchers, as well as publishers, research funders, and policymakers.

Authors (7)
  1. Giovanni Colavizza (36 papers)
  2. Lauren Cadwallader (1 paper)
  3. Marcel LaFlamme (1 paper)
  4. Grégory Dozot (1 paper)
  5. Stéphane Lecorney (1 paper)
  6. Daniel Rappo (1 paper)
  7. Iain Hrynaszkiewicz (2 papers)
Citations (4)

Summary

  • The paper finds that releasing a publication as a preprint correlates with a citation advantage of about 20.2% on average.
  • The paper applies linear regression to approximately 122,000 publications (all PLOS publications from 2018 to 2023 plus a comparison group sampled from the PMC Open Access Subset) to evaluate Open Science practices while controlling for confounding factors.
  • The paper finds that sharing data in an online repository correlates with a smaller citation advantage of about 4.3%, whereas code sharing shows no significant advantage.

Analysis of the Impact of Sharing Research Data, Code, and Preprints on Citations

The paper "An analysis of the effects of sharing research data, code, and preprints on citations" is an empirical investigation of Open Science practices and their correlation with academic impact, as measured by citation counts. Using a dataset known as Open Science Indicators (OSI), the authors analyzed approximately 122,000 publications, comprising all PLOS publications from 2018 to 2023 and a comparison group sampled from the PMC Open Access Subset, to assess whether Open Science practices are associated with higher citations.

Methodology

Open Science practices, such as early dissemination of findings through preprints and the open sharing of data and code, have been advocated as ways to enhance research visibility and reproducibility. The paper uses the OSI dataset produced by PLOS and DataSeer, which covers PLOS publications together with a comparison group sampled from the PMC Open Access Subset. Within a linear regression framework, it estimates the association of preprints, data sharing, and code sharing with the citation count of a publication, while controlling for confounders such as the number of authors, the number of references, and journal-specific variables.
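The regression setup described above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration: the variable names, the simulated effect sizes, the log-scale outcome, and the ordinary least squares solver are not taken from the paper, whose actual model specification and controls may differ.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate(n=5000, seed=0):
    """Generate hypothetical publications; all effect sizes are made up."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        preprint = 1.0 if rng.random() < 0.30 else 0.0
        data_repo = 1.0 if rng.random() < 0.25 else 0.0
        code = 1.0 if rng.random() < 0.15 else 0.0
        n_authors = float(rng.randint(1, 12))
        n_refs = float(rng.randint(10, 80))
        # Hypothetical outcome on a log scale, e.g. log(1 + citations).
        log_citations = (1.0 + 0.18 * preprint + 0.04 * data_repo
                         + 0.03 * n_authors + 0.01 * n_refs
                         + rng.gauss(0.0, 0.5))
        rows.append([1.0, preprint, data_repo, code, n_authors, n_refs])
        y.append(log_citations)
    return rows, y


NAMES = ["intercept", "preprint", "data_repo", "code", "n_authors", "n_refs"]
rows, y = simulate()
beta = fit_ols(rows, y)
for name, coef in zip(NAMES, beta):
    print(f"{name:>10}: {coef:+.3f}")
```

Because the control variables enter the same model as the Open Science indicators, each indicator's coefficient is estimated with the controls held fixed, which is the sense in which the regression "controls for" them.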

Key Findings

The paper finds a statistically significant positive association between citation counts and both preprint posting and data sharing. Specifically:

  • Preprints: Publications released as preprints show a citation advantage of approximately 20.2% on average. This finding is consistent with existing literature on the role of preprints in increasing the visibility and dissemination of research during the peer-review process.
  • Data Sharing: Articles that share data in an online repository show a smaller yet positive citation advantage of about 4.3% on average. While prior studies have identified significant benefits tied to data sharing, this paper indicates that the effect depends on the method of sharing, with repositories faring better than other modes.
  • Code Sharing: Surprisingly, the paper does not find a significant citation advantage for sharing code, a divergence from some previous findings. This may reflect discipline-specific preferences or intrinsic differences in how shared code is used and cited relative to shared data.
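Percentage advantages like those above are commonly derived from regression coefficients on a log-transformed outcome. The sketch below shows that conversion under the assumption, not confirmed by this summary, that the dependent variable is the logarithm of citations; the function names are hypothetical.

```python
import math


def coefficient_to_percent(beta):
    """Percent change in the outcome implied by a coefficient in a log-linear model."""
    return (math.exp(beta) - 1.0) * 100.0


def percent_to_coefficient(percent):
    """Inverse mapping: the log-scale coefficient implied by a percent change."""
    return math.log(1.0 + percent / 100.0)


# Log-scale coefficients that would correspond to the reported advantages,
# if the model were log-linear (an assumption made for this illustration).
for label, pct in [("preprint", 20.2), ("data in repository", 4.3)]:
    beta = percent_to_coefficient(pct)
    print(f"{label}: beta = {beta:.4f}, percent = {coefficient_to_percent(beta):.1f}")
```

For small coefficients the percent change is close to 100 times the coefficient, but the gap widens as the coefficient grows, so the exponential form is the safer conversion.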

Implications and Future Directions

The implications extend to researchers, publishers, research funders, and policymakers. For researchers, the paper points to measurable citation advantages associated with Open Science practices, particularly posting preprints and depositing data in repositories, which may in turn raise the visibility of their work. For institutional policymakers, the findings support building structures that encourage these practices and integrating them into research assessment frameworks.

The authors acknowledge certain limitations, such as the dataset's PLOS-centric nature, which may affect generalizability. Furthermore, the observational design precludes causal inference, so the reported advantages should be read as correlations. Future work should explore temporal trends in Open Science adoption, potential biases across disciplines, and alternative impact measures beyond citations.

Conclusion

Overall, this paper contributes to the scholarly understanding of Open Science's quantifiable benefits, specifically regarding citation advantages associated with preprints and data sharing. As the landscape of scholarly communication continues to evolve, ongoing monitoring of Open Science practices will be integral to shaping science policy and promoting a more transparent, accessible research ecosystem. By encouraging wider adoption and assessing varied impacts, we can further the discussion on how Open Science can effectively reshape academic and societal engagement with research.