- The paper shows that Wikipedia's citation patterns closely track the total citation counts reported in the Journal Citation Reports (JCR), which supports its credibility for scientific content.
- The study employs Perl-based regular expression matching to reveal over-citation of high-impact journals and field-specific biases.
- The findings suggest that free access to articles may raise a journal's citation frequency in Wikipedia, and that structured citation templates make citation-based quality assessment feasible.
Scientific Citations in Wikipedia
Overview
Finn Arup Nielsen's paper, "Scientific citations in Wikipedia," presents an empirical analysis of the quality and consistency of scientific citations in Wikipedia. It is motivated by Wikipedia's growing prominence as a global information resource and the ongoing scrutiny of the reliability of its content. The paper takes a quantitative approach: it counts outbound links from Wikipedia articles to scientific journal articles and compares those counts against bibliometric statistics from the Journal Citation Reports (JCR).
Methodology
The study uses Perl scripts based on regular expression matching to extract journal titles from the cite journal templates embedded in Wikipedia pages. The input was an XML dump of the English Wikipedia database obtained on April 2, 2007. The study compiled citation counts for individual journals and compared them with several metrics from the JCR 2005: total citation counts, impact factors, and the number of articles.
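The extraction step described above can be sketched as follows. This is a minimal illustration in Python rather than the Perl the paper used, and it is not the author's actual code: the regular expressions, the helper names (`normalize_title`, `count_journal_citations`), and the handling of wiki markup are all assumptions. It also ignores nested templates and does not parse the XML dump itself.

```python
import re
from collections import Counter

# Assumed pattern: a {{cite journal ...}} template without nested templates.
TEMPLATE_RE = re.compile(r"\{\{\s*cite journal\b(.*?)\}\}",
                         re.IGNORECASE | re.DOTALL)

# Assumed pattern: the value of the "journal" field. A complete [[...]] link
# is consumed as a unit so that a piped link does not end the field early.
JOURNAL_RE = re.compile(r"\|\s*journal\s*=\s*((?:\[\[[^\]]*\]\]|[^|}])*)",
                        re.IGNORECASE)

# [[Target|Label]] or [[Target]]; the visible text is kept.
WIKILINK_RE = re.compile(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]")


def normalize_title(raw):
    """Strip wiki links and italic/bold quote markup, collapse whitespace."""
    title = WIKILINK_RE.sub(r"\1", raw)
    title = re.sub(r"'{2,}", "", title)
    return " ".join(title.split())


def count_journal_citations(wikitext):
    """Count journal titles found in cite journal templates in the text."""
    counts = Counter()
    for template in TEMPLATE_RE.finditer(wikitext):
        field = JOURNAL_RE.search(template.group(1))
        if field is None:
            continue  # template has no journal field
        title = normalize_title(field.group(1))
        if title:
            counts[title] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical wikitext fragment, for illustration only.
    sample = (
        "{{cite journal | author=A | journal=[[Nature (journal)|Nature]] "
        "| year=2005}} {{Cite journal|journal=''Science''|title=X}}"
    )
    for journal, n in count_journal_citations(sample).most_common():
        print(journal, n)
```

In a full pipeline, the resulting titles would still need to be matched against JCR journal names, which typically requires handling abbreviations and spelling variants.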
Results
The paper identified 30,368 outbound citations from the cite journal templates. The most cited journals were Nature (787 citations), Science (669 citations), and New England Journal of Medicine (446 citations). Among astronomy journals, Astrophysical Journal (424 citations) and Astronomy & Astrophysics (154 citations) stood out. Medical journals such as The Lancet (268 citations) and JAMA (217 citations) also featured prominently.
The correlation analysis showed high agreement between Wikipedia citation counts and the JCR's total citation counts for journals. The correlation was weaker for the JCR impact factor and for the number of articles per journal. The strongest correlation was obtained when Wikipedia counts were compared against the product of a journal's total citations and its impact factor, indicating that Wikipedia authors may overcite high-impact journals relative to the scientific literature as a whole.
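The comparison above can be illustrated with a small sketch. This summary does not state which correlation statistic the paper used, so Spearman rank correlation is an assumption here, and the function names are hypothetical. The numbers in the usage section are made-up placeholders for five unnamed journals, not values from the paper or the JCR.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Placeholder values for five hypothetical journals (NOT real data).
    wikipedia_citations = [120, 45, 60, 10, 5]
    jcr_total_citations = [90000, 50000, 20000, 8000, 3000]
    jcr_impact_factor = [30.0, 3.0, 25.0, 2.0, 4.0]
    # Combined measure: total citations multiplied by impact factor.
    product = [t * f for t, f in zip(jcr_total_citations, jcr_impact_factor)]

    print("total citations:", spearman(wikipedia_citations, jcr_total_citations))
    print("impact factor:  ", spearman(wikipedia_citations, jcr_impact_factor))
    print("product:        ", spearman(wikipedia_citations, product))
```

A rank-based statistic is a reasonable choice for citation counts because they are heavily skewed; a Pearson correlation on log-transformed counts would be another option.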
Discussion
The research highlights several implications:
- Reliability of Citations: The strong correlation between Wikipedia citations and total citations in the JCR suggests that Wikipedia can be a credible information organizer, especially for science-related content.
- Field Biases: Astronomy journals received disproportionately many citations, and Australian botany was also unusually well represented, as illustrated by journals like Nuytsia (101 citations). Conversely, internet-related journals received relatively few citations, which runs counter to the notion that Wikipedia is substantially biased towards topics favored by an "Internet-savvy" demographic.
- Access to Free Articles: Journals such as the BMJ appeared to receive more citations on Wikipedia, likely because their articles are freely available online.
- Quality Assessment: The proposed methodology relies on structured citation markup in the form of citation templates, which makes it possible to assess the quality of Wikipedia articles from their outbound citation patterns.
Future Directions
The paper indicates several avenues for further research and development:
- Enhanced Citation Tools: Reference management tools like Zotero, which supports Wikipedia citation handling, suggest that scientific citations in Wikipedia will continue to grow in number and become more consistently structured. Future research could explore how these tools affect citation accuracy and ease of referencing.
- Automated Quality Metrics: Developing algorithms that more accurately correlate Wikipedia citation data with traditional bibliometric measures could enhance the ability to assess and ensure the quality of Wikipedia articles.
- Field-Specific Studies: More granular studies on specific scientific fields and their citation patterns within Wikipedia could provide deeper insights into how certain disciplines are represented and referenced.
Conclusion
Nielsen's investigation into scientific citations on Wikipedia offers substantial quantitative evidence that enhances confidence in Wikipedia's role as an information resource for scientific content. While it reveals certain biases and areas of overcitation, the overall alignment with established scientific citation patterns supports the use of Wikipedia for background reading and information organization. The paper underscores the importance of structured citation practices, which will likely benefit future researchers seeking well-organized and credible references.