Overview and Implications of a Citation Index Coverage Study
This paper, authored by Martín-Martín et al., compares the coverage of six citation data sources: Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science (WoS), and OpenCitations’ COCI. It analyzes 3,073,351 citations to 2,515 highly-cited documents published in 2006 across 252 subject categories, thereby expanding the scope of previous coverage studies.
Methodology
Because the full bibliographic databases could not be accessed directly, the researchers worked from a seed sample drawn from Google Scholar’s Classic Papers, which ensured that the documents were highly cited and spread across many disciplines. They then matched the citations to these documents across the six data sources, using conservative matching criteria to avoid false positives.
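The idea of conservative matching can be illustrated with a small sketch. This is not the authors' actual procedure, which this summary does not detail. It assumes hypothetical records with `doi`, `title`, and `year` fields, and treats two records as the same citing document only when their DOIs agree or, if a DOI is missing on either side, when the normalized titles and the years are both identical.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(text.split())


def same_document(a: dict, b: dict) -> bool:
    """Hypothetical conservative rule: DOI first, else exact title + year."""
    doi_a = (a.get("doi") or "").strip().lower()
    doi_b = (b.get("doi") or "").strip().lower()
    if doi_a and doi_b:
        # Both DOIs are present, so they alone decide the outcome.
        return doi_a == doi_b
    title_a = normalize_title(a.get("title") or "")
    title_b = normalize_title(b.get("title") or "")
    year_a, year_b = a.get("year"), b.get("year")
    # Require a non-empty title and a known, identical year;
    # anything weaker risks a false positive.
    return bool(title_a) and title_a == title_b and year_a is not None and year_a == year_b
```

A rule this strict will miss some true matches (for example, records whose years differ by one between sources), which is the trade-off accepted when false positives are the greater concern.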
Key Results
- Google Scholar’s Dominance: Google Scholar had the most extensive coverage, finding 88% of all citations to the seed documents, as well as 89%-94% of the citations found by each of the other sources.
- Microsoft Academic and Scopus: Microsoft Academic followed Google Scholar with 60% coverage. It found more citations than Scopus and WoS in most subject categories, albeit with coverage gaps in some areas, such as Physics and some Humanities categories.
- Dimensions and COCI: Dimensions provided coverage similar to that of Scopus and exceeded WoS in many subject categories, though it had coverage gaps in some Humanities categories. COCI was the least comprehensive source, covering only 28% of all citations.
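The percentages above can be read as shares of pooled citations. The sketch below assumes that "all citations" means the union of citations found by any source, and that the pairwise figures are the fraction of one source's citations that another source also found. The function names and citation IDs are made up for illustration and are not data from the study.

```python
def coverage(found: dict[str, set[str]]) -> dict[str, float]:
    """Share of the union of all citations that each source found."""
    union = set().union(*found.values())
    return {name: len(ids) / len(union) for name, ids in found.items()}


def overlap(found: dict[str, set[str]], source: str, other: str) -> float:
    """Share of `other`'s citations that `source` also found."""
    return len(found[source] & found[other]) / len(found[other])


# Toy data: hypothetical citation IDs, not figures from the study.
toy = {
    "Google Scholar": {"c1", "c2", "c3", "c4", "c5", "c6", "c7"},
    "Scopus": {"c1", "c2", "c3", "c8"},
    "COCI": {"c1", "c2"},
}
shares = coverage(toy)
gs_of_scopus = overlap(toy, "Google Scholar", "Scopus")
```

With the toy data, Google Scholar holds 7 of the 8 pooled citations (0.875) and finds 3 of Scopus's 4 citations (0.75). The two measures answer different questions: the first ranks sources by size, the second shows how much one source would miss if it replaced another.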
Implications
Practical Applications
- Citation Analysis: Google Scholar remains the most comprehensive source for broad citation counts, although it does not provide exhaustive lists of citing sources. Microsoft Academic serves as a robust, openly available alternative for more detailed citation studies.
- Research and Accessibility: The general openness of data, as seen in Microsoft Academic and COCI, reflects a shift towards more accessible research infrastructures, which could inspire broader data sharing practices and reduce dependency on proprietary databases.
Theoretical Considerations
- Coverage Divergence: The paper shows that database coverage diverges substantially across disciplines. This variability should inform bibliometric discussions and guide the choice of database for a given research need.
- Future Directions: New forms of citation indexing, particularly those incorporating open access data like COCI, might catalyze advancements in bibliometric methodologies. The potential integration of citation links from free and open sources could pave the way for more equitable academic research ecosystems.
Conclusion
The research confirms Google Scholar's robust coverage across disciplines, highlighting its persistent place at the forefront of citation databases, although challenges such as fluctuating coverage remain. Microsoft Academic and Dimensions present viable alternatives to traditional resources like Scopus and WoS, with notable diversity in their subject-specific capabilities. Overall, this comparative paper provides critical insights for researchers into the nuanced landscape of citation metrics, encouraging informed choices for database selection tailored to specific scholarly inquiries.
Future bibliometric analyses would benefit from continued monitoring of how these databases evolve, particularly as more data sources commit to openness and interoperability, a shift that affects both the theoretical foundations and the practical applications of computational and bibliometric research.