The Google Scholar Experiment: how to index false papers and manipulate bibliometric indicators (1309.2413v1)

Published 10 Sep 2013 in cs.DL

Abstract: Google Scholar has been well received by the research community. Its promises of free, universal and easy access to scientific literature, as well as the perception that it covers the Social Sciences and the Humanities better than other traditional multidisciplinary databases, have contributed to the quick expansion of Google Scholar Citations and Google Scholar Metrics: two new bibliometric products that offer citation data at the individual level and at journal level. In this paper we show the results of an experiment undertaken to analyze Google Scholar's capacity to detect citation counting manipulation. For this, six documents were uploaded to an institutional web domain authored by a false researcher and referencing all the publications of the members of the EC3 research group at the University of Granada. Once Google Scholar detected these papers, the citations included in the Google Scholar Citations profiles of the authors surged. We discuss the effects of this surge and how it could affect the future development of such products not only at individual level but also at journal level, especially if Google Scholar persists with its lack of transparency.


An Analysis of Citation Manipulation in Google Scholar

The paper, titled "The Google Scholar Experiment: how to index false papers and manipulate bibliometric indicators" by Emilio Delgado López-Cózar, Nicolás Robinson-García, and Daniel Torres-Salinas, investigates the vulnerabilities of Google Scholar (GS) and its associated metrics to manipulation through false document submissions. This paper provides a crucial commentary on the limitations inherent in the evolving landscape of academic research evaluation tools, particularly those relying on bibliometric indicators such as citations.

Methodology and Experimentation

The authors conducted an experiment to determine the susceptibility of Google Scholar Citations to manipulation. They uploaded six falsified documents to an institutional web domain, attributing authorship to a fictitious researcher. These documents referenced all publications by members of the EC3 research group at the University of Granada. The result was a significant artificial inflation of citation counts within Google Scholar Citations profiles for these researchers, demonstrating a basic yet effective approach to inflating individual bibliometric profiles.

Key Findings

The experiment showed that Google Scholar failed to detect the manipulation, resulting in a notable increase in the citation counts and h-index values of the authors involved. The largest increase was a 7.25-fold rise in citation count for the author who had been least cited before the manipulation. The h-index, which is often used as a benchmark for academic influence, also rose for these authors. Had these citations gone unchecked, they would have distorted evaluations of bibliometric performance at both the individual and the journal level.
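To make the arithmetic behind these findings concrete, the sketch below computes an h-index and a citation fold increase before and after a simulated inflation. It is an illustration only: the citation counts in `before` are invented, not data from the paper, and the helper `inflate` assumes that each of the six false documents adds exactly one citation to every paper in a profile. The h-index follows its standard definition: the largest h such that h papers have at least h citations each.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def inflate(citations, fake_docs=6):
    """Hypothetical helper: add one citation per false document to every paper.

    Assumes each false document cites each paper exactly once.
    """
    return [count + fake_docs for count in citations]


# Invented citation counts for a lightly cited profile (not from the paper).
before = [10, 4, 3, 1, 0, 0]
after = inflate(before)

fold_increase = sum(after) / sum(before)

print("h-index before:", h_index(before))
print("h-index after: ", h_index(after))
print("citation fold increase:", fold_increase)
```

Because every paper gains the same fixed number of citations, the relative jump is largest for profiles that start with few citations, which is consistent with the least cited author showing the biggest fold increase in the experiment.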

Discussion and Implications

The implications of these findings are significant for the academic community, particularly in disciplines reliant on bibliometric evaluation for career advancement and funding. The paper highlights the ease with which GS and its derivatives, GS Citations and GS Metrics, can be exploited due to their lack of transparency and control over the types of content they index. Such vulnerabilities present ethical challenges and potential distortions in the evaluation of scholarly works.

This work raises concerns about the reliability of using Google Scholar-derived metrics for critical academic decisions. The absence of adequate safeguards against fraudulent practices poses a risk to the integrity of academic evaluation processes. The authors suggest the need for stricter controls and more transparency in the algorithmic processes that support citation indexing in GS, alongside better monitoring tools to avert such manipulations.

Future Directions

For future research, the paper urges the exploration of methods to enhance the reliability of bibliometric data as provided by GS and similar platforms. The potential for integration of more sophisticated detection algorithms, capable of identifying non-genuine citations, could serve as a measure to counteract malicious attempts at inflating bibliometric indicators. Moreover, engaging with the broader issue of academic ethics, the paper underscores the importance of fostering a culture that prioritizes integrity and accountability in research evaluation processes.

Conclusion

This paper demonstrates the capabilities and limitations of Google Scholar in handling citation manipulation, urging caution in relying solely on computerized bibliometric indicators for scholarly assessment. The findings advocate for enhanced transparency and the development of robust systems to uphold the integrity of institutional and personal academic metrics, which are critical in shaping the future landscape of academic recognition and credibility.

