
Turning the tables in citation analysis one more time: Principles for comparing sets of documents (1101.3863v2)

Published 20 Jan 2011 in cs.DL and physics.soc-ph

Abstract: We submit newly developed citation impact indicators based not on arithmetic averages of citations but on percentile ranks. Citation distributions are, as a rule, highly skewed and should not be arithmetically averaged. With percentile ranks, the citation of each paper is rated in terms of its percentile in the citation distribution. The percentile ranks approach allows for the formulation of a more abstract indicator scheme that can be used to organize and/or schematize different impact indicators according to three degrees of freedom: the selection of the reference sets, the evaluation criteria, and the choice of whether or not to define the publication sets as independent. Bibliometric data of seven principal investigators (PIs) of the Academic Medical Center of the University of Amsterdam are used as an exemplary data set. We demonstrate that the proposed indicators [R(6), R(100), R(6,k), R(100,k)] are an improvement over averages-based indicators because one can account for the shape of the distributions of citations over papers.

Citations (177)

Summary

  • The paper critiques traditional citation analysis methods, arguing against arithmetic averages of normalized data and proposing percentile ranks as a more robust alternative.
  • It highlights issues with normalizing citations using broad subject categories and advocates for normalizing each citation against a reference set before aggregation.
  • The study introduces new indicators (R(6), R(100), R(6,k), R(100,k)) based on percentile ranks to handle skewed citation data and enable flexible comparisons of document sets.

Principles for Comparing Sets of Documents: A Critique and Proposal of Percentile Rank Citation Indicators

The paper "Turning the Tables in Citation Analysis One More Time: Principles for Comparing Sets of Documents" by Leydesdorff, Bornmann, Mutz, and Opthof emerges as a significant critique of mainstream methods employed in citation analysis and proposes an alternative mechanism for evaluating citation impacts through percentile ranks. This scholarly work challenges traditional reliance on arithmetic averages for citation impact indicators and argues for a shift towards non-parametric statistics to better account for the skewness inherent in citation distributions.

Critique of Existing Indicators

The authors critique established citation indices such as the crown indicator (CPP/FCSm) of CWTS in Leiden, the NMCR of ECOOM, and the MOCR used by ISSRU. The central argument is that the traditional method of aggregating numerators (observed citations) and denominators (expected citations) separately, and only then dividing the sums, violates the mathematical order of operations and precludes proper estimation of error terms. The stronger approach, as proposed, is to normalize each citation count against its reference set first and only then aggregate, which yields a consistent and robust indicator with reliable error terms.
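To make the contrast concrete, here is a minimal Python sketch (with hypothetical citation counts, not data from the paper) comparing the two orders of operations: the ratio of sums used by crown-style indicators versus the mean of per-paper ratios that the authors endorse.

```python
# Illustrative only: two ways of field-normalizing citation counts.
# 'citations' holds observed counts per paper; 'expected' holds the mean
# citation rate of each paper's reference set (hypothetical numbers).

citations = [0, 2, 3, 15, 40]
expected = [4.0, 4.0, 8.0, 8.0, 8.0]

# Crown-indicator style: sum numerators and denominators separately,
# then divide once -- the order of operations the paper criticizes.
ratio_of_sums = sum(citations) / sum(expected)

# The authors' preferred order: normalize each paper against its
# reference set first, then average the per-paper ratios.
mean_of_ratios = sum(c / e for c, e in zip(citations, expected)) / len(citations)

print(f"ratio of sums:  {ratio_of_sums:.3f}")   # 1.875
print(f"mean of ratios: {mean_of_ratios:.3f}")  # 1.550
```

The two procedures disagree even on this toy set, because a single highly cited paper dominates the pooled sums differently than it dominates the mean of per-paper ratios.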

Beyond these methodological flaws, the authors criticize the use of the ISI subject categories for normalizing citations, since these categories often group heterogeneous journals together. Such coarse aggregation, though convenient, compromises the integrity of the normalization process.

Percentile Rank Approach

In response to these shortcomings, the paper advances percentile rank indicators as an alternative. Each paper's citation count is rated by its percentile within the citation distribution of its reference set, which avoids the pitfalls of averaging highly skewed data. The percentile rank method is not limited to overall citation rates: it also accommodates evaluation schemes that focus on top-cited papers.
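As a minimal sketch of the idea (hypothetical data; the paper also discusses how the counting rule for ties and boundary cases affects the resulting ranks), the percentile rank of a paper can be computed against its reference set as follows.

```python
# A simple percentile-rank computation against a reference set.
# The counting rule used here (strictly fewer citations) is one of
# several possible conventions discussed in the literature.

def percentile_rank(value: int, reference: list[int]) -> float:
    """Percentage of reference-set papers cited less often than `value`."""
    below = sum(1 for c in reference if c < value)
    return 100.0 * below / len(reference)

# Hypothetical citation counts of a reference set of ten papers.
reference_set = [0, 0, 1, 2, 2, 3, 5, 8, 13, 40]

print(percentile_rank(8, reference_set))   # 70.0: cited more than 7 of 10
print(percentile_rank(40, reference_set))  # 90.0: the most-cited paper
```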

Four new indicators, R(6), R(100), R(6,k), and R(100,k), are introduced. These leverage percentile ranks rather than conventional averages, giving analysts flexibility in selecting reference sets, choosing evaluation criteria, and deciding whether publication sets are treated as independent. The methodology is demonstrated using bibliometric data from seven principal investigators at the Academic Medical Center of the University of Amsterdam.
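The paper gives the precise definitions; purely as an illustration, the sketch below scores a publication set with a six-class weighting in the spirit of R(6), assuming the NSF percentile classes the paper builds on (bottom 50%, 50th-75th, 75th-90th, 90th-95th, 95th-99th, top 1%). The exact aggregation and normalization in the paper may differ from this simplified reading.

```python
# Hedged sketch of a six-class weighted score in the spirit of R(6).
# Classes and weights assume the NSF scheme; check the paper for the
# authors' exact definition.

# (lower percentile bound, weight), checked from the highest class down.
CLASSES = [(99.0, 6), (95.0, 5), (90.0, 4), (75.0, 3), (50.0, 2), (0.0, 1)]

def class_weight(percentile: float) -> int:
    """Map a paper's percentile rank to the weight of its class."""
    for lower_bound, weight in CLASSES:
        if percentile >= lower_bound:
            return weight
    return 1  # unreachable given the 0.0 bound, kept for safety

def six_class_score(percentiles: list[float]) -> float:
    """Mean class weight over a publication set (one simplified reading)."""
    return sum(class_weight(p) for p in percentiles) / len(percentiles)

# Hypothetical percentile ranks for five papers of one PI.
print(six_class_score([12.0, 48.0, 63.0, 91.5, 99.4]))  # (1+1+2+4+6)/5 = 2.8
```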

Implications and Discussion

Adopting percentile rank indicators has several implications. Non-parametric statistics circumvent the problems posed by skewed data and offer greater fidelity in citation analysis. Such measures facilitate accurate cross-comparisons of document sets and can significantly aid policy decision-making and institutional evaluations, for instance by distinguishing between "good" and "excellent" research outputs.

This approach underlines the need for robust statistical frameworks in citation analysis. By normalizing relative frequencies over the aggregated sample, the R(100,k) and R(6,k) indicators enable comparisons across sets of differing sizes and shed light on how productivity affects citation impact. The discussion also opens avenues for further work, particularly implementing these procedures in standard statistical software such as SPSS, so that citation impact analyses become more consistent and reproducible.
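As a hypothetical sketch of that size-independence (a simplified reading of the k-variants, again assuming the six NSF classes), one can express each set's class membership as relative frequencies, so that PIs with very different output volumes become directly comparable.

```python
# Relative class frequencies make sets of different sizes comparable.
# Boundaries follow the assumed six-class NSF scheme; the paper's exact
# normalization over the aggregated sample may differ.
from bisect import bisect_right
from collections import Counter

BOUNDS = [50.0, 75.0, 90.0, 95.0, 99.0]  # edges between the six classes

def class_frequencies(percentiles: list[float]) -> list[float]:
    """Share of papers in each class, from the lowest class to the top 1%."""
    counts = Counter(bisect_right(BOUNDS, p) for p in percentiles)
    return [counts.get(i, 0) / len(percentiles) for i in range(6)]

pi_a = [10.0, 55.0, 80.0, 96.0]                   # 4 papers
pi_b = [5.0, 20.0, 45.0, 60.0, 77.0, 92.0, 99.5]  # 7 papers

print(class_frequencies(pi_a))  # [0.25, 0.25, 0.25, 0.0, 0.25, 0.0]
print(class_frequencies(pi_b))  # [0.43, 0.14, 0.14, 0.14, 0.0, 0.14] (rounded)
```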

Concluding Remarks

The paper is a valuable contribution to the ongoing debate over the most effective methods for bibliometric analysis. While the proposed percentile rank indicators do not claim universal applicability across all domains, they offer a well-grounded toolset for researchers engaged in citation analysis. Future work could refine the percentile classification schemes or integrate the indicators into broader academic databases, extending the reach of citation performance evaluation.

In summary, the authors advocate for a reevaluation of citation impact measurement frameworks, challenging the status quo with a well-argued, statistically grounded approach that holds promise for better accuracy and reliability in bibliometric analysis.