Mechanism behind resolution-dependent advantage of citation-based clustering over text-based clustering

Determine why increasing the resolution parameter of the Leiden community detection algorithm raises the clustering effectiveness of citation similarity networks relative to text similarity networks in science maps. The effect is observed as higher relative accuracy and consistently positive changes in ratio-based metrics (e.g., rPurity and rICC).

Background

The paper evaluates how well different topic categories (MeSH branches) are represented in science maps created from citation and text similarity networks, using metrics such as Purity, ICC, and their citation-to-text ratios (rPurity and rICC). The authors find that a higher Resolution value (which yields smaller clusters) increases the relative effectiveness of citation-based clustering compared to text-based clustering, and they confirm a similar trend using an external visualization method by Ahlgren et al.
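To make the ratio-based metrics concrete, the following is a minimal sketch of how a citation-to-text purity ratio could be computed. It assumes the standard textbook definition of purity (the share of documents that belong to the majority topic of their cluster) and treats rPurity as the citation purity divided by the text purity. The paper's exact definitions may differ, and the function names `purity` and `purity_ratio` as well as the toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def purity(cluster_labels, topic_labels):
    """Standard clustering purity (assumed definition, not taken from the paper).

    For each cluster, count the documents that carry the cluster's most
    common topic, then divide the total by the number of documents.
    """
    if len(cluster_labels) != len(topic_labels):
        raise ValueError("cluster_labels and topic_labels must have equal length")
    if not cluster_labels:
        raise ValueError("at least one document is required")

    # Tally topic counts inside each cluster.
    topics_by_cluster = defaultdict(Counter)
    for cluster, topic in zip(cluster_labels, topic_labels):
        topics_by_cluster[cluster][topic] += 1

    majority_total = sum(max(counts.values()) for counts in topics_by_cluster.values())
    return majority_total / len(cluster_labels)


def purity_ratio(citation_clusters, text_clusters, topic_labels):
    """Citation-to-text purity ratio (assumed form of rPurity).

    Values above 1 mean the citation-based clustering matches the topic
    labels better than the text-based clustering does.
    """
    return purity(citation_clusters, topic_labels) / purity(text_clusters, topic_labels)


# Toy data: six documents, two topics, two clusterings of the same documents.
topics = ["A", "A", "A", "B", "B", "B"]
citation_clusters = [0, 0, 0, 1, 1, 1]  # separates the two topics perfectly
text_clusters = [0, 0, 1, 1, 1, 0]      # mixes one document per cluster

ratio = purity_ratio(citation_clusters, text_clusters, topics)
print(f"purity ratio (citation / text): {ratio:.3f}")
```

Repeating this calculation for clusterings obtained at several Resolution values would show whether the ratio rises as clusters shrink, which is the trend the open problem asks to explain.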

Despite consistent empirical evidence that increasing Resolution favors citation-based clustering over text-based clustering, the authors explicitly state that the causal mechanism driving this dependency is unknown. Understanding this mechanism is important for guiding users in selecting clustering configurations and for improving science map construction methods.

References

Using their data and visualization method, we found that the accuracy of citation networks relative to text networks increases as the Resolution value increases. This is in line with our results. Unfortunately, we do not know the mechanism behind this dependency.

— Which topics are best represented by science maps? An analysis of clustering effectiveness for citation and text similarity networks (2406.06454 - Bascur et al., 2024) in Subsection 6.2 (Discussion: Which topic categories have higher clustering effectiveness in citation similarity networks than in text similarity networks, and vice versa?)
