Ground truth for research interest similarity

Determine the true research interest similarity of authors and of papers, which is currently treated as an abstract, unobserved concept, to enable definitive validation and calibration of similarity measures derived from citation networks and random-walk transition probabilities.

Background

The paper introduces transition probability (TP)-based measures on citation networks to approximate similarity between scholarly entities and evaluates them through predictive tasks. While these measures provide interpretable and operational definitions of similarity, the authors emphasize that the concept of "true" research interest similarity is not directly observable, necessitating empirical proxies and experimental designs sensitive to different similarity ranges.
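To make the idea of a transition-probability measure concrete, the following is a minimal sketch, not the paper's actual definition. It assumes a toy three-paper citation network and treats the probability that a random walker following citations moves from one paper to another as a rough similarity signal. The function names `transition_matrix` and `two_step`, and the choice to let the walk stop at papers with no outgoing citations, are illustrative assumptions.

```python
def transition_matrix(adjacency):
    """Row-normalize a citation adjacency matrix into one-step transition probabilities."""
    matrix = []
    for row in adjacency:
        total = sum(row)
        # Assumption: a paper with no outgoing citations gets an all-zero row,
        # i.e. the random walk simply stops there.
        matrix.append([value / total if total else 0.0 for value in row])
    return matrix


def two_step(P):
    """Probability of reaching paper j from paper i in exactly two citation steps."""
    n = len(P)
    return [
        [sum(P[i][k] * P[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy network: paper 0 cites papers 1 and 2,
# paper 1 cites paper 2, and paper 2 cites nothing.
adjacency = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

P = transition_matrix(adjacency)
P2 = two_step(P)

print(P[0])      # one-step probabilities from paper 0
print(P2[0][2])  # two-step probability from paper 0 to paper 2
```

Scores like these are operational and easy to interpret, but nothing in the computation says whether they track the unobserved "true" similarity, which is exactly the gap this open question describes.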

This unresolved issue motivates their testing scenarios—collaboration prediction and disciplinary co-classification—as indirect evaluations of TP-based similarity metrics. Establishing a ground truth for research interest similarity would enable more rigorous validation of such measures and clearer benchmarking across methods.

References

"Continuing with the discussion of the research design, it is important to emphasize that we don't know the true research interest similarity of authors and papers, it is an abstract concept we wish to approximate."

Measuring Research Interest Similarity with Transition Probabilities (2409.18240 - Varga et al., 26 Sep 2024) in Section 4, Experimental Data and Design for Empirical Tests