Searching for superspreaders of information in real-world social media

Published 8 May 2014 in physics.soc-ph and cs.SI | (1405.1790v2)

Abstract: A number of predictors have been suggested to detect the most influential spreaders of information in online social media across various domains such as Twitter or Facebook. In particular, degree, PageRank, k-core and other centralities have been adopted to rank the spreading capability of users in information dissemination media. So far, validation of the proposed predictors has been done by simulating the spreading dynamics rather than following real information flow in social networks. Consequently, only model-dependent contradictory results have been achieved so far for the best predictor. Here, we address this issue directly. We search for influential spreaders by following the real spreading dynamics in a wide range of networks. We find that the widely-used degree and PageRank fail in ranking users' influence. We find that the best spreaders are consistently located in the k-core across dissimilar social platforms such as Twitter, Facebook, Livejournal and scientific publishing in the American Physical Society. Furthermore, when the complete global network structure is unavailable, we find that the sum of the nearest neighbors' degree is a reliable local proxy for user's influence. Our analysis provides practical instructions for optimal design of strategies for "viral" information dissemination in relevant applications.





Summary

The paper shows that k-core centrality is a more reliable predictor of superspreaders in real-world networks than traditional metrics.
It employs comprehensive datasets and BFS techniques to trace actual information cascades across platforms like Twitter and Facebook.
The study proposes using the sum of nearest neighbors' degrees as a local proxy when full network data is unavailable.

The paper by Sen Pei, Lev Muchnik, Jose S. Andrade, Jr., Zhiming Zheng, and Hernan A. Makse is an extensive empirical study of how to identify influential spreaders (superspreaders) of information in real-world social networks. Metrics such as degree, PageRank, and betweenness centrality have dominated the discussion of influencer detection on online platforms like Twitter and Facebook. The paper challenges the efficacy of these traditional measures and reports that k-core centrality predicts influential nodes more reliably.

Core Findings

The paper's primary critique is that the predictive quality of measures like degree and PageRank has so far been validated on simulated spreading models rather than on real-world diffusion, which yields model-dependent and mutually contradictory results. By tracing actual information flows within networks, the authors find that k-core centrality identifies influential spreaders more reliably than either degree or PageRank. This finding holds across dissimilar platforms: Twitter, Facebook, Livejournal, and scientific publishing in the American Physical Society.

A notable empirical finding is that nodes with the highest k-core scores had the largest influence, measured by the size of the information cascades they triggered, which qualifies them as superspreaders. Moreover, when full network data is unavailable, the sum of the nearest neighbors' degrees serves as a reliable local proxy for a user's influence, with performance close to that of the k-core approach.
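The two predictors named above can be illustrated with a minimal Python sketch. It assumes an undirected graph stored as a dictionary of neighbor sets and uses the standard peeling procedure for k-shell decomposition; the function names, the toy edge list, and the node labels are illustrative assumptions, not the authors' code or data.

```python
from collections import defaultdict


def k_shell_index(adj):
    """Return {node: k-shell index} for an undirected graph {node: set(neighbors)}."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    shell = {}
    k = 0
    while remaining:
        # Nodes whose remaining degree is at most k belong to shell k.
        peel = [v for v in remaining if degree[v] <= k]
        if not peel:
            k += 1
            continue
        while peel:
            v = peel.pop()
            if v not in remaining:
                continue  # already removed via another neighbor
            remaining.discard(v)
            shell[v] = k
            for u in adj[v]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] <= k:
                        peel.append(u)
    return shell


def neighbor_degree_sum(adj):
    """Return {node: sum of the degrees of its nearest neighbors}."""
    return {v: sum(len(adj[u]) for u in adj[v]) for v in adj}


# Toy graph (hypothetical): a triangle a-b-c with a tail c-d-e,
# plus a hub h attached to four leaves.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"),
         ("h", "l1"), ("h", "l2"), ("h", "l3"), ("h", "l4")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

shell = k_shell_index(adj)
ksum = neighbor_degree_sum(adj)
print(shell["h"], shell["a"])  # hub h: highest degree but shell 1; a: shell 2
print(ksum["h"], ksum["c"])    # 4 for the hub, 6 for node c
```

In this toy graph the hub `h` has the largest degree yet sits in the outermost shell, while the triangle nodes form the innermost core. The neighbor-degree sum needs only information up to two steps from each node, which is why it can be computed without the global structure.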

Methodology

The authors studied real spreading dynamics, collecting comprehensive datasets: Livejournal posts, Facebook wall interactions, Twitter mention networks, and APS journal citations. Each dataset provides both the social graph and a complete record of diffusion within a specified period. The authors used breadth-first search (BFS) to reconstruct the paths along which information propagated, and quantified each node's influence by the size of its influence region, that is, the set of nodes its information reached.
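The BFS step can be sketched as follows. This is a minimal illustration that assumes diffusion records are available as directed `(source, adopter)` pairs and that influence is counted as the number of distinct nodes reached, excluding the origin; the record format, function name, and toy links are assumptions, not the paper's exact procedure.

```python
from collections import defaultdict, deque


def influence_size(diffusion_links, origin):
    """Count nodes reachable from `origin` along directed diffusion links."""
    children = defaultdict(list)
    for src, dst in diffusion_links:
        children[src].append(dst)

    visited = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for nxt in children[node]:
            if nxt not in visited:  # guard against cycles and repeated adopters
                visited.add(nxt)
                queue.append(nxt)
    return len(visited) - 1  # exclude the origin itself


# Hypothetical diffusion records: information passed from the first user to the second.
links = [("u1", "u2"), ("u1", "u3"), ("u2", "u4"), ("u4", "u1"), ("u5", "u6")]
sizes = {u: influence_size(links, u) for u in ["u1", "u3", "u4", "u5"]}
print(sizes)
```

The visited set keeps each node from being counted twice, so cycles in the diffusion records (here `u4` passing information back to `u1`) do not inflate the influence region.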

Practical and Theoretical Implications

The findings challenge the prevailing assumption that degree and PageRank are suitable for identifying key influencers in real social networks. The recommendation is to use k-core centrality when the global network structure is available and, when it is only partially known or unavailable, to fall back on the sum of the nearest neighbors' degrees (referred to here as k-sum) as a local alternative. This shift could directly affect strategies in domains from marketing to public health, where superspreaders play critical roles.
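One way to compare predictors against observed spreading is to rank users by each score and average the measured influence of the top-ranked group. The sketch below is a hypothetical evaluation helper, not the paper's exact protocol; the function name, the `fraction` parameter, and all numbers are made-up toy values chosen only to show the mechanics.

```python
def mean_influence_of_top(scores, influence, fraction=0.1):
    """Average observed influence of the top `fraction` of users ranked by `scores`."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    top = ranked[:n_top]
    return sum(influence[u] for u in top) / n_top


# Toy values (illustrative only, not results from the paper).
degree = {"a": 2, "b": 2, "c": 3, "h": 4, "e": 1}
neighbor_sum = {"a": 5, "b": 5, "c": 6, "h": 4, "e": 2}
observed = {"a": 5, "b": 4, "c": 6, "h": 1, "e": 0}

by_degree = mean_influence_of_top(degree, observed, fraction=0.2)
by_neighbor_sum = mean_influence_of_top(neighbor_sum, observed, fraction=0.2)
print(by_degree, by_neighbor_sum)
```

A predictor is more useful for seeding a campaign when its top-ranked group shows higher average observed influence. With real data the same comparison would be repeated for several values of `fraction`, and ties in the scores would need an explicit tie-breaking rule.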

Additionally, this study's insights underscore the variability and complexity inherent in real-world diffusion processes compared to simplified stochastic models. The superiority of k-core over degree and PageRank suggests a need for more nuanced network analysis approaches that account for hierarchical network structures in identifying key dissemination nodes.

Future Directions

Future investigations may focus on several key areas: further empirical validation of k-core centrality across even broader types of networks, integration with machine learning approaches for automated detection of superspreaders, and a deeper exploration of k-core structure's interaction with community dynamics and modularity within networks. Such advancements could refine influencer strategies further, tailoring more effective viral marketing, information flow, and contagion control across diverse societal sectors.

In conclusion, this paper adds to the toolkit for identifying influential nodes in social networks. Its empirical approach and cross-platform analysis support a move away from validation on simplified simulated models and toward measures that reflect the hierarchical structure of real networks.
