
Information diffusion assumptions can distort our understanding of social network dynamics (2410.21554v3)

Published 28 Oct 2024 in cs.SI

Abstract: To analyze the flow of information online, experts often rely on platform-provided data from social media companies, which typically attribute all resharing actions to an original poster. This obscures the true dynamics of how information spreads online, as users can be exposed to content in various ways. While most researchers analyze data as it is provided by the platform and overlook this issue, some attempt to infer the structure of these information cascades. However, the absence of ground truth about actual diffusion cascades makes verifying the efficacy of these efforts impossible. This study investigates the implications of the common practice of ignoring reconstruction altogether. Two case studies involving data from Twitter and Bluesky reveal that reconstructing cascades significantly alters the identification of influential users, therefore affecting downstream analyses in general. We also propose a novel reconstruction approach that allows us to evaluate the effects of different assumptions made during the cascade inference procedure. Analysis of the diffusion of over 40,000 true and false news stories on Twitter reveals that the assumptions made during the reconstruction procedure drastically distort both microscopic and macroscopic properties of cascade networks. This work highlights the challenges of studying information spreading processes on complex networks and has significant implications for the broader study of digital platforms.


Summary

  • The paper reveals that naive cascade reconstruction misidentifies influential nodes, with influence correlations on Twitter as low as 0.19.
  • The study uses case studies from Twitter and Bluesky, analyzing over 40,000 cascades to compare naive versus reconstructed network properties.
  • The paper introduces a novel Probabilistic Diffusion Inference method that enables flexible, assumption-transparent reconstruction of information cascades.

Overview of the Paper: Information Diffusion Assumptions and Social Network Dynamics

This paper, authored by Matthew R. DeVerna and colleagues, provides a detailed investigation into the assumptions underpinning information diffusion models in social networks, particularly highlighting the potential distortions these assumptions introduce in understanding social network dynamics. The work critiques the prevalent reliance on platform-provided data, which often attribute all resharing activities to an original poster, thus presenting a skewed perspective of information spread.

Key Contributions and Analysis

The authors undertake two case studies, using data from Twitter and Bluesky, to show how strongly the reconstruction of diffusion cascades affects which users are identified as influential on a platform. The paper not only demonstrates the substantial inaccuracies that arise from ignoring cascade reconstruction but also proposes a novel Probabilistic Diffusion Inference (PDI) approach. This method provides a parametric and flexible structure for inferring potential cascade trees, allowing a nuanced evaluation of the different assumptions typically made during cascade inference.
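
To make the contrast between naive attribution and probabilistic reconstruction concrete, here is a minimal Python sketch. It is not the paper's PDI method, whose details are not given in this summary. The function names, the weighting formula, and the parameters `recency_weight` and `popularity_weight` are all hypothetical, chosen only to illustrate how a parametric procedure can sample a parent for each resharer from the users who shared earlier.

```python
import random


def naive_cascade(reshares):
    """Attribute every reshare to the original poster (a star-shaped cascade).

    `reshares` is a time-sorted list of (user, timestamp, follower_count)
    tuples whose first entry is the original post. Users are assumed unique.
    """
    root = reshares[0][0]
    return {user: root for user, _, _ in reshares[1:]}


def sample_cascade(reshares, recency_weight=1.0, popularity_weight=1.0, seed=None):
    """Sample one possible cascade tree as a child -> parent mapping.

    Hypothetical weighting (an assumption, not the paper's formula): an earlier
    sharer is a more likely parent if they shared recently and have many
    followers. The two exponents control how much each assumption matters.
    """
    rng = random.Random(seed)
    parent_of = {}
    for i in range(1, len(reshares)):
        user, t, _ = reshares[i]
        candidates = reshares[:i]  # only earlier sharers can be the parent
        weights = [
            (1.0 / (1.0 + (t - ct))) ** recency_weight
            * (1.0 + followers) ** popularity_weight
            for _, ct, followers in candidates
        ]
        parent_of[user] = rng.choices(candidates, weights=weights, k=1)[0][0]
    return parent_of


# Toy data: (user, seconds since original post, follower count)
reshares = [("root", 0, 1000), ("u1", 5, 10), ("u2", 7, 200), ("u3", 20, 5)]
print(naive_cascade(reshares))
print(sample_cascade(reshares, seed=42))
```

Setting an exponent to 0 switches the corresponding assumption off, which is one simple way to examine how sensitive downstream measures are to each assumption.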

In analyzing over 40,000 true and false news cascades on Twitter, the research reveals that the assumptions adopted during cascade reconstruction significantly alter both microscopic and macroscopic properties of cascade networks. Specifically, the authors show that reconstruction changes which nodes are judged influential, redistributes perceived user impact, and shifts structural properties of cascades such as depth and virality.

Numerical Results and Findings

One of the salient results is the low concordance between influence measures derived from naive versus reconstructed networks. For instance, Spearman's rank correlations of node strength between naive and reconstructed networks are remarkably low, especially for Twitter (as low as 0.19). This indicates profound changes in the identification of influential nodes. Moreover, the Jaccard similarity indices show substantial divergence in the composition of top influential nodes when reconstructed and naive networks are compared, underscoring a possible misclassification of influential users.
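
The two comparison measures named above can be sketched in a few lines of Python. This is an illustrative implementation using only the standard library. The toy strength values are invented for the example and are not data from the paper, and "node strength" is assumed here to be a per-user number such as the count of reshares attributed to that user.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks.

    Undefined (division by zero) if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def jaccard_top_k(strength_a, strength_b, k):
    """Jaccard similarity between the top-k users of two strength mappings."""
    top_a = set(sorted(strength_a, key=strength_a.get, reverse=True)[:k])
    top_b = set(sorted(strength_b, key=strength_b.get, reverse=True)[:k])
    return len(top_a & top_b) / len(top_a | top_b)


# Invented toy strengths for the same users in the two networks
naive = {"a": 9, "b": 1, "c": 0, "d": 2}
reconstructed = {"a": 3, "b": 4, "c": 2, "d": 1}
users = sorted(naive)
print(spearman([naive[u] for u in users], [reconstructed[u] for u in users]))
print(jaccard_top_k(naive, reconstructed, k=2))
```

A Spearman value near 1 means both networks rank users almost identically, and a Jaccard value near 1 means both networks select nearly the same top-k users.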

On a broader level, the paper outlines how different reconstruction heuristics yield markedly diverse network structures. The structural virality, for one, is demonstrated to vary significantly with the assumptions employed during cascade reconstruction, with findings indicating critical dependencies on underlying methodological parameters.
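
As an illustration of why structural virality depends on the reconstructed tree, the sketch below uses the common definition of structural virality as the mean shortest-path distance over all pairs of nodes in a cascade tree. The child-to-parent dictionary format and the two toy cascades are assumptions made for this example.

```python
from collections import defaultdict, deque


def structural_virality(parent_of):
    """Mean shortest-path distance over all pairs of distinct nodes in a tree.

    `parent_of` maps each non-root node to its parent in the cascade tree.
    """
    adj = defaultdict(set)
    for child, parent in parent_of.items():
        adj[child].add(parent)
        adj[parent].add(child)
    nodes = list(adj)
    n = len(nodes)
    if n < 2:
        raise ValueError("need at least two nodes")
    total = 0
    for src in nodes:
        # Breadth-first search gives distances from src to every other node
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    # total counts each unordered pair twice, so divide by n * (n - 1)
    return total / (n * (n - 1))


# The same five users arranged as a naive star and as a chain
star = {"a": "root", "b": "root", "c": "root", "d": "root"}
chain = {"a": "root", "b": "a", "c": "b", "d": "c"}
print(structural_virality(star))   # 1.6
print(structural_virality(chain))  # 2.0
```

The same set of users yields a lower value when every reshare is attributed to the original poster than when reshares are chained, so the measure is tied directly to the parent assignments that a reconstruction procedure produces.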

Implications and Future Directions

This paper has profound implications for the broader study of digital platforms, where understanding the nuances of information spread is critical. It raises vital questions about current methodologies in computational social science, stressing the need for more robust and assumption-transparent approaches to studying social media and information diffusion.

The introduction of the PDI approach presents a promising avenue for future research, enabling researchers to customize assumptions and explore diverse dynamics of information diffusion. There is also an implicit call for collaboration with social media platforms to obtain ground-truth data which can validate and refine these diffusion inference models.

The findings encourage further investigation into which specific analyses are most sensitive to reconstruction assumptions and call for efforts to ensure the robustness of results across varying methodological frameworks.

In conclusion, the paper serves as a critical reminder of the complexities inherent in studying the dynamism of information diffusion across social networks and calls for methodological rigor in accounting for these complexities to foster a nuanced understanding of digital social dynamics.