
The Use of Web Archives in Disinformation Research (2306.10004v1)

Published 16 Jun 2023 in cs.DL

Abstract: In recent years, journalists and other researchers have used web archives as an important resource for their study of disinformation. This paper provides several examples of this use and also brings together some of the work that the Old Dominion University Web Science and Digital Libraries (WS-DL) research group has done in this area. We will show how web archives have been used to investigate changes to webpages, study archived social media including deleted content, and study known disinformation that has been archived.
