A Public Dataset Tracking Social Media Discourse about the 2024 U.S. Presidential Election on Twitter/X (2411.00376v1)
Abstract: In this paper, we introduce the first release of a large-scale dataset capturing discourse on X (a.k.a. Twitter) related to the upcoming 2024 U.S. Presidential Election. Our dataset comprises 22 million publicly available posts on X.com, collected from May 1, 2024, to July 31, 2024, using a custom-built scraper, which we describe in detail. By employing targeted keywords linked to key political figures, events, and emerging issues, we aligned data collection with the election cycle to capture evolving public sentiment and the dynamics of political engagement on social media. This dataset offers researchers a robust foundation to investigate critical questions about the influence of social media in shaping political discourse, the propagation of election-related narratives, and the spread of misinformation. We also present a preliminary analysis that highlights prominent hashtags and keywords within the dataset, offering initial insights into the dominant themes and conversations occurring in the lead-up to the election. Our dataset is available at: https://github.com/sinking8/usc-x-24-us-election