ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks (2004.05861v4)

Published 13 Apr 2020 in cs.CL, cs.IR, and cs.SI

Abstract: In this paper, we present ArCOV-19, an Arabic COVID-19 Twitter dataset that spans one year, covering the period from 27 January 2020 to 31 January 2021. ArCOV-19 is the first publicly available Arabic Twitter dataset covering the COVID-19 pandemic that includes about 2.7M tweets alongside the propagation networks of the most popular subset of them (i.e., the most retweeted and most liked). The propagation networks include both retweets and conversational threads (i.e., threads of replies). ArCOV-19 is designed to enable research in several domains, including natural language processing, information retrieval, and social computing. Preliminary analysis shows that ArCOV-19 captures rising discussions associated with the first reported cases of the disease as they appeared in the Arab world. In addition to the source tweets and propagation networks, we also release the search queries and the language-independent crawler used to collect the tweets, to encourage the curation of similar datasets.

Authors (4)
  1. Fatima Haouari (9 papers)
  2. Maram Hasanain (24 papers)
  3. Reem Suwaileh (10 papers)
  4. Tamer Elsayed (22 papers)
Citations (66)