Towards Automatic Bot Detection in Twitter for Health-related Tasks (1909.13184v1)

Published 29 Sep 2019 in cs.CL and cs.SI

Abstract: With the increasing use of social media data for health-related research, the credibility of the information from this source has been questioned, as the posts may originate from automated accounts or "bots". While automatic bot detection approaches have been proposed, none have been evaluated on users posting health-related information. In this paper, we extend an existing bot detection system and customize it for health-related research. Using a dataset of Twitter users, we first show that the system, which was designed for political bot detection, underperforms when applied to health-related Twitter users. We then incorporate additional features and a statistical machine learning classifier to significantly improve bot detection performance. Our approach obtains an F_1 score of 0.7 for the "bot" class, an improvement of 0.339 over the original system. Our approach is customizable and generalizable for bot detection in other health-related social media cohorts.
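
For readers unfamiliar with the metric reported in the abstract, the class-specific F_1 score is the harmonic mean of precision and recall computed for one class (here, "bot"). The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `f1_score` and the counts passed to it are hypothetical and chosen only for illustration.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F_1 score for one class.

    tp: accounts correctly labeled as the class (true positives)
    fp: accounts wrongly labeled as the class (false positives)
    fn: accounts of the class that were missed (false negatives)
    """
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined),
        # so F_1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for the "bot" class; these are NOT taken from the paper.
bot_f1 = f1_score(tp=7, fp=3, fn=3)
print(round(bot_f1, 3))
```

Because the score is computed per class, a system can do well on the majority "non-bot" class and still score poorly on the "bot" class, which is why the abstract reports the "bot" class separately.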

Authors (4)
  1. Anahita Davoudi (1 paper)
  2. Ari Z. Klein (4 papers)
  3. Abeed Sarker (24 papers)
  4. Graciela Gonzalez-Hernandez (7 papers)
Citations (17)