Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
102 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

How well can machine learning predict demographics of social media users? (1702.01807v2)

Published 6 Feb 2017 in cs.SI and cs.CY

Abstract: The wide use of social media sites and other digital technologies have resulted in an unprecedented availability of digital data that are being used to study human behavior across research domains. Although unsolicited opinions and sentiments are available on these platforms, demographic details are usually missing. Demographic information is pertinent in fields such as demography and public health, where significant differences can exist across sex, racial and socioeconomic groups. In an attempt to address this shortcoming, a number of academic studies have proposed methods for inferring the demographics of social media users using details such as names, usernames, and network characteristics. Gender is the easiest trait to accurately infer, with measures of accuracy higher than 90 percent in some studies. Race, ethnicity and age tend to be more challenging to predict for a variety of reasons including the novelty of social media to certain age groups and a lack of significant deviations in user details across racial and ethnic groups. Although the endeavor to predict user demographics is plagued with ethical questions regarding privacy and data ownership, knowing the demographics in a data sample can aid in addressing issues of bias and population representation, so that existing societal inequalities are not exacerbated.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Nina Cesare (3 papers)
  2. Christan Grant (13 papers)
  3. Quynh Nguyen (25 papers)
  4. Hedwig Lee (2 papers)
  5. Elaine O. Nsoesie (8 papers)
Citations (52)