Unsupervised Discovery of Gendered Language through Latent-Variable Modeling (1906.04760v1)

Published 11 Jun 2019 in cs.CL

Abstract: Studying the ways in which language is gendered has long been an area of interest in sociolinguistics. Studies have explored, for example, the speech of male and female characters in film and the language used to describe male and female politicians. In this paper, we aim not to merely study this phenomenon qualitatively, but instead to quantify the degree to which the language used to describe men and women is different and, moreover, different in a positive or negative way. To that end, we introduce a generative latent-variable model that jointly represents adjective (or verb) choice, with its sentiment, given the natural gender of a head (or dependent) noun. We find that there are significant differences between descriptions of male and female nouns and that these differences align with common gender stereotypes: Positive adjectives used to describe women are more often related to their bodies than adjectives used to describe men.

Authors (5)
  1. Alexander Hoyle (13 papers)
  2. Lawrence Wolf-Sonkin (1 paper)
  3. Hanna Wallach (48 papers)
  4. Isabelle Augenstein (131 papers)
  5. Ryan Cotterell (226 papers)
Citations (51)