Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
38 tokens/sec
GPT-4o
59 tokens/sec
Gemini 2.5 Pro Pro
41 tokens/sec
o3 Pro
7 tokens/sec
GPT-4.1 Pro
50 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

What about em? How Commercial Machine Translation Fails to Handle (Neo-)Pronouns (2305.16051v1)

Published 25 May 2023 in cs.CL, cs.AI, and cs.CY

Abstract: As 3rd-person pronoun usage shifts to include novel forms, e.g., neopronouns, we need more research on identity-inclusive NLP. Exclusion is particularly harmful in one of the most popular NLP applications, machine translation (MT). Wrong pronoun translations can discriminate against marginalized groups, e.g., non-binary individuals (Dev et al., 2021). In this ``reality check'', we study how three commercial MT systems translate 3rd-person pronouns. Concretely, we compare the translations of gendered vs. gender-neutral pronouns from English to five other languages (Danish, Farsi, French, German, Italian), and vice versa, from Danish to English. Our error analysis shows that the presence of a gender-neutral pronoun often leads to grammatical and semantic translation errors. Similarly, gender neutrality is often not preserved. By surveying the opinions of affected native speakers from diverse languages, we provide recommendations to address the issue in future MT research.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (5)
  1. Anne Lauscher (58 papers)
  2. Debora Nozza (17 papers)
  3. Archie Crowley (2 papers)
  4. Ehm Miltersen (1 paper)
  5. Dirk Hovy (57 papers)
Citations (17)