Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words (2407.16266v1)

Published 23 Jul 2024 in cs.CL

Abstract: Gender bias has been a focal point in the study of bias in machine translation and LLMs. Existing machine translation gender bias evaluations focus primarily on male and female genders, limiting their scope. To assess gender bias, these studies typically compute the accuracy of gender pronouns, or of the masculine and feminine attributes of grammatical gender, via the stereotypes triggered by occupations or sentiment words (i.e., words with a clearly positive or negative attitude), an approach that does not extend to non-binary groups. This study presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a benchmark that assesses gender bias beyond binary gender. We also propose a novel process for evaluating gender bias based on the Emotional Attitude Score (EAS), which quantifies ambiguous attitude words. Evaluating three recent, effective open-source LLMs and one strong multilingual translation-specific model, our main observations are: (1) translation performance in non-binary gender contexts is markedly inferior in translation quality and exhibits more negative attitudes than in binary-gender contexts; (2) analysis experiments indicate that incorporating constraint context in prompts for gender identity terms can substantially reduce translation bias, although bias remains evident despite the constraints. The code is publicly available at https://github.com/pppa2019/ambGIMT.
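To make the abstract's two technical ideas concrete, here is a minimal sketch of an EAS-style comparison and a constraint-context prompt. The lexicon values, sentence pairs, function names, and prompt wording are all illustrative assumptions, not the paper's released implementation (see the linked repository for the authors' code).

```python
# Hypothetical sketch of an EAS-style comparison and a constraint-context
# prompt. The lexicon values, sentence pairs, function names, and prompt
# wording are illustrative assumptions, not the AmbGIMT release
# (see https://github.com/pppa2019/ambGIMT for the authors' code).

# Toy polarity lexicon mapping attitude words to scores in [-1, 1].
# Ambiguous attitude words sit near 0: they can be rendered more
# positively or more negatively depending on the translation chosen.
ATTITUDE_LEXICON = {
    "assertive": 0.1,   # ambiguous: reads as confident or as pushy
    "confident": 0.6,
    "pushy": -0.7,
    "reserved": -0.1,   # ambiguous: reads as calm or as cold
    "calm": 0.5,
    "cold": -0.6,
}


def emotional_attitude_score(translation: str) -> float:
    """Mean lexicon polarity of the attitude words found in a translation."""
    hits = [ATTITUDE_LEXICON[t] for t in translation.lower().split()
            if t in ATTITUDE_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def eas_gap(binary_output: str, nonbinary_output: str) -> float:
    """Positive gap => the non-binary-context output reads more negatively."""
    return (emotional_attitude_score(binary_output)
            - emotional_attitude_score(nonbinary_output))


def constrained_prompt(source: str, term: str) -> str:
    """Prepend a constraint context explaining a gender identity term."""
    return (
        f"Note: '{term}' is a non-binary pronoun; translate it with a "
        f"gender-neutral form, not a masculine or feminine one.\n"
        f"Translate into Chinese: {source}"
    )


if __name__ == "__main__":
    # Same ambiguous source attitude, rendered differently by a model
    # depending on the gender context of the subject.
    print(eas_gap("she is confident at work", "xe is pushy at work"))  # ~1.3
    print(constrained_prompt("Xe is assertive in meetings.", "xe"))
```

The design point the sketch illustrates: ambiguous attitude words carry near-zero polarity on their own, so a systematic drift toward negative renderings in non-binary contexts surfaces as a positive EAS gap, and the constraint line in the prompt is the lever the abstract reports as reducing, but not eliminating, that gap.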

Authors (6)
  1. Yijie Chen (10 papers)
  2. Yijin Liu (29 papers)
  3. Fandong Meng (174 papers)
  4. Jinan Xu (64 papers)
  5. Yufeng Chen (58 papers)
  6. Jie Zhou (687 papers)