Gender Bias in Machine Translation and The Era of Large Language Models (2401.10016v1)
Abstract: This chapter examines the role of Machine Translation in perpetuating gender bias, highlighting the challenges posed by cross-linguistic settings and statistical dependencies. A comprehensive overview of relevant existing work related to gender bias in both conventional Neural Machine Translation approaches and Generative Pretrained Transformer models employed as Machine Translation systems is provided. Through an experiment using ChatGPT (based on GPT-3.5) in an English-Italian translation context, we further assess ChatGPT's current capacity to address gender bias. The findings emphasize the ongoing need for advancements in mitigating bias in Machine Translation systems and underscore the importance of fostering fairness and inclusivity in language technologies.
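The chapter's experiment probes which grammatical gender ChatGPT assigns when translating gender-ambiguous English sentences into Italian. As a minimal illustrative sketch (not the chapter's actual protocol; the cue lists and function names are assumptions for illustration), one can classify the gender a system chose from surface cues such as articles in the Italian output:

```python
# Hypothetical sketch: classifying the grammatical gender an MT system
# chose when rendering a gender-ambiguous English subject (e.g. "the
# doctor") into Italian, using article/determiner cues. The cue sets
# below are illustrative, not exhaustive.
import re

MASCULINE_CUES = {"il", "lo", "un", "uno", "questo", "quel"}
FEMININE_CUES = {"la", "una", "questa", "quella"}

def detect_gender(italian_sentence: str) -> str:
    """Return 'masculine', 'feminine', or 'ambiguous' from article cues."""
    tokens = re.findall(r"[a-zàèéìíòóù']+", italian_sentence.lower())
    masc = sum(t in MASCULINE_CUES for t in tokens)
    fem = sum(t in FEMININE_CUES for t in tokens)
    if masc > fem:
        return "masculine"
    if fem > masc:
        return "feminine"
    return "ambiguous"

# Two possible renderings of the ambiguous source "The doctor arrived":
print(detect_gender("Il dottore è arrivato"))      # masculine
print(detect_gender("La dottoressa è arrivata"))   # feminine
```

Aggregating such labels over many translations of ambiguous sources (doctors, nurses, engineers, etc.) gives a simple count of how often a system defaults to one gender, which is the kind of skew the chapter's evaluation targets.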
- Italian proposal for non-binary and inclusive language: The schwa as a non-gender-specific ending. Journal of Gay & Lesbian Mental Health, 1–6.
- Identifying and Controlling Important Neurons in Neural Machine Translation. In Proceedings of the Seventh International Conference on Learning Representations (ICLR), New Orleans, USA.
- Gender in danger? Evaluating speech translation technology on the MuST-SHE corpus. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, pp. 6923–6933. Association for Computational Linguistics.
- Birhane, A. (2021). Algorithmic injustice: a relational ethics approach. Patterns 2(2).
- Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. In Proceedings of the Thirtieth Conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, pp. 4349–4357.
- Gender bias in word embeddings: a comprehensive analysis of frequency, syntax, and semantics. In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, pp. 156–170.
- On measuring gender bias in translation of gender-neutral pronouns. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, Florence, Italy, pp. 173–181. Association for Computational Linguistics.
- Interpreting gender bias in neural machine translation: Multilingual architecture matters. In Proceedings of the AAAI Conference on Artificial Intelligence, Volume 36, pp. 11855–11863.
- BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Volume 1 (Long and Short Papers), Minneapolis, Minnesota, USA, pp. 4171–4186.
- Equalizing Gender Biases in Neural Machine Translation with Word Embeddings Techniques. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, Florence, Italy.
- How to design translation prompts for ChatGPT: An empirical study. arXiv e-prints, arXiv:2304.
- ChatGPT perpetuates gender bias in machine translation and ignores non-gendered pronouns: Findings across Bengali and five other low-resource languages. arXiv preprint arXiv:2305.10510.
- Intrinsic bias metrics do not correlate with application bias. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pp. 1926–1940.
- Lipstick on a pig: Debiasing methods cover up systematic gender biases in word embeddings but do not remove them. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 609–614.
- Gu, W. (2023). Linguistically informed ChatGPT prompts to enhance Japanese-Chinese machine translation: A case study on attributive clauses. arXiv preprint arXiv:2303.15587.
- How good are GPT models at machine translation? A comprehensive evaluation. arXiv preprint arXiv:2302.09210.
- Is ChatGPT a good translator? A preliminary study. arXiv preprint arXiv:2301.08745.
- Europarl: A Parallel Corpus for Statistical Machine Translation. In Proceedings of The Tenth Machine Translation Summit (MT Summit 2005), Phuket, Thailand, pp. 79–86.
- Toward human-like evaluation for natural language generation with error analysis. arXiv preprint arXiv:2212.10179.
- Error analysis prompting enables human-like translation evaluation in large language models: A case study on ChatGPT. arXiv preprint arXiv:2303.13809.
- Filling Gender & Number Gaps in Neural Machine Translation with Black-box Context Injection. In Proceedings of the First Workshop on Gender Bias in Natural Language Processing, Florence, Italy.
- BLEU: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics (ACL 2002), Philadelphia, Pennsylvania, USA, pp. 311–318. Association for Computational Linguistics.
- Towards making the most of ChatGPT for machine translation. arXiv preprint arXiv:2303.13780.
- Petreski, D. and I. C. Hashim (2023). Word embeddings are biased. But whose bias are they reflecting? AI & SOCIETY 38(2), 975–982.
- A call for clarity in reporting BLEU scores. In Proceedings of the Third Conference on Machine Translation: Research Papers, Brussels, Belgium, pp. 186–191. Association for Computational Linguistics.
- Assessing gender bias in machine translation: A case study with Google Translate. Neural Computing and Applications 32, 6363–6381.
- COMET: A neural framework for MT evaluation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 2685–2702.
- A case study of natural gender phenomena in translation: A comparison of Google Translate, Bing Microsoft Translator and DeepL for English to Italian, French and Spanish. In Proceedings of the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020).
- Neural machine translation doesn’t translate gender coreference right unless you make it. In Proceedings of the Second Workshop on Gender Bias in Natural Language Processing, Barcelona, Spain (Online), pp. 35–43.
- Gender bias in machine translation. Transactions of the Association for Computational Linguistics 9, 845–874.
- BLEURT: Learning robust metrics for text generation. In Proceedings of ACL.
- Mitigating gender bias in machine translation with target gender annotations. In Proceedings of the Fifth Conference on Machine Translation, pp. 629–638.
- They, them, theirs: Rewriting with gender-neutral English. arXiv preprint arXiv:2102.06788.
- Sequence to sequence learning with neural networks. In Proceedings of Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems, Montreal, Quebec, Canada, pp. 3104–3112.
- The next generation of large language models.
- Toral, A. (2019). Post-editese: an exacerbated translationese. In Proceedings of Machine Translation Summit XVII: Research Track, pp. 273–281.
- Neutral rewriter: A rule-based and neural approach to automatic rewriting into gender neutral alternatives. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 8940–8948.
- Getting gender right in neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 3003–3008.
- gENder-IT: An annotated English-Italian parallel challenge set for cross-linguistic natural gender phenomena. In Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing, pp. 1–7.
- Machine translationese: Effects of algorithmic bias on linguistic complexity in machine translation. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, pp. 2203–2213. Association for Computational Linguistics.
- Lost in translation: Loss and decay of linguistic richness in machine translation. In Proceedings of Machine Translation Summit XVII Volume 1: Research Track, Dublin, Ireland, pp. 222–232. European Association for Machine Translation.
- Vargha, D. (2021). "Hungarian is a gender-neutral language, it has no gendered pronouns, so Google Translate automatically chooses the gender for you. Here is how everyday sexism is…" Twitter post, March 20, 2021.
- Attention is all you need. In I. Guyon, U. von Luxburg, S. Bengio, H. M. Wallach, R. Fergus, S. V. N. Vishwanathan, and R. Garnett (Eds.), Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, pp. 5998–6008.
- Document-level machine translation with large language models. arXiv preprint arXiv:2304.02210.
- Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems 35, 24824–24837.
- BERTScore: Evaluating text generation with BERT. arXiv preprint arXiv:1904.09675.
- Gender bias in coreference resolution: Evaluation and debiasing methods. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), New Orleans, Louisiana, pp. 15–20. Association for Computational Linguistics.