
Are Pretrained Multilingual Models Equally Fair Across Languages? (2210.05457v1)

Published 11 Oct 2022 in cs.CL

Abstract: Pretrained multilingual language models can help bridge the digital language divide, enabling high-quality NLP models for lower-resourced languages. Studies of multilingual models have so far focused on performance, consistency, and cross-lingual generalisation. However, with their widespread application in the wild and downstream societal impact, it is important to put multilingual models under the same scrutiny as monolingual models. This work investigates the group fairness of multilingual models, asking whether these models are equally fair across languages. To this end, we create a new four-way multilingual dataset of parallel cloze test examples (MozArt), equipped with demographic information (balanced with regard to gender and native tongue) about the test participants. We evaluate three multilingual models on MozArt -- mBERT, XLM-R, and mT5 -- and show that across the four target languages, the three models exhibit different levels of group disparity, e.g., exhibiting near-equal risk for Spanish, but high levels of disparity for German.
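The abstract frames fairness in terms of per-group risk and disparity between demographic groups. A minimal sketch of how such a comparison could be computed is shown below; the data, group labels, and function names are hypothetical illustrations, not the paper's actual evaluation code or metric definitions.

```python
from collections import defaultdict

def group_risks(records):
    """Per-group risk (error rate) from (is_correct, group) records."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for is_correct, group in records:
        totals[group] += 1
        errors[group] += 0 if is_correct else 1
    return {g: errors[g] / totals[g] for g in totals}

def disparity_gap(risks):
    """Difference between the highest and lowest group risk."""
    values = list(risks.values())
    return max(values) - min(values)

# Hypothetical cloze-test outcomes tagged with participant gender.
records = [
    (True, "female"), (False, "female"), (True, "female"),
    (True, "male"), (True, "male"), (False, "male"), (False, "male"),
]

risks = group_risks(records)
print("per-group risk:", risks)
print("disparity gap:", disparity_gap(risks))
```

Under this reading, "near-equal risk" corresponds to a gap close to zero, while "high levels of disparity" corresponds to a large gap between groups.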

Authors (2)
  1. Laura Cabello Piqueras (2 papers)
  2. Anders Søgaard (120 papers)
Citations (7)