Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
169 tokens/sec
GPT-4o
7 tokens/sec
Gemini 2.5 Pro Pro
45 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
38 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Gender Biases Unexpectedly Fluctuate in the Pre-training Stage of Masked Language Models (2211.14639v1)

Published 26 Nov 2022 in cs.CL

Abstract: Masked LLMs pick up gender biases during pre-training. Such biases are usually attributed to a certain model architecture and its pre-training corpora, with the implicit assumption that other variations in the pre-training process, such as the choices of the random seed or the stopping point, have no effect on the biases measured. However, we show that severe fluctuations exist at the fundamental level of individual templates, invalidating the assumption. Further against the intuition of how humans acquire biases, these fluctuations are not correlated with the certainty of the predicted pronouns or the profession frequencies in pre-training corpora. We release our code and data to benefit future research.

Citations (1)

Summary

We haven't generated a summary for this paper yet.