Identify the sources of GPT-3.5’s hiring-related biases
Determine the underlying factors behind the gender, racial, and other biases observed when auditing GPT-3.5 on resume-assessment and resume-generation tasks in a United States hiring context, with particular attention to the model's training data. Establish how characteristics of the training data and related components contribute to the measured disparities in assigned scores and in generated resume content.
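As context for how such disparities are measured, the following is a minimal sketch of a correspondence-style score audit: identical resumes that differ only in the name signal are scored repeatedly, and per-group mean scores are compared. The `score_resume` stub, the name lists, and the resume template are all illustrative assumptions, not the cited paper's actual protocol.

```python
import statistics
from collections import defaultdict

def score_resume(resume_text: str) -> float:
    """Placeholder for an LLM call that returns a numeric resume score.

    Swap in a real GPT-3.5 call here (e.g., prompt the model to rate the
    resume on a 1-10 scale and parse the number from its reply).
    """
    return 5.0  # constant stub so the sketch runs end to end

# Matched resumes: identical content, only the name differs. The names are
# illustrative stand-ins for the perceived-demographic signals manipulated
# in a correspondence-style audit.
BASE_RESUME = "Name: {name}\nExperience: 5 years as a software engineer..."
NAME_GROUPS = {
    "group_a": ["Emily Walsh", "Greg Baker"],
    "group_b": ["Lakisha Washington", "Jamal Jefferson"],
}

def audit_score_gap(trials_per_name: int = 20) -> dict[str, float]:
    """Return the mean score per group; differences between group means
    are the measured score disparities."""
    scores = defaultdict(list)
    for group, names in NAME_GROUPS.items():
        for name in names:
            for _ in range(trials_per_name):
                scores[group].append(score_resume(BASE_RESUME.format(name=name)))
    return {group: statistics.mean(vals) for group, vals in scores.items()}

print(audit_score_gap())
```

Under this setup, the score disparity is the gap between group means; an analogous audit of the generation task would compare features of generated resumes rather than numeric scores.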
References
While we cannot conclude what led to the biases we observed, a fundamental limitation of algorithm auditing, we encourage future work that builds and analyzes LLMs for such biases in training data and elsewhere.
— The Silicon Ceiling: Auditing GPT's Race and Gender Biases in Hiring
(arXiv:2405.04412, Armstrong et al., 7 May 2024), Discussion: Reflecting on Potential Sources of Bias