Evaluating and Mitigating Social Bias for Large Language Models in Open-ended Settings (2412.06134v1)

Published 9 Dec 2024 in cs.CL

Abstract: Current social bias benchmarks for LLMs primarily rely on pre-defined question formats like multiple-choice, limiting their ability to reflect the complexity and open-ended nature of real-world interactions. To address this gap, we extend the existing BBQ dataset by incorporating fill-in-the-blank and short-answer question types designed to evaluate biases in an open-ended setting. Our findings reveal that LLMs tend to produce responses that are more biased against certain protected attributes, such as age and socio-economic status. On the other hand, these biased outputs can serve as valuable contexts and chains of thought for debiasing. Our debiasing approach, which combines zero-shot, few-shot, and chain-of-thought prompting, can significantly reduce the level of bias to almost 0. We open-source our evaluation and debiasing code, hoping to encourage further measurement and mitigation of bias and stereotypes in LLMs.

Authors (1)

Zhao Liu
