KoBBQ: Korean Bias Benchmark for Question Answering (2307.16778v2)

Published 31 Jul 2023 in cs.CL and cs.AI

Abstract: The Bias Benchmark for Question Answering (BBQ) is designed to evaluate social biases of language models (LMs), but it is not simple to adapt this benchmark to cultural contexts other than the US because social biases depend heavily on the cultural context. In this paper, we present KoBBQ, a Korean bias benchmark dataset, and we propose a general framework that addresses considerations for cultural adaptation of a dataset. Our framework includes partitioning the BBQ dataset into three classes--Simply-Transferred (can be used directly after cultural translation), Target-Modified (requires localization in target groups), and Sample-Removed (does not fit Korean culture)--and adding four new categories of bias specific to Korean culture. We conduct a large-scale survey to collect and validate the social biases and the targets of the biases that reflect the stereotypes in Korean culture. The resulting KoBBQ dataset comprises 268 templates and 76,048 samples across 12 categories of social bias. We use KoBBQ to measure the accuracy and bias scores of several state-of-the-art multilingual LMs. The results clearly show differences in the bias of LMs as measured by KoBBQ and a machine-translated version of BBQ, demonstrating the need for and utility of a well-constructed, culturally-aware social bias benchmark.

Authors (6)

Jiho Jin
Jiseon Kim
Nayeon Lee
Haneul Yoo
Alice Oh
Hwaran Lee

Summary

Analysis and Insights on "KoBBQ: Korean Bias Benchmark for Question Answering"

The paper "KoBBQ: Korean Bias Benchmark for Question Answering" introduces a dataset and a framework for assessing social biases of language models in the context of Korean culture. It builds on the Bias Benchmark for Question Answering (BBQ), which reflects a US-centric cultural context, and describes a methodology for constructing a culturally specific dataset that captures the social biases present in Korean society.

Framework and Dataset Construction

KoBBQ is built through a structured process of cultural adaptation. The BBQ dataset is first partitioned into three classes according to how well each sample fits the Korean context: Simply-Transferred (usable directly after cultural translation), Target-Modified (requires localizing the target groups), and Sample-Removed (does not fit Korean culture). Four new categories specific to Korean society are then added: Domestic Area of Origin, Family Structure, Political Orientation, and Educational Background. The biases and their target groups were collected and validated through a large-scale survey, so that the dataset reflects stereotypes actually held in Korean culture.
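The three-way partition described above can be pictured as a label attached to each source template. The following is a minimal Python sketch of that idea, not the authors' code: the class names come from the paper, while `TemplateRecord`, its fields, the helper functions, and the toy records are hypothetical and exist only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class AdaptationClass(Enum):
    """The three classes named in the paper."""
    SIMPLY_TRANSFERRED = "Simply-Transferred"  # usable after cultural translation
    TARGET_MODIFIED = "Target-Modified"        # target groups must be localized
    SAMPLE_REMOVED = "Sample-Removed"          # does not fit Korean culture


@dataclass(frozen=True)
class TemplateRecord:
    """Hypothetical record for one source template (fields are assumptions)."""
    template_id: str
    category: str
    adaptation: AdaptationClass


def keep_for_korean_benchmark(records):
    """Drop templates labeled Sample-Removed; keep the other two classes."""
    return [r for r in records if r.adaptation is not AdaptationClass.SAMPLE_REMOVED]


def count_by_class(records):
    """Tally how many templates fall into each adaptation class."""
    return Counter(r.adaptation for r in records)


# Toy placeholder records, not taken from the real dataset.
toy_records = [
    TemplateRecord("t1", "Age", AdaptationClass.SIMPLY_TRANSFERRED),
    TemplateRecord("t2", "Religion", AdaptationClass.TARGET_MODIFIED),
    TemplateRecord("t3", "Nationality", AdaptationClass.SAMPLE_REMOVED),
]

kept = keep_for_korean_benchmark(toy_records)
print([r.template_id for r in kept])  # ['t1', 't2']
```

In this sketch the label is assigned by hand; in the paper the decision rests on human judgment and survey validation rather than on any automatic rule.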

KoBBQ comprises 268 templates and 76,048 samples across 12 categories of social bias. The paper also highlights the shortcomings of applying a machine-translated version of BBQ directly in another linguistic or cultural setting, which supports the case for human-mediated adaptation to keep the benchmark accurate and contextually relevant.
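The gap between the number of templates and the number of samples arises because one template yields many samples once its slots are filled. The sketch below shows that general mechanism in Python under stated assumptions: the `{group_a}`/`{group_b}` slot syntax, the `expand_template` helper, and the neutral placeholder groups are all hypothetical and do not reproduce KoBBQ's actual template format or content.

```python
from itertools import permutations


def expand_template(template, groups):
    """Fill the two group slots with every ordered pair of distinct groups."""
    return [
        template.format(group_a=a, group_b=b)
        for a, b in permutations(groups, 2)
    ]


# Neutral placeholder template and groups, invented for illustration.
template = "A person from {group_a} and a person from {group_b} were waiting in line."
groups = ["group X", "group Y", "group Z"]

samples = expand_template(template, groups)
print(len(samples))  # 6 ordered pairs from 3 groups
print(samples[0])
```

A real benchmark would typically vary more than the group slots (for example context type and question polarity), which multiplies the sample count further.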

Evaluation of LLMs

The paper uses KoBBQ to evaluate several state-of-the-art multilingual LLMs and compares the results with those obtained on a machine-translated BBQ. The evaluation reports two kinds of metric, accuracy and bias score, and the measured bias differs between the two benchmarks, indicating that KoBBQ surfaces biases that translation-dependent evaluation does not capture. GPT-4 achieved the highest accuracy, and all models were more accurate in disambiguated contexts than in ambiguous ones.

The diff-bias score compares how often a model gives biased answers with how often it gives counter-biased answers. By this measure, bias was markedly higher in ambiguous contexts, which suggests that models lean towards social stereotypes when the context does not determine the answer. This finding motivates further work on mitigating such biases.
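To make the two metrics concrete, here is a small Python sketch for the ambiguous-context case. It assumes each model prediction has already been labeled as "biased", "counter_biased", or "unknown", that "unknown" is the correct answer when the context is ambiguous, and that diff-bias is normalized by the total number of predictions. The function names, labels, and normalization are assumptions made for illustration; the paper's exact formulas should be consulted before reusing this.

```python
from collections import Counter

BIASED = "biased"
COUNTER_BIASED = "counter_biased"
UNKNOWN = "unknown"


def ambiguous_accuracy(predictions):
    """Share of predictions that pick 'unknown' (assumed correct here)."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    counts = Counter(predictions)
    return counts[UNKNOWN] / len(predictions)


def ambiguous_diff_bias(predictions):
    """Biased share minus counter-biased share; positive leans stereotyped."""
    if not predictions:
        raise ValueError("predictions must be non-empty")
    counts = Counter(predictions)
    return (counts[BIASED] - counts[COUNTER_BIASED]) / len(predictions)


# Toy labels, invented for illustration: 3 biased, 1 counter-biased, 6 unknown.
toy = [BIASED] * 3 + [COUNTER_BIASED] + [UNKNOWN] * 6

print(ambiguous_accuracy(toy))   # 0.6
print(ambiguous_diff_bias(toy))  # 0.2
```

Under this definition a score of 0 means biased and counter-biased answers are equally frequent, so a model can be inaccurate without being measurably biased, which is why accuracy and diff-bias are reported separately.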

Implications and Future Directions

The implications of this research are twofold. Practically, KoBBQ provides a benchmark dataset for evaluating, and supporting efforts to mitigate, social biases in language technologies used in Korean. Methodologically, the dataset-construction framework offers a roadmap for building similar bias benchmarks in other cultural contexts, which makes it relevant to cross-cultural NLP research more broadly.

Looking forward, extending this methodology to other languages, while identifying biases that are shared across cultures, could lead to more inclusive and diverse bias evaluations. Because the dataset's stereotypes were verified through a large-scale public survey, it also offers a grounded starting point for building safeguards against biased outputs in LLMs.

In summary, this research contributes to AI ethics and algorithmic fairness by offering a model for culturally sensitive dataset development, and it may inform policy-making and model training practices that need to account for cultural context.
