WinoQueer: A Community-in-the-Loop Benchmark for Anti-LGBTQ+ Bias in Large Language Models (2306.15087v2)

Published 26 Jun 2023 in cs.CL and cs.CY

Abstract: We present WinoQueer: a benchmark specifically designed to measure whether LLMs encode biases that are harmful to the LGBTQ+ community. The benchmark is community-sourced, via application of a novel method that generates a bias benchmark from a community survey. We apply our benchmark to several popular LLMs and find that off-the-shelf models generally do exhibit considerable anti-queer bias. Finally, we show that LLM bias against a marginalized community can be somewhat mitigated by finetuning on data written about or by members of that community, and that social media text written by community members is more effective than news text written about the community by non-members. Our method for community-in-the-loop benchmark development provides a blueprint for future researchers to develop community-driven, harms-grounded LLM benchmarks for other marginalized communities. Note: This version corrects a bug found in evaluation code after publication. General findings have not changed, but Tables 5 and 6 and Figure 1 have been corrected.

Authors (4)

Virginia K. Felkner
Ho-Chun Herbert Chang
Eugene Jang
Jonathan May
