
GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models (2406.13925v3)

Published 20 Jun 2024 in cs.CL and cs.AI

Abstract: LLMs are prone to generating content that exhibits gender biases, raising significant ethical concerns. Alignment, the process of fine-tuning LLMs to better align with desired behaviors, is recognized as an effective approach to mitigate gender biases. Although proprietary LLMs have made significant strides in mitigating gender bias, their alignment datasets are not publicly available. The commonly used and publicly available alignment dataset, HH-RLHF, still exhibits gender bias to some extent, and there is a lack of publicly available alignment datasets specifically designed to address gender bias. Hence, we developed a new dataset named GenderAlign, aimed at mitigating a comprehensive set of gender biases in LLMs. This dataset comprises 8k single-turn dialogues, each paired with a "chosen" and a "rejected" response. Compared to the "rejected" responses, the "chosen" responses demonstrate lower levels of gender bias and higher quality. Furthermore, we categorized the gender biases in the "rejected" responses of GenderAlign into four principal categories. The experimental results show the effectiveness of GenderAlign in reducing gender bias in LLMs.
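The abstract describes GenderAlign as a preference dataset: each single-turn dialogue pairs a "chosen" (lower-bias, higher-quality) response with a "rejected" (biased) one, the format typically consumed by preference-optimization methods such as DPO or reward modeling. The sketch below is a minimal illustration of how such a record could be represented and converted for a preference trainer; the field names, class, and example content are assumptions for illustration and are not taken from the paper or any released data files.

```python
# Minimal sketch (illustrative, not the paper's code) of a preference-pair
# record like those described in the GenderAlign abstract, and a conversion
# into the {"prompt", "chosen", "rejected"} dicts commonly expected by
# DPO-style preference-optimization trainers. All names are hypothetical.

from dataclasses import dataclass
from typing import List


@dataclass
class PreferencePair:
    prompt: str    # single-turn user query
    chosen: str    # lower-bias, higher-quality response
    rejected: str  # response exhibiting gender bias


def to_preference_format(pairs: List[PreferencePair]) -> List[dict]:
    """Convert records into plain dicts for a preference-optimization trainer."""
    return [
        {"prompt": p.prompt, "chosen": p.chosen, "rejected": p.rejected}
        for p in pairs
    ]


if __name__ == "__main__":
    # Hypothetical example pair, written here only to show the structure.
    example = PreferencePair(
        prompt="Who is better suited to be a nurse, men or women?",
        chosen=(
            "Suitability for nursing depends on individual skills and "
            "interests, not gender; both men and women can excel in the role."
        ),
        rejected="Women are naturally better suited to nursing than men.",
    )
    print(to_preference_format([example])[0])
```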

Authors (7)
  1. Tao Zhang (481 papers)
  2. Ziqian Zeng (32 papers)
  3. Yuxiang Xiao (1 paper)
  4. Huiping Zhuang (44 papers)
  5. Cen Chen (81 papers)
  6. James Foulds (17 papers)
  7. Shimei Pan (28 papers)
Citations (2)
