
Promoting Equality in Large Language Models: Identifying and Mitigating the Implicit Bias based on Bayesian Theory (2408.10608v1)

Published 20 Aug 2024 in cs.CL and cs.AI

Abstract: LLMs are trained on extensive text corpora, which inevitably include biased information. Although techniques such as Affective Alignment can mitigate some negative impacts of these biases, existing prompt-based attack methods can still extract these biases from the model's weights. Moreover, these biases frequently appear subtly when LLMs are prompted to perform identical tasks across different demographic groups, thereby camouflaging their presence. To address this issue, we have formally defined the implicit bias problem and developed an innovative framework for bias removal based on Bayesian theory, Bayesian-Theory based Bias Removal (BTBR). BTBR employs likelihood ratio screening to pinpoint data entries within publicly accessible biased datasets that represent biases inadvertently incorporated during the LLM training phase. It then automatically constructs relevant knowledge triples and expunges bias information from LLMs using model editing techniques. Through extensive experimentation, we have confirmed the presence of the implicit bias problem in LLMs and demonstrated the effectiveness of our BTBR approach.
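The screening step described in the abstract compares how likely each candidate entry is under the target LLM versus a reference model. A minimal sketch of that idea, with hypothetical entry data and toy log-likelihoods standing in for real model scores (the paper's actual scoring procedure and threshold are not specified here):

```python
# Hypothetical candidate entries from a public bias dataset. The two
# log-probabilities are toy numbers; in BTBR-style screening they would
# come from scoring each text with the target LLM and a reference model.
entries = [
    {"text": "entry A", "logp_model": -2.0, "logp_reference": -6.0},
    {"text": "entry B", "logp_model": -5.0, "logp_reference": -5.1},
    {"text": "entry C", "logp_model": -1.5, "logp_reference": -7.0},
]

def likelihood_ratio_screen(entries, threshold=2.0):
    """Keep entries whose log-likelihood ratio exceeds the threshold,
    i.e. entries the target model finds much more probable than the
    reference does -- a sign the bias was absorbed during training."""
    selected = []
    for e in entries:
        log_ratio = e["logp_model"] - e["logp_reference"]
        if log_ratio > threshold:
            selected.append(e["text"])
    return selected

print(likelihood_ratio_screen(entries))  # -> ['entry A', 'entry C']
```

Entries that pass the screen would then be converted into knowledge triples and removed via model editing, per the pipeline the abstract outlines.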

Authors (9)
  1. Yongxin Deng (6 papers)
  2. Xihe Qiu (14 papers)
  3. Xiaoyu Tan (21 papers)
  4. Jing Pan (25 papers)
  5. Chen Jue (1 paper)
  6. Zhijun Fang (8 papers)
  7. Yinghui Xu (48 papers)
  8. Wei Chu (118 papers)
  9. Yuan Qi (85 papers)
Citations (1)