
Hateful Person or Hateful Model? Investigating the Role of Personas in Hate Speech Detection by Large Language Models (2506.08593v1)

Published 10 Jun 2025 in cs.CL

Abstract: Hate speech detection is a socially sensitive and inherently subjective task, with judgments often varying based on personal traits. While prior work has examined how socio-demographic factors influence annotation, the impact of personality traits on LLMs remains largely unexplored. In this paper, we present the first comprehensive study on the role of persona prompts in hate speech classification, focusing on MBTI-based traits. A human annotation survey confirms that MBTI dimensions significantly affect labeling behavior. Extending this to LLMs, we prompt four open-source models with MBTI personas and evaluate their outputs across three hate speech datasets. Our analysis uncovers substantial persona-driven variation, including inconsistencies with ground truth, inter-persona disagreement, and logit-level biases. These findings highlight the need to carefully define persona prompts in LLM-based annotation workflows, with implications for fairness and alignment with human values.

Authors (6)
  1. Shuzhou Yuan (12 papers)
  2. Ercong Nie (25 papers)
  3. Mario Tawfelis (2 papers)
  4. Helmut Schmid (20 papers)
  5. Hinrich Schütze (250 papers)
  6. Michael Färber (65 papers)
