
ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real-World User-AI Conversation (2310.17389v1)

Published 26 Oct 2023 in cs.CL and cs.AI

Abstract: Despite the remarkable advances that LLMs have achieved in chatbots, maintaining a non-toxic user-AI interactive environment has become increasingly critical. However, previous efforts in toxicity detection have been mostly based on benchmarks derived from social media content, leaving the unique challenges inherent to real-world user-AI interactions insufficiently explored. In this work, we introduce ToxicChat, a novel benchmark based on real user queries from an open-source chatbot. This benchmark contains rich, nuanced phenomena that can be tricky for current toxicity detection models to identify, revealing a significant domain difference compared to social media content. Our systematic evaluation of models trained on existing toxicity datasets has shown their shortcomings when applied to this unique domain of ToxicChat. Our work illuminates the potentially overlooked challenges of toxicity detection in real-world user-AI conversations. In the future, ToxicChat can be a valuable resource to drive further advancements toward building a safe and healthy environment for user-AI interactions.

Authors (7)
  1. Zi Lin
  2. Zihan Wang
  3. Yongqi Tong
  4. Yangkun Wang
  5. Yuxin Guo
  6. Yujia Wang
  7. Jingbo Shang
Citations (63)