
Towards Detecting Contextual Real-Time Toxicity for In-Game Chat (2310.18330v1)

Published 20 Oct 2023 in cs.CL, cs.AI, and cs.LG

Abstract: Real-time toxicity detection in online environments poses a significant challenge due to the increasing prevalence of social media and gaming platforms. We introduce ToxBuster, a simple and scalable model that reliably detects toxic content in real time for a line of chat by incorporating chat history and metadata. ToxBuster consistently outperforms conventional toxicity models across popular multiplayer games, including Rainbow Six Siege, For Honor, and DOTA 2. We conduct an ablation study to assess the importance of each model component and explore ToxBuster's transferability across datasets. Furthermore, we showcase ToxBuster's efficacy in post-game moderation, successfully flagging 82.1% of chat-reported players at a precision level of 90.0%. Additionally, we show how a further 6% of unreported toxic players can be proactively moderated.
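The post-game moderation figures (82.1% of chat-reported players flagged at 90.0% precision) describe an operating point on a precision/recall curve: a score threshold is chosen so that precision meets the target, and the resulting recall is the share of reported players flagged. A minimal sketch of how such a threshold could be selected, assuming per-player toxicity scores; the function name and data are hypothetical illustrations, not from the paper:

```python
def recall_at_precision(scores, labels, target_precision):
    """Find the loosest score threshold whose precision meets the target,
    and report the recall (share of toxic players flagged) at that point.

    scores: model toxicity scores per player (hypothetical)
    labels: 1 if the player was reported as toxic, else 0
    """
    # Sweep thresholds from strictest to loosest by sorting on descending score.
    pairs = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    total_pos = sum(labels)
    best_recall, best_threshold = 0.0, None
    for score, label in pairs:
        tp += label
        fp += 1 - label
        precision = tp / (tp + fp)
        recall = tp / total_pos
        # Keep the highest recall that still satisfies the precision target.
        if precision >= target_precision and recall > best_recall:
            best_recall, best_threshold = recall, score
    return best_recall, best_threshold
```

With such a sweep, lowering the precision target trades false positives for higher coverage, which is one way a moderation team could tune how aggressively unreported players are proactively flagged.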

Authors (3)
  1. Zachary Yang
  2. Nicolas Grenan-Godbout
  3. Reihaneh Rabbany