SLM-Mod: Small Language Models Surpass LLMs at Content Moderation (2410.13155v1)

Published 17 Oct 2024 in cs.CL

Abstract: LLMs have shown promise in many natural language understanding tasks, including content moderation. However, these models can be expensive to query in real time and do not allow for a community-specific approach to content moderation. To address these challenges, we explore the use of open-source small LLMs (SLMs) for community-specific content moderation tasks. We fine-tune and evaluate SLMs (fewer than 15B parameters) by comparing their performance against much larger open- and closed-source models. Using 150K comments from 15 popular Reddit communities, we find that SLMs outperform LLMs at content moderation, with 11.5% higher accuracy and 25.7% higher recall on average across all communities. We further show the promise of cross-community content moderation, which has implications for new communities and the development of cross-platform moderation techniques. Finally, we outline directions for future work on LLM-based content moderation. Code and links to HuggingFace models can be found at https://github.com/AGoyal0512/SLM-Mod.

Authors (5)

Xianyang Zhan (3 papers)
Agam Goyal (9 papers)
Yilun Chen (48 papers)
Eshwar Chandrasekharan (16 papers)
Koustuv Saha (26 papers)

GitHub

GitHub - AGoyal0512/SLM-Mod: SLM-Mod: Small Language Models Surpass LLMs at Content Moderation

Tweets

https://twitter.com/eshwar_chan/status/1915901700743041303
