Bengali Hate Speech Detection Research

Updated 26 October 2025
  • Bengali hate speech detection involves the automated identification, categorization, and explanation of hateful content, supported by rich, multi-label annotated datasets.
  • Advanced models like BanglaBERT, transformer-based architectures, and parameter-efficient tuning demonstrate significant improvements in detection accuracy.
  • Specialized preprocessing, explainability techniques, and error analysis address challenges posed by code-mixing, dialectal variation, and transliteration in Bengali texts.

Bengali hate speech detection is an active research field at the intersection of NLP, deep learning, and social computing, aiming to automatically identify, categorize, and explain various forms of hateful or offensive content originating in the Bengali language and its code-mixed or transliterated forms. The ecosystem is characterized by a proliferation of specialized datasets, multi-label annotation schemes, and the adaptation of state-of-the-art neural architectures, all driven by the acute sociotechnical challenge posed by hate speech in online Bengali discourse.

1. Dataset Development: Scope and Granularity

The last five years have witnessed the emergence of large, expertly annotated Bengali hate speech datasets covering a wide spectrum of sources, linguistic varieties, and annotation schemes. Early works constructed datasets through bootstrapping with slur lexicons and subsequent manual annotation—for example, (Karim et al., 2020) assembled a 35,000-statement dataset divided across Political, Religious, Gender Abusive, Geopolitical, and Personal categories with rigorous cleaning and normalization, establishing a paradigm for fine-grained dataset construction. Recent corpora, such as BD-SHS (Romim et al., 2022), expand to over 50,000 examples, introduce hierarchical, multilabel annotations (e.g., identifying targets and types of hate), and strive for balance and coverage of diverse social contexts. The BanTH dataset (Haider et al., 17 Oct 2024) introduces multi-label annotation for transliterated (Romanized) Bangla, while BIDWESH (Fayaz et al., 22 Jul 2025) and BOISHOMMO (Kafi et al., 11 Apr 2025) contribute dialectal and multifaceted hate speech corpora, respectively, broadening linguistic and topical inclusiveness.

An overview of select dataset characteristics:

| Dataset | Samples | Labels/Classes | Annotation | Notable Features |
|---|---|---|---|---|
| (Karim et al., 2020) | 35,000 | 5 HS types | Manual, majority | N-gram filtering, bootstrapping, POS, normalization |
| BD-SHS (Romim et al., 2022) | 50,281 | Binary + multilabel (target/type) | Hierarchical, iterative | Informal embeddings, task splits, multi-domain |
| BanTH (Haider et al., 17 Oct 2024) | 37,350 | Multi-label (transliterated) | Multi-annotator + expert | LLM and translation-based evaluation |
| BOISHOMMO (Kafi et al., 11 Apr 2025) | 2,499 | 10 HS attributes (multi-label) | Majority vote | Non-Latin script, Cohen’s κ analysis |
| BIDWESH (Fayaz et al., 22 Jul 2025) | 9,183 | 4 types × 4 targets (dialectal) | Native dialect experts | Chittagong, Barishal, Noakhali dialects |

This breadth reflects an evolving consensus: high-quality Bengali hate speech detection demands large-scale, linguistically diverse, and contextually granular annotated corpora.
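Agreement statistics such as the Cohen’s κ analysis reported for BOISHOMMO can be reproduced with a few lines of stdlib Python; the annotator labels below are illustrative, not taken from any of the datasets:

```python
def cohens_kappa(ann_a, ann_b):
    """Chance-corrected agreement between two annotators' label lists."""
    n = len(ann_a)
    labels = set(ann_a) | set(ann_b)
    p_observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Expected agreement if both annotators labeled at random
    # according to their own marginal label frequencies
    p_expected = sum(
        (ann_a.count(l) / n) * (ann_b.count(l) / n) for l in labels
    )
    return (p_observed - p_expected) / (1 - p_expected)

# Two annotators labeling eight comments as hate (1) / not hate (0)
kappa = cohens_kappa([1, 1, 0, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
print(round(kappa, 2))  # 0.5
```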

2. Model Architectures: Classical, Deep, and Large-scale Approaches

Early efforts relied on classical feature engineering—SVMs with TF–IDF or n-gram inputs—but deep learning approaches rapidly became dominant. The multichannel convolutional-LSTM network (MConv-LSTM) (Karim et al., 2020) integrates convolutional filters (capturing local, n-gram patterns) and a parallel LSTM (modeling sentence-level dependencies), outperforming classical baselines by 7+ F1 points. Informal social-media-trained word embeddings (e.g., IFT, informal FastText SG) are repeatedly shown to outperform formal news/wiki-based embeddings (Romim et al., 2021, Romim et al., 2022), likely due to better handling of noisy, code-mixed, or dialectal speech. Bi-LSTM with informal embeddings achieves F1 ≈ 87%.
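The multichannel idea can be sketched in PyTorch as parallel convolutional branches plus an LSTM over the same embedded input; layer sizes and vocabulary here are illustrative, not the paper’s exact configuration:

```python
import torch
import torch.nn as nn

class MConvLSTM(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, n_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        # Multichannel convolutions capture local n-gram patterns (n = 2, 3, 4)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, 64, k, padding=k // 2) for k in (2, 3, 4)]
        )
        # Parallel LSTM models sentence-level dependencies
        self.lstm = nn.LSTM(emb_dim, 64, batch_first=True)
        self.fc = nn.Linear(64 * 3 + 64, n_classes)

    def forward(self, x):                      # x: (batch, seq_len) token ids
        e = self.emb(x)                        # (batch, seq_len, emb_dim)
        c_in = e.transpose(1, 2)               # (batch, emb_dim, seq_len)
        # Max-pool each convolutional channel over the sequence dimension
        conv_feats = [torch.relu(c(c_in)).max(dim=2).values for c in self.convs]
        _, (h, _) = self.lstm(e)               # final hidden state: (1, batch, 64)
        feats = torch.cat(conv_feats + [h[-1]], dim=1)
        return self.fc(feats)                  # logits over the 5 HS categories

model = MConvLSTM()
logits = model(torch.randint(0, 5000, (8, 40)))  # 8 comments, 40 tokens each
print(logits.shape)  # torch.Size([8, 5])
```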

The field has shifted decisively toward transformer-based architectures:

  • Monolingual models: BanglaBERT consistently achieves strong performance (>0.75 F1), especially for multi-task or nuanced classification (Narayan et al., 2023, Hasan et al., 2 Oct 2025).
  • Multilingual PLMs: XLM-RoBERTa and mBERT, pre-trained on diverse languages and scripts, show robust transferability and remain competitive or state-of-the-art when adapted via further pretraining or task-specific finetuning, as in (Mim et al., 2023, Haider et al., 17 Oct 2024).
  • LLMs and PEFT: Recent work leverages LLaMA, Mistral, and Gemma with parameter-efficient adapters (LoRA/QLoRA), demonstrating strong F1 (up to 92%) while restricting finetuning to <1% of model parameters (Islam et al., 19 Oct 2025).

Architectural innovation now often occurs at the intersection of data-centric adaptation (domain-specific pretraining, e.g., transliterated corpora (Haider et al., 17 Oct 2024)), efficient finetuning (PEFT/QLoRA (Islam et al., 19 Oct 2025)), and prompt engineering for zero/few-shot LLM evaluation (Prome et al., 30 Jun 2025).

3. Features, Preprocessing, and Embedding Choices

Preprocessing pipelines are highly specialized due to Bengali’s rich morphology, frequent code-mixing, and spelling variation. Effective preprocessing includes:

  • Aggressive normalization (removing extraneous characters, replacing proper nouns, hashtag normalization)
  • Linguistically motivated stemming and token filtering (addressing non-Latin script complexity (Kafi et al., 11 Apr 2025))
  • Explicit modeling of slang (traditional and non-traditional) and emoji handling (Romim et al., 2021)
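A minimal sketch of such a normalization pass, assuming simple regex-based rules (the exact rules vary across the cited papers, and the transliterated example is illustrative):

```python
import re

def normalize_comment(text):
    """Hedged sketch of the normalization steps described above."""
    text = re.sub(r"https?://\S+", " ", text)          # strip URLs
    text = re.sub(r"#(\w+)", r"\1", text)              # hashtag normalization: keep the word
    text = re.sub(r"@\w+", "user", text)               # replace user mentions / proper nouns
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)         # squeeze elongated characters
    text = re.sub(r"[^\w\s\u0980-\u09FF]", " ", text)  # drop symbols, keep Bengali script
    return re.sub(r"\s+", " ", text).strip()           # collapse whitespace

print(normalize_comment("check http://x.co #khub bhalooooo @rahim!!"))
# check khub bhaloo user
```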

Feature representation insights:

  • Informal FastText skip-gram embeddings (FT(SG)) trained directly on noisy, colloquial comments exhibit a persistent performance edge.
  • Finetuning multilingual (mBERT/XLM-R) or monolingual (BanglaBERT) models on task- or domain-specific corpora is essential for handling transliterated and dialectal input (Haider et al., 17 Oct 2024, Fayaz et al., 22 Jul 2025).
  • Incorporation of emoji2vec and translation-based pre/prompts further boosts robustness in code-mixed and transliterated scenarios.
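The robustness of subword embeddings to spelling variation comes from character n-gram sharing, which a short sketch makes concrete (the transliterated words below are illustrative examples, not dataset vocabulary):

```python
def char_ngrams(word, n_min=3, n_max=6):
    """FastText-style subwords: the word itself plus its character n-grams."""
    w = f"<{word}>"  # boundary markers, as in FastText
    grams = [w[i:i + n]
             for n in range(n_min, n_max + 1)
             for i in range(len(w) - n + 1)]
    return set([w] + grams)

# A transliterated Bengali word and a misspelled variant still share many
# subword units, so their composed embeddings stay close despite the typo.
shared = char_ngrams("kharap") & char_ngrams("kharaap")
print(len(shared), "shared subword units")
```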

4. Evaluation Techniques, Benchmarking, and Comparative Analyses

Robust benchmarking approaches are now standard.

Recent studies include head-to-head evaluations of zero-shot prompting, multi-shot learning, and LoRA adaptation, demonstrating that even with parameter-efficient techniques, well-designed local pretraining (BanglaBERT) remains critical for subtle and adversarial tasks (Hasan et al., 2 Oct 2025, Prome et al., 30 Jun 2025).
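The F1 figures quoted throughout (e.g., ≈87% for the Bi-LSTM, >0.75 for BanglaBERT) are typically macro-averaged; a stdlib sketch of the metric, with an illustrative binary example:

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so rare hate classes count equally."""
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1_scores.append(
            2 * precision * recall / (precision + recall) if precision + recall else 0.0
        )
    return sum(f1_scores) / len(f1_scores)

# Binary example: hate (1) vs. not hate (0)
score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1], labels=[0, 1])
print(round(score, 4))  # 0.7333
```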

5. Special Topics: Multi-label, Multi-task, Explainability, and Dialect

Multi-label and multi-task detection mark the new frontiers. Both BanTH (Haider et al., 17 Oct 2024) and BOISHOMMO (Kafi et al., 11 Apr 2025) use controlled schemes to annotate multiple co-occurring hate types/targets, reflecting the real-world complexity where hate often intersects race, gender, religion, etc. Multi-task datasets like BanglaMultiHate (Hasan et al., 2 Oct 2025) separate detection into type, severity, and target prediction—moving beyond binary detection to multifaceted content moderation benchmarks. Dialect inclusion, as in BIDWESH (Fayaz et al., 22 Jul 2025), addresses the under-recognition of regionalized hate, enabling more equitable and context-aware tools.

Explainability is tackled with sensitivity analysis and layer-wise relevance propagation (Karim et al., 2020). Faithfulness metrics (comprehensiveness and sufficiency) are used to score explanation quality, and, in some architectures, attention heat maps and post-hoc rationales support interpretability, a desirable property for deployment in sensitive or regulatory settings.
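These faithfulness metrics follow the standard erase-and-predict definitions; a minimal sketch with illustrative probabilities (not values from the cited papers):

```python
def comprehensiveness(p_full, p_without_rationale):
    # Drop in predicted-class probability when the rationale tokens are DELETED.
    # A large drop means the explanation really drove the prediction.
    return p_full - p_without_rationale

def sufficiency(p_full, p_rationale_only):
    # Drop in probability when ONLY the rationale tokens are kept.
    # A small drop means the rationale alone supports the prediction.
    return p_full - p_rationale_only

# Model predicts "hate" with p = 0.90 on the full comment (illustrative numbers)
print(round(comprehensiveness(0.90, 0.30), 2))  # 0.6  -> rationale was needed
print(round(sufficiency(0.90, 0.85), 2))        # 0.05 -> rationale nearly suffices
```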

6. LLMs, Prompt Engineering, and Resource-Efficient Adaptation

Leading-edge research explores large-scale LLMs for Bengali hate speech detection, with a focus on parameter-efficient fine-tuning and prompt engineering:

  • Prompt engineering strategies include direct zero-shot, multi-shot, role, refusal-suppression, and metaphor prompting (the latter substituting hate speech triggers with neutral metaphors to avoid LLM safety refusals) (Prome et al., 30 Jun 2025).
  • Studies demonstrate that metaphor prompts achieve high F1, even under strict LLM safety filters, in Bengali and cross-lingual settings, often with a lower resource and carbon footprint than traditional fine-tuning.
  • PEFT approaches (LoRA/QLoRA) prove practical for adapting LLMs like Llama-3.2-3B on single consumer GPUs, enabling F1 score improvements to over 92% on BD-SHS (Islam et al., 19 Oct 2025).
  • Fine-tuning LLMs on multi-task datasets requires careful optimization (e.g., learning rate 2×10⁻⁴, LoRA parameters α=16, r=64) and culturally grounded pretraining to rival best-in-class specialist models (Hasan et al., 2 Oct 2025).
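Back-of-the-envelope arithmetic shows why LoRA stays under 1% trainable parameters. The figures below assume Llama-3.2-3B's published shape (28 layers, hidden size 3072, ≈3.21B parameters) and adapters on the query/value projections only, a common default; treating both projections as square slightly simplifies the real architecture:

```python
def lora_trainable(n_layers, d_model, rank, n_target_matrices):
    # Each adapted weight matrix W (d x d) gains low-rank factors
    # A (d x r) and B (r x d), i.e. 2 * d * r extra trainable parameters.
    return n_layers * n_target_matrices * 2 * d_model * rank

base_params = 3_210_000_000  # ~3.21B for Llama-3.2-3B (assumed round figure)
trainable = lora_trainable(n_layers=28, d_model=3072, rank=64,
                           n_target_matrices=2)  # q_proj and v_proj
print(trainable, f"{trainable / base_params:.2%}")  # 22020096 0.69%
```

Even at the relatively large rank r=64 used above, the adapter adds only ~22M parameters, which is why a single consumer GPU suffices for fine-tuning.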

A plausible implication is that data-centric adaptation and lightweight parameter-efficient methods offer a sustainable path for scalable hate speech detection in low-resource languages.

7. Challenges, Applications, and Ongoing Directions

Bengali hate speech detection encounters persistent challenges:

  • Data scarcity, especially for fine-grained, dialect-specific, or multi-label contexts
  • The prevalence of code-mixing, misspellings, and transliteration
  • The subjectivity and sensitivity of annotation, especially in categories like religion, gender, or class
  • The trade-off between model performance, explainability, and computational efficiency in low-resource deployment

Applications include social media moderation, real-time flagging for platforms, and policy-driven civil society tools for measuring and mitigating online toxicity. There is a strong emphasis on inclusive representations, dialectal fairness, and explainable decision-making.

Future work will likely expand dataset scope (regional, multimodal, code-mixed), further integrate LLM adaptation techniques, and prioritize robust, faithfulness-aware explainability and error analysis. Extensions to adversarial and counterfactual testing, bias detection, and multi-lingual, cross-regional policy development are also anticipated as the resource ecosystem matures.


Bengali hate speech detection is thus defined by sophisticated data resources, specialized embedding and model strategies, multi-label and multi-task evaluation, and a trajectory toward scalable, explainable, and equitable content moderation systems guided by both computational and social imperatives.
