Bangla Multitask Hate Speech Identification
- The paper introduces a multitask approach using transformer ensembles and adversarial training to robustly detect and categorize hate speech in Bangla.
- Models leverage domain-specific adaptations, including fine-tuning on dialectal and transliterated data, to improve classification across multiple hate categories.
- Evaluation on diverse datasets shows performance gains from multi-label and multi-class strategies, with detailed error analysis addressing challenges such as label ambiguity and class imbalance.
Bangla Multitask Hate Speech Identification refers to the detection and fine-grained categorization of hateful, offensive, or toxic content in Bangla (Bengali) language digital corpora, with an explicit focus on solving multiple related classification tasks via multi-output or multitask machine learning frameworks. This area has progressed rapidly due to the availability of new datasets covering regional dialects, multi-label and multi-class taxonomies, code-mixed and transliterated scenarios, and shared-task benchmarks, along with the introduction of transformer-based and ensemble models specifically tailored for low-resource South Asian linguistic environments.
1. Task Formulations and Dataset Landscape
Research in Bangla multitask hate speech identification operationalizes several interlinked subtasks, typically comprising:
- Hate type (class/multi-label): Categorization of content into hate subtypes (e.g., abusive, political, profane, religious, sexism).
- Target group identification: Classification of the intended target (e.g., individual, group, organization, society).
- Severity and presence: Joint classification of hate presence (binary), severity degree, and granular target/type axes.
Leading benchmarks include:
- BLP-2025 shared task dataset: 35,522 training, 2,512 dev, 2,512 dev-test, and 10,200 test samples. Label distributions in subtask 1A (hate type): None 56.2%, Abusive 23.1%, Political Hate 11.9%, Profane 6.6%, Religious Hate 1.9%, Sexism 0.3%. Target group classes (1B): None 59.7%, Individual 15.9%, Organization 10.8%, Community 7.4%, Society 6.2% (Hoque et al., 23 Nov 2025, Saha et al., 10 Nov 2025).
- BIDWESH: Multi-dialectal (Barishal, Noakhali, Chittagong) corpus, 9,183 sentences derived by expert translation of 3,061 BD-SHS Bangla comments. Labels: hate presence (binary), type (slander/gender/religion/call to violence + combinations), and target (individual, male, female, group, multi-targets) (Fayaz et al., 22 Jul 2025).
- BOISHOMMO: 2,499 Facebook comments, ten binary hate categories (race, behaviour, physical threat, class, religion, disability, nationality, gender, sexual orientation, political statement), multi-label per instance (Kafi et al., 11 Apr 2025).
- BanTH: 37,350 transliterated Bangla YouTube comments, seven multi-label hate targets (Political, Religious, Gender, Personal Offense, Abusive/Violence, Origin, Body Shaming), enabling identification in Roman script (Haider et al., 17 Oct 2024).
- Karim et al. (2020): 35,000 pure-Bangla statements for five fine-grained hate types (Political, Religious, Gender abusive, Geopolitical, Personal) (Karim et al., 2020).
This diversity facilitates benchmarking across standard Bangla, dialects, social domains, and script modalities.
2. Model Architectures and Training Paradigms
State-of-the-art systems employ advanced neural architectures, multitask heads, and various regularization/robustness techniques:
- Transformer ensemble and fine-tuning: Leading shared-task systems fine-tune pretrained transformer language models (BanglaBERT, MuRIL, XLM-RoBERTa, IndicBERTv2), each with a standard classification head per subtask. Ensembles are built via 5-fold cross-validation and aggregated by soft or weighted voting at inference (Hoque et al., 23 Nov 2025, Saha et al., 10 Nov 2025).
- Adversarial robustness: The Fast Gradient Sign Method (FGSM) is applied in embedding space during training to simulate typographical and script noise, improving recall on noisy/transliterated text. Perturbations $\delta = \epsilon \cdot \mathrm{sign}(\nabla_x \mathcal{L}(x, y))$ are added to the input embeddings, yielding the combined loss $\mathcal{L}_{\text{total}} = \mathcal{L}(x, y) + \lambda \, \mathcal{L}(x + \delta, y)$ with hyperparameters $\epsilon$ and $\lambda$ (Hoque et al., 23 Nov 2025).
- Dialects and transliteration: Dialect adaptation uses LM fine-tuning and per-dialect adapter layers. For transliterated input, domain-adapted transformer encoders are further pretrained on transliterated Bangla via a masked language modeling objective, e.g., $\mathcal{L}_{\text{MLM}} = -\sum_{i \in M} \log P(x_i \mid x_{\setminus M})$ over the set of masked positions $M$, as in BanTH's TB-Encoder (Haider et al., 17 Oct 2024).
- Classical multitask learning: Shared-trunk models use a single encoder with multiple parallel classification heads (e.g., for hate presence, type, and target). Losses combine via $\mathcal{L}_{\text{total}} = \sum_t \lambda_t \mathcal{L}_t + \mu \, \Omega(\theta)$, where $\Omega(\theta)$ is a regularization term and the $\lambda_t$ are task weights (Fayaz et al., 22 Jul 2025).
- Traditional architectures: Random Forests, SVM (sigmoid kernel), and Logistic Regression are baseline models for smaller datasets or for rigorous analysis of feature-based approaches. Multi-label outputs are realized via independent binary classifiers per category (Kafi et al., 11 Apr 2025).
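The FGSM scheme above can be illustrated with a minimal numpy sketch. This is not the cited systems' implementation: it uses a single logistic classifier over a fixed "embedding" vector (so the input gradient is analytic) rather than a transformer, but the perturbation $\delta = \epsilon \cdot \mathrm{sign}(\nabla_x \mathcal{L})$ and the clean-plus-adversarial loss mix are the same idea.

```python
import numpy as np

def bce_loss_and_grad_x(x, w, b, y):
    """Binary cross-entropy of a logistic classifier and its gradient w.r.t. the input x."""
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))
    tiny = 1e-12
    loss = -(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    grad_x = (p - y) * w  # analytic dL/dx for the logistic model
    return loss, grad_x

def fgsm_combined_loss(x, w, b, y, eps=0.1, lam=0.5):
    """FGSM in input (embedding) space: perturb x along the sign of the
    input gradient, then mix the clean and adversarial losses."""
    clean_loss, grad_x = bce_loss_and_grad_x(x, w, b, y)
    x_adv = x + eps * np.sign(grad_x)  # delta = eps * sign(grad)
    adv_loss, _ = bce_loss_and_grad_x(x_adv, w, b, y)
    return clean_loss + lam * adv_loss, clean_loss, adv_loss
```

By construction the perturbation moves the input in the loss-increasing direction, so the adversarial term acts as a regularizer against small embedding-space noise.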
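A shared-trunk multitask head can likewise be sketched in a few lines. The task names, dimensions, and random initialization below are illustrative assumptions; the point is one shared feature vector feeding parallel softmax heads whose cross-entropy losses are combined with task weights $\lambda_t$, as in the loss formulation above.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class SharedTrunkHeads:
    """Parallel linear task heads (e.g., presence/type/target) over one
    shared encoder representation, with a weighted sum of task losses."""

    def __init__(self, dim, task_sizes, task_weights):
        # One weight matrix per task head; a real system would also learn these.
        self.heads = {t: rng.normal(0.0, 0.1, size=(dim, k)) for t, k in task_sizes.items()}
        self.task_weights = task_weights

    def losses(self, h, targets):
        """h: (batch, dim) shared features; targets: {task: (batch,) int labels}."""
        per_task = {}
        for task, W in self.heads.items():
            p = softmax(h @ W)
            per_task[task] = -np.log(p[np.arange(len(h)), targets[task]] + 1e-12).mean()
        total = sum(self.task_weights[t] * per_task[t] for t in per_task)
        return total, per_task
```

Sharing the trunk lets the rarer heads (e.g., target) benefit from gradients of the better-populated ones (e.g., presence).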
3. Evaluation Metrics and Error Analysis
Bangla multitask hate speech systems are evaluated on both instance-level and class-level accuracy and F1 scores:
- Micro-averaged F1: Used for tasks with heavy class imbalance, computed from precision and recall pooled over all classes, i.e., $F1_{\text{micro}} = \frac{2\,TP}{2\,TP + FP + FN}$ with $TP$, $FP$, $FN$ summed across classes.
- Macro-averaged F1 and AUROC: Used when equal emphasis on each class or label is required, especially for minority type/target classes; in multi-label scenarios, subset accuracy (exact match) and Hamming loss are also reported (Haider et al., 17 Oct 2024, Kafi et al., 11 Apr 2025).
- Dialect granularity: Metrics are stratified by dialect in BIDWESH to uncover regional blind spots and model fragility in dialect-sensitive patterns (Fayaz et al., 22 Jul 2025).
- Error profiles: Misclassification matrices reveal consistent patterns: “None” and “Profane” have the highest TPR, while “Abusive” and group-level hate labels (e.g., Community, Society in 1B) are under-recalled. Hard errors frequently manifest as implicit or group-directed hate lacking explicit markers (Hoque et al., 23 Nov 2025, Saha et al., 10 Nov 2025).
- Ambiguity and label noise: Coarse-grained and highly overlapping hate categories (e.g., gender–political, religion–political), as well as fuzzy group boundaries, lower agreement and degrade generalization, as measured via kappa statistics and error cascades across multitask heads (Kafi et al., 11 Apr 2025, Fayaz et al., 22 Jul 2025).
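The micro/macro distinction drawn above can be made concrete with a small, dependency-free sketch (label names are placeholders): micro-F1 pools TP/FP/FN across classes, so dominant classes like "None" dominate the score, while macro-F1 averages per-class F1 and so exposes failures on minority labels like "Sexism".

```python
def micro_macro_f1(y_true, y_pred, labels):
    """Micro-F1 pools TP/FP/FN over all classes; macro-F1 averages
    per-class F1 so every class counts equally."""
    def f1(tp, fp, fn):
        denom = 2 * tp + fp + fn
        return 2 * tp / denom if denom else 0.0

    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # true class gets a false negative
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro
```

Note that for single-label multiclass tasks, micro-F1 reduces to plain accuracy, which is why macro-F1 (or per-class confusion matrices) is the more diagnostic number for the skewed label distributions reported in Section 1.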
4. Key Advances and Empirical Results
Empirical investigations underline the promise and persistent challenges:
- Transformer ensemble performance: SOTA systems employing 5-fold or 3-model ensembles (BanglaBERT, MuRIL, XLM-RoBERTa/IndicBERTv2) reach up to 73.23% (1A, type) and 73.28% (1B, target) micro-F1 in BLP-2025; joint multitask heads with weighted aggregation achieve 72.62% weighted micro-F1 on the most granular (type+severity+target) task (Hoque et al., 23 Nov 2025, Saha et al., 10 Nov 2025).
- Robustness to noise: FGSM-augmented fine-tuning improves micro-F1 by +0.4–0.6pp and shows highest gains for noisy/variant inputs (script/orthography, transliteration-induced misspellings). Rule-based normalization adds a further +0.5–1.0pp (Hoque et al., 23 Nov 2025).
- Multilabel and dialectal detection: In multi-label settings (BOISHOMMO), Random Forest reaches 86% macro-F1, outperforming SVM (81%) and LR (60%). Highest F1 arises for lexically distinctive labels (Religion, Race), while ambiguous or metaphorical categories (Class, Political) yield lower agreement (Kafi et al., 11 Apr 2025). In dialect adaptation (BIDWESH), joint encoder–multihead architectures with transfer learning are recommended for multi-dialectal robustness (Fayaz et al., 22 Jul 2025).
- Transliterated text: TB-Encoder models, further pretrained on transliterated Bangla, reach up to 77.36% binary macro-F1 (TB-mBERT) and 30.17% multi-label macro-F1 (TB-BERT), with LLM prompting (zero- or few-shot translation + explanation) yielding competitive few-shot macro-F1 (39.53%) (Haider et al., 17 Oct 2024).
| System / Dataset | Type | SOTA Metric (Test) | Notable Architecture |
|---|---|---|---|
| BLP-2025 (1A/1B) | Multiclass | 73.23% / 73.28% F1 | 5-fold ensemble + FGSM |
| BLP-2025 (1C) | Multitask | 72.62% weighted F1 | 3-way multitask ensemble |
| BOISHOMMO | Multi-label | 86% macro-F1 | RF, SVM, LR baselines |
| BanTH (multi-label) | Multi-label | 30.17% macro-F1 | TB-BERT (pretrained) |
| BIDWESH | Multitask | -- (guide only) | BERT encoder + multihead |
Performance drops significantly on class-imbalanced, subtle, or multi-label tasks, with minority categories (Sexism, Disability, multi-target) remaining underrepresented and error-prone.
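The soft/weighted voting used by the ensemble systems above can be sketched as follows. This is a generic aggregation step under assumed shapes, not the shared-task code: each model contributes a per-sample class-probability matrix, and the (optionally weighted) average is argmaxed.

```python
import numpy as np

def soft_vote(prob_matrices, weights=None):
    """Soft (optionally weighted) voting: average per-model class-probability
    matrices, then take the argmax class per sample."""
    probs = np.stack(prob_matrices)  # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(len(prob_matrices))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the result stays a distribution
    avg = np.tensordot(weights, probs, axes=1)  # (n_samples, n_classes)
    return avg.argmax(axis=1)
```

Averaging probabilities (rather than hard votes) lets a confident minority model override two uncertain ones, which is one reason soft voting tends to help on the under-recalled minority classes discussed above.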
5. Linguistic Challenges and Domain Adaptation
Bangla multitask hate speech identification is complicated by unique linguistic and sociolinguistic issues:
- Script and orthography: Social media Bangla features code-mixing (Bangla/Latin), variable transliteration, irregular compounding, and spelling diversity. Rule-based normalization and script-aware tokenization are crucial for effective preprocessing (Hoque et al., 23 Nov 2025, Haider et al., 17 Oct 2024).
- Label ambiguity: Fuzzy boundaries between hate categories (e.g., overlapping political, religious, and gender insults), context-dependence, and multi-target references exacerbate annotation and system errors (Kafi et al., 11 Apr 2025, Fayaz et al., 22 Jul 2025).
- Low-resource and dialectal variation: Regional dialects vary lexically and pragmatically, with limited annotated data. Data augmentation (back-translation, synonym replacement), pretraining on in-domain dialect corpora, and dialect-specific adapters are proposed for robust generalization (Fayaz et al., 22 Jul 2025).
- Transliteration noise: Models must handle noisy transliterated text with strong variability. Further pretraining (e.g., BanglaTLit-PT objective for TB-Encoders) and spelling normalization are effective at partially bridging this gap (Haider et al., 17 Oct 2024).
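A minimal sketch of the rule-based normalization mentioned above, for noisy romanized text. The substitution table and the example token are illustrative assumptions (real systems use curated, corpus-derived rules, and the cited papers do not publish theirs): Unicode normalization, lowercasing, leetspeak-style character mapping, and capping character elongation.

```python
import re
import unicodedata

# Hypothetical variant map for illustration only; production rules are corpus-derived.
CHAR_VARIANTS = {"0": "o", "1": "l", "@": "a", "$": "s"}

def normalize_translit(text):
    """Rule-based cleanup for noisy romanized input: NFC-normalize,
    lowercase, map common character substitutions, and collapse runs
    of 3+ repeated characters down to 2."""
    text = unicodedata.normalize("NFC", text).lower()
    text = "".join(CHAR_VARIANTS.get(ch, ch) for ch in text)
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # cap elongation ("!!!" -> "!!")
    return text
```

Such normalization shrinks the effective vocabulary of transliteration variants before tokenization, which is where the reported +0.5–1.0pp micro-F1 gains from rule-based preprocessing plausibly originate.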
6. Methodological Recommendations and Future Research Directions
Current research identifies several methodological bottlenecks and anchors for future work:
- Advanced adversarial training: Geometry-aware, weight-perturbation, or gradient-alignment adversarial schemes (PGD, AWP, GAT) may further boost robustness if computational budgets permit (Hoque et al., 23 Nov 2025).
- Dynamic weighting: Optimal ensemble aggregation and per-task rebalancing via dynamic weighted voting or task-loss calibration remain underexplored (Saha et al., 10 Nov 2025).
- Augmentation and continual learning: Targeted data augmentation, especially for rare classes and dialects, and periodic integration of new user-generated content are recommended. Continual or federated learning is suggested for sustainable updating (Fayaz et al., 22 Jul 2025).
- Multimodal and code-mixed detection: Extension to code-mixed text (Bangla-English Romanization, mixed script), as well as multimodal (text + audio/video) inputs, is recommended for future empirical gains (Haider et al., 17 Oct 2024).
- Annotation and evaluation: Improved guidelines and adjudication protocols for complex, multi-label, and implicit hate; fine-grained confusion analysis; and dialect-specific benchmarks are needed to diagnose remaining blind spots (Kafi et al., 11 Apr 2025, Fayaz et al., 22 Jul 2025).
- Model interpretability: Future work may investigate model explainability and interpretability, especially given the sociotechnical stakes in online moderation and legal compliance.
Bangla multitask hate speech identification is poised for continued advancement via transfer learning, robust augmentation, and dialectal adaptation, with clear empirical baselines and open-source code to enable reproducibility and community benchmarking (Hoque et al., 23 Nov 2025, Saha et al., 10 Nov 2025, Fayaz et al., 22 Jul 2025, Haider et al., 17 Oct 2024, Kafi et al., 11 Apr 2025, Karim et al., 2020).