Detecting Hate Speech in Multi-modal Memes (2012.14891v1)

Published 29 Dec 2020 in cs.CV and cs.LG

Abstract: In the past few years, there has been a surge of interest in multi-modal problems, from image captioning to visual question answering and beyond. In this paper, we focus on hate speech detection in multi-modal memes, which pose an interesting multi-modal fusion problem. We aim to solve the Facebook Meme Challenge (Kiela et al., 2020), which poses a binary classification problem of predicting whether a meme is hateful or not. A crucial characteristic of the challenge is that it includes "benign confounders" to counter the possibility of models exploiting unimodal priors. The challenge reports that state-of-the-art models perform poorly compared to humans. During our analysis of the dataset, we realized that the majority of the data points that are originally hateful are turned benign just by describing the image of the meme. Also, the majority of the multi-modal baselines give more preference to the language modality (the hate speech). To tackle these problems, we explore the visual modality using object detection and image captioning models to fetch the "actual caption" and then combine it with the multi-modal representation to perform binary classification. This approach tackles the benign text confounders present in the dataset and improves performance. Another approach we experiment with is improving the prediction with sentiment analysis. Instead of only using multi-modal representations obtained from pre-trained neural networks, we also include the unimodal sentiment to enrich the features. We perform a detailed analysis of the above two approaches, providing compelling reasons in favor of the methodologies used.

Citations (56)

Summary

  • The paper demonstrates that combining image captioning with sentiment analysis significantly improves the detection of hate speech in multi-modal memes.
  • It employs a dual-method approach to integrate visual context and textual sentiment, effectively countering benign confounders in meme classification.
  • Empirical results on the Facebook Hateful Memes dataset show notable gains in AUCROC and accuracy, underscoring the method’s effectiveness.

Detecting Hate Speech in Multi-modal Memes

This paper by Das, Wahi, and Li addresses the task of detecting hate speech within multi-modal memes, a pivotal challenge that arises when content integrates both text and imagery. Specifically, the paper tackles the Facebook Meme Challenge, a binary classification problem aimed at discerning hateful memes from benign ones. The dataset for this challenge is meticulously crafted to include "benign confounders," which are designed to test a model's ability to distinguish truly hateful content from superficially altered non-hateful content, thus thwarting models that rely heavily on unimodal priors.

In their pursuit to overcome these impediments, the authors explore advanced utilization of the visual modality through object detection and image captioning methodologies as a means to capture the contextual semantics of images more comprehensively. Concurrently, they enrich the multi-modal representation by incorporating unimodal sentiment analysis to fortify the classification task.

The methodologies proposed comprise two distinct yet integrative approaches:

  1. Image Captioning: The authors leverage pre-trained models to generate the actual caption of each meme image. These captions, combined with the existing multi-modal representation, help counteract the benign textual confounders by grounding the classifier in what the image actually depicts (a minimal sketch of this fusion follows the list).
  2. Sentiment Analysis: By incorporating sentiment information extracted from both the text and the visuals, the model benefits from this additional feature enrichment, aiming to refine the accuracy of hateful-content prediction.
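
To make the first approach concrete, here is a minimal, hypothetical sketch of the caption-fusion step; the class name, encoders, and dimensions are assumptions chosen for illustration and are not taken from the authors' implementation. The idea is that a pre-trained captioner supplies the "actual caption," a text encoder embeds it, and the result is concatenated with the joint multi-modal representation before binary classification.

```python
# Hypothetical sketch of the caption-fusion approach (not the authors' code).
# A pre-trained captioner produces the "actual caption" of the meme image;
# its embedding is concatenated with the joint multi-modal representation
# (e.g. from a VisualBERT-style encoder) before classification.
import torch
import torch.nn as nn

class CaptionFusionClassifier(nn.Module):
    def __init__(self, mm_dim=768, cap_dim=768, hidden=512):
        super().__init__()
        # Assumed dimensions: 768 matches common BERT-style encoders.
        self.classifier = nn.Sequential(
            nn.Linear(mm_dim + cap_dim, hidden),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden, 1),   # single logit: hateful vs. benign
        )

    def forward(self, mm_embedding, caption_embedding):
        fused = torch.cat([mm_embedding, caption_embedding], dim=-1)
        return self.classifier(fused)

# Placeholder inputs; in practice these come from pre-trained encoders:
#   mm_embedding      <- multi-modal model over (meme image, overlaid text)
#   caption_embedding <- text encoder over the generated image caption
mm_embedding = torch.randn(4, 768)
caption_embedding = torch.randn(4, 768)

model = CaptionFusionClassifier()
probs = torch.sigmoid(model(mm_embedding, caption_embedding))  # P(hateful)
print(probs.shape)  # torch.Size([4, 1])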

The research underlines that existing models such as VisualBERT, although adept at handling inputs from both the language and image modalities, often fall short when confronted with nuanced confounding samples that require deeper multi-modal contextual understanding. The paper contributes to the domain by showing that an enriched embedding, incorporating both sentiment and content-based features, yields measurable improvements in performance metrics, notably AUCROC and classification accuracy.
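
For readers unfamiliar with those metrics, the short sketch below shows how AUCROC and accuracy are typically computed from predicted probabilities with scikit-learn; the labels and scores are fabricated placeholders, not numbers from the paper.

```python
# Illustrative metric computation; labels and scores are made-up placeholders.
from sklearn.metrics import roc_auc_score, accuracy_score

y_true = [0, 1, 1, 0, 1, 0]                  # 1 = hateful, 0 = benign
y_prob = [0.2, 0.8, 0.6, 0.4, 0.9, 0.1]      # model's predicted P(hateful)
y_pred = [int(p >= 0.5) for p in y_prob]     # threshold at 0.5 for accuracy

print("AUCROC  :", roc_auc_score(y_true, y_prob))
print("Accuracy:", accuracy_score(y_true, y_pred))
```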

Empirical evaluations reveal that the proposed models, when applied to the Facebook Hateful Memes Challenge dataset, display a more robust classification capability by addressing adversarial examples effectively. The sentiment analysis approach, while showing substantial progress in accuracy, primarily aids in scenarios where the textual and visual modalities present contrasting sentiments, or where both are distinctly positive, indicating benign memes.
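
To make the sentiment-enrichment idea concrete, the hypothetical sketch below derives unimodal sentiment features and a cross-modal disagreement signal that could be appended to the multi-modal embedding. The paper does not specify this exact tooling: VADER is only an assumed stand-in for the text-sentiment scorer, and the image-sentiment score is a placeholder value rather than the output of the authors' visual-sentiment model.

```python
# Hypothetical sketch: enriching features with unimodal sentiment.
# VADER (a rule-based text sentiment scorer) is an illustrative choice;
# image_sentiment is a placeholder score standing in for a visual-sentiment model.
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

def sentiment_features(meme_text, image_sentiment):
    analyzer = SentimentIntensityAnalyzer()
    text_sentiment = analyzer.polarity_scores(meme_text)["compound"]  # in [-1, 1]
    # Contrasting sentiments across modalities (e.g. cheerful image, hostile text)
    # are exactly the cases the enriched features are meant to surface.
    disagreement = abs(text_sentiment - image_sentiment)
    return [text_sentiment, image_sentiment, disagreement]

# Example: a hostile caption over an image with a positive placeholder score.
features = sentiment_features("you people ruin everything", image_sentiment=0.7)
print(features)  # appended to the multi-modal embedding before classification
```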

The paper's implications are twofold. Practically, the methodologies developed enhance the robustness of models in identifying hate speech in memes, a pressing real-world problem on global social media platforms. Theoretically, these approaches broaden the scope of multimodal learning by illustrating effective strategies for the fusion of differing modalities through enhanced contextual understanding and sentiment correlation.

Looking forward, the authors suggest pathways for further research, including the improvement of fusion techniques and the exploration of large-scale pre-trained models like UNITER. Such advancements might offer even more substantial gains in multimodal contextual comprehension, potentially addressing limitations observed due to redundant feature inclusion or conflicts arising from differing representation alignments.

Overall, this research presents substantial progress toward solving the multifaceted problem of hate speech detection in memes, a complex task that demands innovative approaches combining various modalities and advanced machine learning constructs.
