Evolver: Chain-of-Evolution Prompting to Boost Large Multimodal Models for Hateful Meme Detection (2407.21004v3)
Abstract: Recent advances show that two-stream approaches achieve outstanding performance in hateful meme detection. However, hateful memes constantly evolve: new memes emerge by fusing progressive cultural ideas, rendering existing methods obsolete or ineffective. In this work, we explore the potential of Large Multimodal Models (LMMs) for hateful meme detection. To this end, we propose Evolver, which incorporates LMMs via Chain-of-Evolution (CoE) prompting, integrating the evolution attributes and in-context information of memes. Specifically, Evolver simulates the evolution and expression of memes and reasons through LMMs step by step. First, an evolutionary pair mining module retrieves the top-k memes most similar to the input meme from an external curated meme set. Second, an evolutionary information extractor summarizes the semantic regularities between the paired memes for prompting. Finally, a contextual relevance amplifier enhances the in-context hatefulness information to guide the search for evolutionary cues. Extensive experiments on the public FHM, MAMI, and HarM datasets show that CoE prompting can be incorporated into existing LMMs to improve their performance. More encouragingly, it can serve as an interpretive tool for understanding the evolution of social memes.
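The three-step pipeline in the abstract can be sketched in code. This is a minimal, illustrative mock-up, not the paper's implementation: the toy embeddings, tag-overlap "regularity" summary, function names, and prompt wording are all assumptions standing in for real meme encoders and an actual LMM call.

```python
# Hedged sketch of Chain-of-Evolution (CoE) prompting. All names, the toy
# embeddings, and the prompt template are illustrative assumptions.
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def mine_evolutionary_pairs(query_emb, meme_set, k=2):
    """Step 1: retrieve the top-k memes most similar to the input meme
    from the external curated meme set."""
    ranked = sorted(meme_set, key=lambda m: cosine(query_emb, m["emb"]),
                    reverse=True)
    return ranked[:k]

def extract_evolution_info(pairs):
    """Step 2: summarize semantic regularities shared by the retrieved
    memes (here crudely approximated as the overlap of descriptive tags)."""
    common = set(pairs[0]["tags"])
    for m in pairs[1:]:
        common &= set(m["tags"])
    return sorted(common)

def build_coe_prompt(meme_text, regularities, context_hint):
    """Step 3: amplify in-context hatefulness cues and assemble the
    step-by-step reasoning prompt that would be fed to an LMM."""
    return (
        f"Meme text: {meme_text}\n"
        f"Regularities shared with similar memes: {', '.join(regularities)}\n"
        f"Context cue: {context_hint}\n"
        "Reason step by step about how this meme may have evolved, "
        "then answer: is it hateful? (yes/no)"
    )

# Toy curated meme set with fabricated embeddings and tags.
meme_set = [
    {"text": "meme A", "emb": [1.0, 0.1, 0.0], "tags": ["mock", "group-x"]},
    {"text": "meme B", "emb": [0.9, 0.2, 0.1], "tags": ["mock", "group-x", "slur"]},
    {"text": "meme C", "emb": [0.0, 1.0, 0.9], "tags": ["wholesome"]},
]

query = {"text": "new meme", "emb": [1.0, 0.0, 0.05]}
pairs = mine_evolutionary_pairs(query["emb"], meme_set, k=2)
regularities = extract_evolution_info(pairs)
prompt = build_coe_prompt(query["text"], regularities, "targets group-x")
print(prompt)
```

In the full method the retrieval step would use learned multimodal embeddings (e.g. from a vision-language encoder) and the final prompt would be sent to the LMM; the sketch only shows how the three modules compose.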