Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

Published 9 Dec 2023 in cs.CL | (2312.05434v1)

Abstract: The age of social media is rife with memes. Understanding and detecting harmful memes pose a significant challenge due to their implicit meaning, which is not explicitly conveyed through the surface text and image. However, existing harmful meme detection approaches only recognize superficial harm-indicative signals in an end-to-end classification manner but ignore in-depth cognition of the meme text and image. In this paper, we attempt to detect harmful memes based on advanced reasoning over the interplay of multimodal information in memes. Inspired by the success of LLMs on complex reasoning, we first conduct abductive reasoning with LLMs. Then we propose a novel generative framework to learn reasonable thoughts from LLMs for better multimodal fusion and lightweight fine-tuning, which consists of two training stages: 1) distill multimodal reasoning knowledge from LLMs; and 2) fine-tune the generative framework to infer harmfulness. Extensive experiments conducted on three meme datasets demonstrate that our proposed approach outperforms state-of-the-art methods on the harmful meme detection task.


Summary

  • The paper introduces Mr.Harm, which uses multimodal reasoning distilled from LLMs to detect harmful memes with enhanced accuracy.
  • It employs a two-stage training approach combining abductive reasoning and reasoning distillation to extract deep semantic cues from both text and images.
  • Empirical results show significant macro-F1 score improvements on datasets like Harm-C, Harm-P, and FHM compared to traditional detection methods.

Unveiling Harmful Memes with Multimodal Reasoning

The paper "Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from LLMs" presents an innovative approach for detecting harmful memes. It leverages multimodal reasoning distilled from LLMs to capture the implicit meaning of memes, aiming to improve detection performance over traditional methods that rely heavily on superficial signals in images and text.

Introduction

The proliferation of social media has magnified the role of memes as powerful vehicles for communication, while also making them effective carriers of harm through the subtle interplay of imagery and text. Existing methods predominantly rely on end-to-end classification and fail to delve into the semantic nuances required to distinguish harmful from harmless content. To address this, the authors propose Mr.Harm, a method that integrates advanced reasoning capabilities from LLMs. It uses a two-stage framework for multimodal reasoning: extracting reasoning knowledge from LLMs, then fine-tuning a smaller language model for practical deployment.

Methodology

Multimodal Reasoning Framework

The method begins by prompting LLMs to perform abductive reasoning: given a meme's surface text and a caption describing its image, the LLM generates rationales that elucidate why the meme is or is not harmful. These rationales capture complex contextual and cultural information that is otherwise inaccessible to simpler models trained for classification alone (Figure 1).

Figure 1: The overall pipeline of our method. We first conduct abductive reasoning with LLMs to extract harmfulness rationales using meme text and image captions.
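As a rough illustration, the abductive-reasoning step amounts to assembling a prompt from the meme's surface signals and the known label, then asking the LLM to explain the judgment. The template wording, function name, and example meme below are hypothetical stand-ins, not the paper's actual prompt:

```python
def build_abductive_prompt(meme_text: str, image_caption: str, is_harmful: bool) -> str:
    """Assemble an abductive-reasoning prompt from a meme's text and image caption.

    The wording is a hypothetical simplification of the paper's template.
    """
    verdict = "harmful" if is_harmful else "harmless"
    return (
        "A meme consists of overlaid text and an image.\n"
        f"Meme text: {meme_text}\n"
        f"Image caption: {image_caption}\n"
        f"This meme is {verdict}. Explain step by step, using the interplay "
        "of the text and the image, why that judgment holds.\n"
        "Rationale:"
    )

prompt = build_abductive_prompt(
    meme_text="Stay home, save lives",
    image_caption="a crowded beach during summer",
    is_harmful=False,
)
print(prompt.splitlines()[1])  # prints "Meme text: Stay home, save lives"
```

Because the ground-truth label is given in the prompt, the LLM reasons backward from the outcome to an explanation, which is what makes the generated rationales usable as distillation targets.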

Generative Framework

The proposed generative model is split into two distinct training stages:

  1. Reasoning Distillation - Fine-tunes a smaller language model to absorb reasoning paths from the LLM teacher, grounding it in rich multimodal representations for robust detection.
  2. Harmfulness Inference - Utilizes distilled multimodal knowledge to generate final harmfulness predictions for given meme content.
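In sketch form, the two stages differ in the target sequence the student model learns to generate: LLM rationales in stage 1, harmfulness labels in stage 2. The function and stage names below are illustrative simplifications, not the paper's implementation:

```python
def make_target(stage: str, rationale: str, is_harmful: bool) -> str:
    """Pick the training target for the student model at each stage.

    Stage names and target formats are hypothetical simplifications.
    """
    if stage == "distill":
        # Stage 1: reasoning distillation -- reproduce the teacher LLM's rationale
        return rationale
    if stage == "infer":
        # Stage 2: harmfulness inference -- generate the final label token
        return "harmful" if is_harmful else "harmless"
    raise ValueError(f"unknown stage: {stage}")
```

In both stages the input is the same fused representation of meme text and image; only the decoding objective changes, which is what lets the lightweight student inherit the LLM's reasoning before being specialized for the detection decision.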

Experimental Results

Empirical evaluation is conducted on three public meme datasets (Harm-C, Harm-P, and FHM), with Mr.Harm demonstrating considerable performance gains. Notably, it achieves substantial improvements in macro-F1 scores, especially on datasets where traditional models struggle with the nuanced nature of memes (Figure 2).

Figure 2: Examples of correctly predicted harmful memes in (a) Harm-C, (b) Harm-P, and (c) FHM dataset.
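Macro-F1, the headline metric here, averages the per-class F1 scores so that the rarer class (typically "harmful") counts equally with the majority class. A minimal self-contained computation for the binary case:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1 for binary labels (0 = harmless, 1 = harmful)."""
    scores = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        # Harmonic mean of precision and recall, guarded against division by zero
        scores.append(2 * precision * recall / (precision + recall)
                      if precision + recall else 0.0)
    return sum(scores) / len(scores)
```

A classifier that simply predicts "harmless" for every meme can score a high accuracy on imbalanced data but a poor macro-F1, which is why the paper reports the latter.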

Ablation Studies

A detailed ablation study underscores the significance of each component, emphasizing the critical role of the multimodal reasoning distilled from LLMs. Removing elements such as reasoning distillation or fine-tuning leads to substantial performance drops, confirming that each stage contributes materially to the final result.

Error Analysis

The paper acknowledges areas for improvement, specifically the misrecognition of images whose interpretation requires extensive background knowledge. This suggests opportunities for enhancement via the integration of broader knowledge sources and more sophisticated visual representations (Figure 3).

Figure 3: Examples of wrongly predicted memes by our proposed framework with the ground truth (a) harmful and (b) harmless.

Discussion

The paper opens avenues for future exploration into explainability and generalization of harmful meme detection frameworks. It suggests incorporating visual LLMs to enrich visual features and improve the distillation of multimodal reasoning.

Conclusion

The research offers a significant advancement in harmful meme detection by moving beyond surface-level interpretation, employing LLMs for a comprehensive understanding of meme semantics. This approach not only improves detection accuracy but also provides a foundation for more robust AI systems capable of nuanced multimodal reasoning (Figure 4).

Figure 4: The details of our Multimodal Fusion module.

The implications of this work extend to broader AI applications, particularly monitoring and counteracting disinformation on digital platforms. Future efforts will likely focus on enhancing explainability and refining multimodal synergies to fully leverage LLM capabilities.