
MATK: The Meme Analytical Tool Kit (2312.06094v1)

Published 11 Dec 2023 in cs.CL, cs.CV, and cs.MM

Abstract: The rise of social media platforms has brought about a new digital culture built around memes. Memes, which combine visuals and text, can strongly influence public opinion on social and cultural issues. As a result, interest in categorizing memes has grown, leading to the development of various datasets and multimodal models that show promising results in this field. However, there is currently no single library that allows for the reproduction, evaluation, and comparison of these models using fair benchmarks and settings. To fill this gap, we introduce the Meme Analytical Tool Kit (MATK), an open-source toolkit specifically designed to support existing meme datasets and cutting-edge multimodal models. MATK aims to assist researchers and engineers in training and reproducing these multimodal models for meme classification tasks, while also providing analysis techniques to gain insights into their strengths and weaknesses. MATK is available at https://github.com/Social-AI-Studio/MATK.
