MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation (2403.14652v1)
Abstract: Online memes have emerged as powerful digital cultural artifacts in the age of social media, offering not only humor but also platforms for political discourse, social critique, and information dissemination. Their extensive reach and influence in shaping online communities' sentiments make them invaluable tools for campaigning and promoting ideologies. Despite the development of several meme-generation tools, there remains a gap in their systematic evaluation and in their ability to communicate ideologies effectively. To address this gap, we introduce MemeCraft, a meme generator that leverages large language models (LLMs) and visual language models (VLMs) to produce memes advocating specific social movements. MemeCraft presents an end-to-end pipeline that transforms user prompts into compelling multimodal memes without manual intervention. Because such a tool could be misused to create divisive content, MemeCraft embeds an intrinsic safety mechanism to curb hateful meme production.