
MM-Claims: A Dataset for Multimodal Claim Detection in Social Media (2205.01989v1)

Published 4 May 2022 in cs.CL, cs.AI, cs.CV, cs.MM, and cs.SI

Abstract: In recent years, the problem of misinformation on the web has become widespread across languages, countries, and social media platforms. Although there has been much work on automated fake news detection, the role of images and their variety is not well explored. In this paper, we investigate the roles of image and text at an earlier stage of the fake news detection pipeline, called claim detection. For this purpose, we introduce a novel dataset, MM-Claims, which consists of tweets and corresponding images on three topics: COVID-19, Climate Change, and, broadly, Technology. The dataset contains roughly 86,000 tweets, of which 3,400 are labeled manually by multiple annotators for the training and evaluation of multimodal models. We describe the dataset in detail, evaluate strong unimodal and multimodal baselines, and analyze the potential and drawbacks of current models.

Authors (6)
  1. Gullal S. Cheema (8 papers)
  2. Sherzod Hakimov (37 papers)
  3. Abdul Sittar (11 papers)
  4. Eric Müller-Budack (19 papers)
  5. Christian Otto (12 papers)
  6. Ralph Ewerth (61 papers)
Citations (7)
