
MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System (2307.07135v1)

Published 14 Jul 2023 in cs.CL

Abstract: Multi-modal sarcasm detection has attracted much recent attention. Nevertheless, the existing benchmark (MMSD) has shortcomings that hinder the development of reliable multi-modal sarcasm detection systems: (1) MMSD contains spurious cues, which bias model learning; (2) the negative samples in MMSD are not always reasonable. To address these issues, we introduce MMSD2.0, a corrected dataset that fixes the shortcomings of MMSD by removing the spurious cues and re-annotating the unreasonable samples. We also present a novel framework called multi-view CLIP that leverages multi-grained cues from multiple perspectives (i.e., the text, image, and text-image interaction views) for multi-modal sarcasm detection. Extensive experiments show that MMSD2.0 is a valuable benchmark for building reliable multi-modal sarcasm detection systems and that multi-view CLIP significantly outperforms the previous best baselines.

Authors (8)
  1. Libo Qin (77 papers)
  2. Shijue Huang (14 papers)
  3. Qiguang Chen (44 papers)
  4. Chenran Cai (1 paper)
  5. Yudi Zhang (19 papers)
  6. Bin Liang (115 papers)
  7. Wanxiang Che (152 papers)
  8. Ruifeng Xu (66 papers)
Citations (21)
