Generalizable Sarcasm Detection Is Just Around The Corner, Of Course! (2404.06357v2)
Abstract: We tested the robustness of sarcasm detection models by examining their behavior when fine-tuned on four sarcasm datasets that differ in three characteristics: label source (authors vs. third-party), domain (social media/online vs. offline conversations/dialogues), and style (aggressive vs. humorous mocking). We tested their prediction performance on the same dataset (intra-dataset) and across different datasets (cross-dataset). For intra-dataset predictions, models consistently performed better when fine-tuned with third-party labels rather than with author labels. For cross-dataset predictions, most models failed to generalize well to the other datasets, implying that one type of dataset cannot represent all sorts of sarcasm with different styles and domains. Compared to the existing datasets, models fine-tuned on the new dataset we release in this work showed the highest generalizability to other datasets. With a manual inspection of the datasets and post-hoc analysis, we attributed the difficulty in generalization to the fact that sarcasm actually comes in different domains and styles. We argue that future sarcasm research should take the broad scope of sarcasm into account.
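The intra-dataset versus cross-dataset protocol described in the abstract can be sketched as a train-on-one, test-on-all loop. The code below is a minimal illustration only and is not the authors' pipeline: the keyword classifier is a toy stand-in for fine-tuning a language model, the dataset names `A` and `B` and their sentences are made up, and the choice of F1 as the metric is an assumption.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, int]      # (text, label); 1 = sarcastic, 0 = not sarcastic
Model = Callable[[str], int]   # maps a text to a predicted label


def train_keyword_model(train: List[Example]) -> Model:
    """Toy stand-in for fine-tuning: keep words seen only in sarcastic texts."""
    sarcastic_words, literal_words = set(), set()
    for text, label in train:
        words = set(text.lower().split())
        if label == 1:
            sarcastic_words |= words
        else:
            literal_words |= words
    cues = sarcastic_words - literal_words

    def predict(text: str) -> int:
        return int(any(word in cues for word in text.lower().split()))

    return predict


def f1_score(gold: List[int], pred: List[int]) -> float:
    """F1 for the sarcastic class; returns 0.0 when there are no true positives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_dataset_matrix(
    datasets: Dict[str, Dict[str, List[Example]]],
    train_fn: Callable[[List[Example]], Model],
) -> Dict[Tuple[str, str], float]:
    """Train on each dataset, then score on every dataset's test split.

    Keys are (train_name, test_name); equal names are the intra-dataset cells.
    """
    scores = {}
    for train_name, train_splits in datasets.items():
        model = train_fn(train_splits["train"])
        for test_name, test_splits in datasets.items():
            gold = [label for _, label in test_splits["test"]]
            pred = [model(text) for text, _ in test_splits["test"]]
            scores[(train_name, test_name)] = f1_score(gold, pred)
    return scores


# Hypothetical toy data: two "datasets" with different sarcasm cues.
toy_datasets = {
    "A": {
        "train": [("oh great another meeting", 1), ("the meeting is at noon", 0)],
        "test": [("oh great more rain", 1), ("rain is expected at noon", 0)],
    },
    "B": {
        "train": [("yeah right nice job genius", 1), ("nice job on the report", 0)],
        "test": [("yeah right like that works", 1), ("that works for the report", 0)],
    },
}

scores = cross_dataset_matrix(toy_datasets, train_keyword_model)
for (train_name, test_name), score in sorted(scores.items()):
    print(f"train={train_name} test={test_name} F1={score:.2f}")
```

In this toy setup the diagonal cells (same dataset for training and testing) score well while the off-diagonal cells collapse, because each model only learned the cues of its own dataset. That mirrors the kind of generalization gap the abstract reports, though the real experiments use fine-tuned models on four datasets.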
Authors: Hyewon Jang, Diego Frassinelli