Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models (2403.00794v2)

Published 23 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: Humor is a fundamental facet of human cognition and interaction. Yet, despite recent advances in natural language processing, humor detection remains a challenging task that is complicated by the scarcity of datasets that pair humorous texts with similar non-humorous counterparts. In our work, we investigate whether LLMs can generate synthetic data for humor detection via editing texts. We benchmark LLMs on an existing human dataset and show that current LLMs display an impressive ability to 'unfun' jokes, as judged by humans and as measured on the downstream task of humor detection. We extend our approach to a code-mixed English-Hindi humor dataset, where we find that GPT-4's synthetic data is highly rated by bilingual annotators and provides challenging adversarial examples for humor classifiers.
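A minimal sketch of the "unfunning" idea described in the abstract: prompt an LLM to minimally edit a joke so it is no longer funny, producing paired (humorous, non-humorous) examples for a downstream humor classifier. The model name, prompt wording, and helper names below are illustrative assumptions, not the paper's exact setup; the snippet assumes the `openai` Python package and an API key in the environment.

```python
# Hypothetical sketch: generate "unfunned" counterparts of jokes with an LLM.
# Prompt wording and model choice are assumptions, not the paper's exact prompts.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

UNFUN_PROMPT = (
    "Edit the following joke as little as possible so that it is no longer "
    "funny, while keeping it grammatical and on the same topic. "
    "Return only the edited text.\n\nJoke: {joke}"
)

def unfun(joke: str, model: str = "gpt-4") -> str:
    """Ask the LLM for a minimally edited, non-humorous version of `joke`."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": UNFUN_PROMPT.format(joke=joke)}],
        temperature=0.7,
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    jokes = ["I told my wife she should embrace her mistakes. She hugged me."]
    # Each joke and its "unfunned" counterpart form a minimally different pair,
    # labelled 1 (humorous) and 0 (non-humorous) for training a humor classifier.
    pairs = [(joke, unfun(joke)) for joke in jokes]
    for funny, unfunny in pairs:
        print(f"funny:   {funny}\nunfunny: {unfunny}\n")
```

Because each synthetic negative differs from its source joke by only a small edit, such pairs can also serve as the challenging adversarial examples for humor classifiers that the abstract mentions.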
