MAFALDA: A Benchmark and Comprehensive Study of Fallacy Detection and Classification (2311.09761v2)

Published 16 Nov 2023 in cs.CL, cs.AI, and cs.LG

Abstract: We introduce MAFALDA, a benchmark for fallacy classification that merges and unites previous fallacy datasets. It comes with a taxonomy that aligns, refines, and unifies existing classifications of fallacies. We further provide a manual annotation of a part of the dataset together with manual explanations for each annotation. We propose a new annotation scheme tailored for subjective NLP tasks, and a new evaluation method designed to handle subjectivity. We then evaluate several LLMs in a zero-shot setting, as well as human performance, on MAFALDA to assess their capability to detect and classify fallacies.
