Detecting Statements in Text: A Domain-Agnostic Few-Shot Solution (2405.05705v1)

Published 9 May 2024 in cs.CL

Abstract: Many tasks related to Computational Social Science and Web Content Analysis involve classifying pieces of text based on the claims they contain. State-of-the-art approaches usually involve fine-tuning models on large annotated datasets, which are costly to produce. In light of this, we propose and release a qualitative and versatile few-shot learning methodology as a common paradigm for any claim-based textual classification task. This methodology involves defining the classes as arbitrarily sophisticated taxonomies of claims, and using Natural Language Inference models to obtain the textual entailment between these and a corpus of interest. The performance of these models is then boosted by annotating a minimal sample of data points, dynamically sampled using the well-established statistical heuristic of Probabilistic Bisection. We illustrate this methodology in the context of three tasks: climate change contrarianism detection, topic/stance classification, and depression-related symptom detection. This approach rivals traditional pre-train/fine-tune approaches while drastically reducing the need for data annotation.
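
The abstract names Probabilistic Bisection without describing the procedure. As a rough illustration only, and not the authors' implementation, the sketch below shows the generic form of the heuristic applied to one plausible use: locating a decision threshold on per-text entailment scores in [0, 1] while asking an annotator for as few labels as possible. The function names, the `oracle` callback, the assumed annotator reliability `p`, and the mapping from labels to "threshold is below/above this score" are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def posterior_median(grid: Sequence[float], posterior: Sequence[float]) -> float:
    """Return the grid value at which the cumulative posterior first reaches 0.5."""
    cumulative = 0.0
    for value, weight in zip(grid, posterior):
        cumulative += weight
        if cumulative >= 0.5:
            return value
    return grid[-1]


def probabilistic_bisection(
    scores: Sequence[float],
    oracle: Callable[[int], bool],
    n_queries: int = 20,
    p: float = 0.8,
    grid_size: int = 1001,
) -> float:
    """Estimate a score threshold from a handful of (possibly noisy) labels.

    scores : hypothetical entailment scores in [0, 1], one per text.
    oracle : hypothetical annotator; oracle(i) is True if text i contains the claim.
    p      : assumed probability that a single annotation is correct (must be > 0.5).
    """
    grid: List[float] = [i / (grid_size - 1) for i in range(grid_size)]
    posterior: List[float] = [1.0 / grid_size] * grid_size  # uniform prior
    remaining = list(range(len(scores)))  # texts not yet annotated

    for _ in range(min(n_queries, len(scores))):
        # Query the unlabelled text whose score is closest to the posterior median.
        median = posterior_median(grid, posterior)
        idx = min(remaining, key=lambda i: abs(scores[i] - median))
        remaining.remove(idx)
        is_positive = oracle(idx)
        query = scores[idx]

        # A positive label suggests the threshold lies at or below the queried
        # score; a negative label suggests it lies above. Reweight accordingly.
        for j, candidate in enumerate(grid):
            consistent = (candidate <= query) == is_positive
            posterior[j] *= p if consistent else (1.0 - p)

        total = sum(posterior)
        posterior = [w / total for w in posterior]

    return posterior_median(grid, posterior)
```

With a real Natural Language Inference model, `scores` would be the entailment probabilities between each text and a claim, and `oracle` would be replaced by a human annotation step. Querying at the posterior median is what keeps the number of annotations small: each label is expected to remove roughly as much uncertainty as possible about where the threshold lies.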
