Improving Classification Performance With Human Feedback: Label a few, we label the rest (2401.09555v1)

Published 17 Jan 2024 in cs.LG, cs.AI, and cs.CL

Abstract: In the realm of artificial intelligence, where the vast majority of data is unstructured, obtaining substantial amounts of labeled data to train supervised machine learning models poses a significant challenge. To address this, we delve into few-shot and active learning, where our goal is to improve AI models with human feedback on a few labeled examples. This paper focuses on understanding how a continuous feedback loop can refine models, thereby enhancing their accuracy, recall, and precision through incremental human input. By employing LLMs such as GPT-3.5, BERT, and SetFit, we aim to analyze the efficacy of using a limited number of labeled examples to substantially improve model accuracy. We benchmark this approach on the Financial Phrasebank, Banking, Craigslist, TREC, and Amazon Reviews datasets to show that with just a few labeled examples, we are able to surpass the accuracy of zero-shot LLMs and provide enhanced text classification performance. We demonstrate that rather than needing to manually label millions of rows of data, we need to label only a few, and the model can effectively predict the rest.
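The "label a few, we label the rest" loop the abstract describes can be sketched as an active learning cycle: train on a small labeled seed, query a human for the examples the model is least sure about, retrain, and predict the remainder. The sketch below is a minimal illustration using scikit-learn logistic regression with uncertainty sampling on synthetic data; the paper itself uses GPT-3.5, BERT, and SetFit on real benchmarks, so every model and dataset choice here is an assumption for demonstration only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for an unstructured-text dataset (assumption:
# the paper works on text; we use numeric features for a runnable demo).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
rng = np.random.default_rng(0)

# Start with only a handful of human-provided labels.
labeled = list(rng.choice(len(X), size=10, replace=False))
unlabeled = [i for i in range(len(X)) if i not in labeled]

model = LogisticRegression(max_iter=1000)
for _ in range(5):
    model.fit(X[labeled], y[labeled])
    # Uncertainty sampling: query the examples whose top-class
    # probability is lowest, i.e. where the model is least confident.
    top_prob = model.predict_proba(X[unlabeled]).max(axis=1)
    query = [unlabeled[i] for i in np.argsort(top_prob)[:10]]
    labeled.extend(query)  # a human would label these; y simulates that
    unlabeled = [i for i in unlabeled if i not in query]

# After a few feedback rounds, the model "labels the rest".
model.fit(X[labeled], y[labeled])
accuracy = model.score(X[unlabeled], y[unlabeled])
print(f"labels used: {len(labeled)}, accuracy on the rest: {accuracy:.2f}")
```

With 60 human labels out of 500 examples, the retrained model predicts the remaining pool; swapping in a SetFit or BERT classifier and real human annotation would follow the same loop structure.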

Authors (5)
  1. Natan Vidra
  2. Thomas Clifford
  3. Katherine Jijo
  4. Eden Chung
  5. Liang Zhang
Citations (1)