When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes (2404.12365v1)

Published 18 Apr 2024 in cs.CL, cs.AI, cs.IR, and cs.LG

Abstract: We present FastFit, a method and a Python package designed to provide fast and accurate few-shot classification, especially for scenarios with many semantically similar classes. FastFit utilizes a novel approach integrating batch contrastive learning and token-level similarity scoring. Compared to existing few-shot learning packages, such as SetFit, Transformers, or few-shot prompting of LLMs via API calls, FastFit significantly improves multiclass classification performance in both speed and accuracy across FewMany, our newly curated English benchmark, and multilingual datasets. FastFit demonstrates a 3-20x improvement in training speed, completing training in just a few seconds. The FastFit package is now available on GitHub and PyPI, presenting a user-friendly solution for NLP practitioners.

Summary

  • The paper presents FastFit, a few-shot text classification method that dramatically speeds up training (3–20x faster) while delivering superior accuracy.
  • It leverages batch contrastive training and token-level similarity metrics to effectively distinguish subtle differences among many classes.
  • FastFit offers easy integration with Hugging Face, is accessible via GitHub and PyPI, and demonstrates robust multilingual performance on diverse benchmarks.

Overview of FastFit: A Fast and Accurate Few-Shot Classification Method

Introduction

FastFit represents a notable step in addressing the challenge of few-shot classification, particularly when classes exhibit subtle semantic differences. By combining batch contrastive learning with token-level similarity scoring, the method outperforms its predecessors in both accuracy and training speed.

Key Contributions

FastFit introduces several notable advancements in few-shot text classification:

  • Batch Contrastive Training: Treats the other examples in a training batch as negatives, efficiently sharpening the separation between classes.
  • Token-Level Similarity Metrics: Scores texts against class representations token by token rather than with a single pooled vector, improving discrimination between closely related classes (a minimal sketch combining both ideas follows this list).
  • Rapid Training Capabilities: Trains 3-20x faster than comparable methods, so a model is ready in seconds.
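To make the first two ideas concrete, the sketch below pairs in-batch contrastive training with a ColBERT-style token-level (max-sim) score. It is an illustration only: the choice of encoder, the verbalized class texts, the temperature, and the averaging of token scores are assumptions made for readability, not FastFit's exact implementation.

```python
# Illustrative sketch (not the FastFit source): in-batch contrastive loss over a
# token-level max-sim score between input texts and verbalized class texts.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")   # backbone choice is an assumption
encoder = AutoModel.from_pretrained("roberta-base")

def token_embeddings(texts):
    """Return L2-normalized per-token embeddings and the attention mask."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state               # (B, T, H)
    return F.normalize(hidden, dim=-1), batch["attention_mask"]

def token_level_score(q_emb, q_mask, c_emb, c_mask):
    """Max-sim: each query token is matched to its best class token, then averaged."""
    sim = torch.einsum("qth,csh->qcts", q_emb, c_emb)         # (Bq, Bc, Tq, Tc)
    sim = sim.masked_fill(c_mask[None, :, None, :] == 0, -1e4)
    best = sim.max(dim=-1).values * q_mask[:, None, :]        # ignore padded query tokens
    return best.sum(dim=-1) / q_mask.sum(dim=-1, keepdim=True)  # (Bq, Bc)

def batch_contrastive_loss(texts, class_texts, temperature=0.1):
    """Each text is pulled toward its own class text; other in-batch classes act as negatives."""
    q_emb, q_mask = token_embeddings(texts)
    c_emb, c_mask = token_embeddings(class_texts)
    logits = token_level_score(q_emb, q_mask, c_emb, c_mask) / temperature
    targets = torch.arange(len(texts))                        # i-th text pairs with i-th class
    return F.cross_entropy(logits, targets)

# Toy step: two few-shot examples paired with their (verbalized) class names.
loss = batch_contrastive_loss(
    ["how do I reset my card PIN", "my transfer has not arrived"],
    ["change PIN", "missing or failed transfer"],
)
loss.backward()
```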

Technical Implementation

Implementation details highlight FastFit's accessibility and ease of integration:

  • Availability: The package is distributed via GitHub and PyPI, simplifying adoption in practical applications.
  • Integration with Hugging Face: FastFit builds on the Hugging Face Trainer, so existing Transformers workflows, backbones, and configuration options carry over to user- or task-specific needs (see the usage sketch below).
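As a usage illustration, the snippet below follows the pattern described above: load a dataset, subsample a few examples per class, and hand everything to a Trainer-style object. The class and argument names (`FastFitTrainer`, `sample_dataset`, the dataset id, and the column names) are taken from the package's public examples as best understood here and may differ between versions; the package README is the authoritative reference.

```python
# Hedged usage sketch; exact names may differ between fastfit versions.
from datasets import load_dataset
from fastfit import FastFitTrainer, sample_dataset

# A many-class intent dataset; any text/label dataset with matching columns works.
dataset = load_dataset("FastFit/banking_77")        # dataset id is an assumption
dataset["validation"] = dataset["test"]

# Keep only a few examples per class to simulate the few-shot setting.
dataset["train"] = sample_dataset(dataset["train"], label_column="label_text",
                                  num_samples_per_label=5)

trainer = FastFitTrainer(
    model_name_or_path="sentence-transformers/paraphrase-mpnet-base-v2",
    text_column_name="text",
    label_column_name="label_text",
    num_train_epochs=40,
    per_device_train_batch_size=32,
    dataset=dataset,
)
model = trainer.train()      # the paper reports training completing in seconds
metrics = trainer.evaluate()
model.save_pretrained("fastfit-banking77")
```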

Evaluation on FewMany Benchmark

FastFit has been rigorously evaluated on the FewMany benchmark:

  • Consists of diverse datasets featuring at least 50 classes each, covering domains from intent detection to product classification.
  • Demonstrated superior performance in both English and multilingual settings, markedly outperforming existing few-shot learning approaches.

Experimental Insights

Detailed experiments furnish extensive validation of FastFit's strengths:

  • Comparison with Baselines: FastFit shows heightened accuracy in multi-class few-shot scenarios compared to both standard classifier setups and other few-shot learning packages.
  • Efficiency in Training: Training times improved by 3-20x relative to the baselines, with models ready in seconds, a clear practical gain when iterating on or deploying classifiers.

Multilingual Capabilities

FastFit extends its applicability to multilingual datasets, an essential feature given the global use of AI solutions:

  • Evaluated on Amazon's multilingual MASSIVE dataset, FastFit delivered consistent performance across languages (a brief loading sketch follows this list).
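For reference, MASSIVE is available on the Hugging Face Hub, so a multilingual run mostly amounts to picking a locale and a multilingual backbone. The dataset id, column names, and backbone below are believed correct but should be verified against the Hub:

```python
from datasets import load_dataset

# Amazon's MASSIVE intent dataset, one locale at a time
# (column names "utt" and "intent" are assumptions to verify on the Hub).
french = load_dataset("AmazonScience/massive", "fr-FR")
print(french["train"][0]["utt"], french["train"][0]["intent"])

# A multilingual sentence encoder is a natural backbone for the non-English runs,
# e.g. "sentence-transformers/paraphrase-multilingual-mpnet-base-v2".
```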

Versatility and Future Application

The flexibility of FastFit, demonstrated across different model sizes and in multilingual contexts, suggests its utility extends beyond its current applications. It bodes well for future adaptations and optimizations that could broaden its reach across few-shot and other NLP tasks. Additionally, improvements in upstream processes such as tokenization and pre-training could yield further gains.

Overall, FastFit presents a robust solution to the challenges of few-shot classification in NLP, combining speed, accuracy, and practicality. As AI continues to evolve, such tools will be pivotal in enabling efficient model training and deployment across various languages and domains.