
Efficient Few-Shot Learning Without Prompts (2209.11055v1)

Published 22 Sep 2022 in cs.CL

Abstract: Recent few-shot methods, such as parameter-efficient fine-tuning (PEFT) and pattern exploiting training (PET), have achieved impressive results in label-scarce settings. However, they are difficult to employ since they are subject to high variability from manually crafted prompts, and typically require billion-parameter LLMs to achieve high accuracy. To address these shortcomings, we propose SetFit (Sentence Transformer Fine-tuning), an efficient and prompt-free framework for few-shot fine-tuning of Sentence Transformers (ST). SetFit works by first fine-tuning a pretrained ST on a small number of text pairs, in a contrastive Siamese manner. The resulting model is then used to generate rich text embeddings, which are used to train a classification head. This simple framework requires no prompts or verbalizers, and achieves high accuracy with orders of magnitude less parameters than existing techniques. Our experiments show that SetFit obtains comparable results with PEFT and PET techniques, while being an order of magnitude faster to train. We also show that SetFit can be applied in multilingual settings by simply switching the ST body. Our code is available at https://github.com/huggingface/setfit and our datasets at https://huggingface.co/setfit .

Citations (158)

Summary

  • The paper introduces SetFit, a method that bypasses manual prompt-engineering by fine-tuning sentence transformers with a contrastive approach.
  • The paper demonstrates that SetFit achieves competitive accuracy using as few as eight examples while significantly reducing model size and training time.
  • The paper evidences robust performance in multilingual settings, making NLP applications more accessible and cost-effective.

Efficient Few-Shot Learning Without Prompts: An Overview

Introduction

This paper introduces SetFit (Sentence Transformer Fine-tuning), an approach designed to make few-shot learning in NLP more efficient and practical. SetFit addresses the limitations of existing approaches such as PEFT and PET, which often depend on billion-parameter models and manually crafted prompts, leading to high variability and heavy resource demands. By removing the prompt dependency and substantially shrinking the computational footprint, SetFit becomes more accessible to researchers and practitioners.

Methodology

SetFit employs a two-step process built on Sentence Transformers (ST). First, a pretrained ST is fine-tuned with a contrastive Siamese objective on a small set of labeled text pairs. The fine-tuned model then encodes each input into a rich text embedding, on which a simple classification head is trained. The framework needs no prompts or verbalizers, which streamlines training and makes the method straightforward to apply in multilingual settings by swapping in a multilingual ST body.
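The first step hinges on turning a handful of labeled examples into many contrastive training pairs: same-label texts become positive pairs (target similarity 1) and different-label texts become negative pairs (target similarity 0). The sketch below illustrates that pair construction only; the function name and sampling scheme are illustrative assumptions, not the actual `setfit` library API.

```python
import random

def generate_contrastive_pairs(texts, labels, num_iterations=1, seed=0):
    """Sample positive (same-label) and negative (different-label) pairs,
    mirroring the kind of pair construction SetFit performs before Siamese
    fine-tuning. num_iterations controls how many passes over the data."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_iterations):
        for i, (text, label) in enumerate(zip(texts, labels)):
            # Candidates sharing the label (excluding the anchor itself).
            pos = [t for j, (t, l) in enumerate(zip(texts, labels))
                   if l == label and j != i]
            # Candidates with a different label.
            neg = [t for t, l in zip(texts, labels) if l != label]
            if pos:
                pairs.append((text, rng.choice(pos), 1.0))  # positive pair
            if neg:
                pairs.append((text, rng.choice(neg), 0.0))  # negative pair
    return pairs

# Toy few-shot sentiment data: 2 examples per class.
texts = ["great phone", "loved it", "terrible battery", "waste of money"]
labels = [1, 1, 0, 0]
pairs = generate_contrastive_pairs(texts, labels)
for a, b, sim in pairs:
    print(f"({a!r}, {b!r}) -> target similarity {sim}")
```

Note how four labeled examples already yield eight training pairs, and raising `num_iterations` multiplies that further; this quadratic-style amplification of the tiny labeled set is what lets the ST body adapt from so few examples before the classification head is fitted on the resulting embeddings.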

Experimental Evaluation

The paper rigorously evaluates SetFit across several standard NLP datasets, comparing its performance against prominent few-shot techniques, including standard PLM fine-tuning, ADAPET, PERFECT, and T-Few. SetFit consistently matches or surpasses these methods while excelling in computational efficiency.

For instance, with only eight labeled examples in the Customer Reviews sentiment dataset, SetFit not only achieves competitive accuracy comparable to full-set fine-tuning but also demonstrates more stable performance, addressing common few-shot instability issues. Moreover, SetFit exhibits robustness in multilingual settings, underscoring its applicability in diverse linguistic environments without necessitating extensive computational resources.

Numerical Results and Computational Efficiency

SetFit achieves notable performance benchmarks, comparable to state-of-the-art methods, while requiring significantly fewer model parameters and considerably less training time. For example, on the RAFT benchmark, SetFit outperforms several methods, including GPT-3 and PET, while maintaining far lower computational expenses. The efficiency gains are evident in production scenarios where inference and training costs are critical considerations.

Implications and Future Directions

SetFit's success suggests several implications for both practical and theoretical development in AI. Practically, the approach could democratize access to sophisticated NLP capabilities, enabling more researchers and industries to leverage these technologies without prohibitive resource demands. Theoretically, SetFit stimulates further exploration into optimizing transformer architectures for efficiency, especially in low-resource and multilingual contexts.

Future research may focus on expanding SetFit's applicability to other domains, such as cross-lingual transfer and domain adaptation, while also investigating the potential of integrating additional techniques for further reducing the required labeled samples.

Conclusion

The introduction of SetFit marks a substantial step toward more efficient, accessible few-shot learning methodologies. By removing dependencies on large models and manual prompt-engineering, SetFit provides a practical alternative that can seamlessly integrate into diverse NLP workflows. As AI systems continue to expand, approaches like SetFit pave the way for more inclusive and sustainable development practices.
