PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models (2204.01172v2)

Published 3 Apr 2022 in cs.CL

Abstract: Current methods for few-shot fine-tuning of pretrained masked language models (PLMs) require carefully engineered prompts and verbalizers for each new task to convert examples into a cloze-format that the PLM can score. In this work, we propose PERFECT, a simple and efficient method for few-shot fine-tuning of PLMs without relying on any such handcrafting, which is highly effective given as few as 32 data points. PERFECT makes two key design choices: First, we show that manually engineered task prompts can be replaced with task-specific adapters that enable sample-efficient fine-tuning and reduce memory and storage costs by roughly factors of 5 and 100, respectively. Second, instead of using handcrafted verbalizers, we learn new multi-token label embeddings during fine-tuning, which are not tied to the model vocabulary and which allow us to avoid complex auto-regressive decoding. These embeddings are not only learnable from limited data but also enable nearly 100x faster training and inference. Experiments on a wide range of few-shot NLP tasks demonstrate that PERFECT, while being simple and efficient, also outperforms existing state-of-the-art few-shot learning methods. Our code is publicly available at https://github.com/facebookresearch/perfect.git.

Summary of "PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models"

The paper "PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models" introduces Perfect, a method for few-shot fine-tuning of pretrained masked language models (PLMs). The approach removes the manually engineered prompts and verbalizers that existing few-shot methods depend on, with the aim of simplifying and speeding up fine-tuning.

Key Contributions

  1. Prompt-Free Learning: The paper replaces manually crafted prompts with task-specific adapters that enable sample-efficient fine-tuning, reducing memory and storage costs by factors of roughly 5 and 100, respectively (a minimal sketch follows this list).
  2. Verbalizer-Free Approach: Instead of handcrafted verbalizers, Perfect learns multi-token label embeddings during fine-tuning. These embeddings are not tied to the PLM's vocabulary, which removes the need for complex auto-regressive decoding; they are learnable from limited data and make training and inference close to 100 times faster than verbalizer-based methods (see the second sketch after this list).
  3. Empirical Evaluation: The effectiveness of Perfect is demonstrated through experiments on a wide array of few-shot NLP tasks. The results indicate that Perfect not only simplifies the learning process but also outperforms state-of-the-art few-shot methods.
  4. Efficiency Improvements: The Perfect method significantly enhances efficiency across several metrics. It reduces training and inference times, memory usage, and storage requirements, making it a pragmatic choice for real-world applications with constrained resources.
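
To make the first design choice concrete, below is a minimal PyTorch sketch of a bottleneck adapter of the kind that stands in for hand-crafted prompts. The module name, layer sizes, and activation are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Small trainable module inserted after a frozen transformer sub-layer."""

    def __init__(self, hidden_dim: int = 768, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # project down
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # project back up

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection: the frozen PLM's representation passes through
        # unchanged, plus a small learned correction.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```

Because only these few adapter parameters (together with the label embeddings sketched next) are updated while the pretrained backbone stays frozen, each new task stores only a small set of weights, which is where the reported memory and storage savings come from.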

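The second design choice can be sketched similarly. The snippet below is an assumed illustration, not the authors' code: each class owns one trainable vector per inserted mask position, and the class score is the similarity between those vectors and the PLM hidden states at the mask positions, computed in a single forward pass with no autoregressive decoding.

```python
import torch
import torch.nn as nn

class LabelEmbeddingHead(nn.Module):
    """Scores classes against hidden states at the mask positions."""

    def __init__(self, num_classes: int, num_masks: int, hidden_dim: int = 768):
        super().__init__()
        # One trainable embedding per (class, mask position); these vectors are
        # learned from scratch and are not tied to the PLM's vocabulary.
        self.label_emb = nn.Parameter(torch.randn(num_classes, num_masks, hidden_dim))

    def forward(self, mask_hidden: torch.Tensor) -> torch.Tensor:
        # mask_hidden: (batch, num_masks, hidden_dim) -- PLM hidden states at
        # the inserted mask tokens. Returns class scores of shape
        # (batch, num_classes), summed over mask positions in one pass.
        return torch.einsum("bmh,cmh->bc", mask_hidden, self.label_emb)
```

Training then reduces to minimizing a standard classification loss (for example, cross-entropy) over these scores using the handful of labeled examples.
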
Strong Numerical Results

Perfect improves average accuracy over baseline methods on both single-sentence and sentence-pair benchmarks: +1.1 points on single-sentence tasks and +4.6 points on sentence-pair tasks relative to the PET average. It also shows reduced variance across tasks, indicating greater robustness to the instability inherent in few-shot learning.

Practical and Theoretical Implications

Practically, the research offers a more resource-efficient, stable, and scalable approach to fine-tuning pretrained language models, and the reduction in computational resources aligns with the broader industry trend toward more sustainable AI practices. Theoretically, Perfect challenges the notion that manual prompt engineering is necessary for effective few-shot learning, paving the way for future research on automated and adaptive task-encoding strategies.

Speculation on Future AI Developments

Future work could integrate Perfect's components into larger and more complex AI systems, potentially extending its applicability beyond NLP. Incorporating adaptive task-specification methods such as task-specific adapters and learned label embeddings into the training pipeline could yield more broadly adaptable models that perform well in dynamic environments and on tasks unseen during training.

In conclusion, Perfect presents a compelling case for shifting paradigms in few-shot learning, offering a framework that emphasizes simplicity and efficiency without compromising on accuracy. As the landscape of AI continues to evolve, approaches like Perfect are likely to influence emerging trends toward increased automation and reduced dependency on manual engineering in machine learning pipelines.

Authors (7)
  1. Rabeeh Karimi Mahabadi (9 papers)
  2. Luke Zettlemoyer (225 papers)
  3. James Henderson (52 papers)
  4. Marzieh Saeidi (14 papers)
  5. Lambert Mathias (19 papers)
  6. Veselin Stoyanov (21 papers)
  7. Majid Yazdani (10 papers)
Citations (67)