Self-Alignment with Instruction Backtranslation
The paper "Self-Alignment with Instruction Backtranslation" proposes an innovative method for finetuning LLMs to improve their instruction-following capabilities. The core of the methodology is an iterative self-training algorithm termed instruction backtranslation, which leverages unlabelled data to create high-quality training datasets through a two-step process of self-augmentation and self-curation. This approach is inspired by backtranslation in machine translation.
Methodology
The approach hinges on two primary phases:
- Self-Augmentation: Initially, a seed model, which has been finetuned on a small set of human-annotated (instruction, output) pairs, generates candidate instructions for a large collection of unlabelled web documents. Specifically, the seed model predicts instructions that are answered by segments of the web corpus.
- Self-Curation: Subsequently, the same model is tasked with curating these generated (instruction, output) pairs by evaluating their quality. Only high-quality pairs are retained for further finetuning.
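Under illustrative assumptions (the callables `backward_model` and `score_fn`, the pair representation, and the 5-point threshold are stand-ins for clarity, not the paper's actual interfaces), one round of the two phases can be sketched as:

```python
from typing import Callable, Iterable

Pair = tuple[str, str]  # (instruction, output)

def self_augment(backward_model: Callable[[str], str],
                 documents: Iterable[str]) -> list[Pair]:
    """Phase 1: for each unlabelled document, predict an instruction
    that the document would answer."""
    return [(backward_model(doc), doc) for doc in documents]

def self_curate(score_fn: Callable[[str, str], int],
                pairs: Iterable[Pair],
                threshold: int = 5) -> list[Pair]:
    """Phase 2: have the model rate each candidate pair and keep
    only those meeting the quality threshold."""
    return [(inst, out) for inst, out in pairs
            if score_fn(inst, out) >= threshold]

def backtranslation_round(backward_model, score_fn, documents, threshold=5):
    """One augmentation-then-curation round over an unlabelled corpus."""
    candidates = self_augment(backward_model, documents)
    return self_curate(score_fn, candidates, threshold)
```

In the paper both roles are played by the same finetuned seed model; they appear here as separate callables only to make the two phases explicit.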
The process is iterated so that each round's improved model produces better augmented data for the next. The paper introduces Humpback, a model built through two iterations of instruction backtranslation with LLaMA as the base model. The resulting model outperforms other non-distilled models on established benchmarks such as the Alpaca leaderboard.
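The iteration itself can be pictured as a loop in which each finetuned model generates and curates the training data for the next round. This is a minimal sketch; `finetune`, `generate_pairs`, and `score_pair` are hypothetical stand-ins, not the paper's implementation:

```python
def self_training(finetune, generate_pairs, score_pair,
                  seed_data, unlabelled_docs, rounds=2, threshold=5):
    """Iterative instruction backtranslation: a seed model is trained on
    human-annotated pairs; each round's model augments and curates new
    (instruction, output) pairs, and the next model is finetuned on the
    seed data plus the curated pairs."""
    model = finetune(None, seed_data)  # M0: the seed model
    for _ in range(rounds):            # Humpback uses two rounds
        candidates = generate_pairs(model, unlabelled_docs)
        curated = [p for p in candidates if score_pair(model, p) >= threshold]
        model = finetune(model, seed_data + curated)
    return model
```

The key property is that no external teacher appears anywhere in the loop: the same model family does the generating, the curating, and the learning.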
Experimental Results
The resulting model, Humpback, is empirically validated against various baselines:
- Data Quality vs. Data Quantity: The experiments reveal that data quality, assured via self-curation, is pivotal. Training with higher quality data subsets consistently enhances performance, even when the quantity of data is constrained.
- Data Scaling Efficiency: The methodology efficiently scales with increasing data sizes while maintaining robust data quality, outperforming several models that employ knowledge distillation from more powerful models such as ChatGPT and GPT-4.
- Model Quality on AlpacaEval: Notably, the Humpback 33B and 65B models achieve state-of-the-art results among non-distilled models on the Alpaca leaderboard, surpassing competitors such as Guanaco and OASST.
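The quality-versus-quantity tradeoff in curation can be made concrete with a toy filter. The 5-point scale follows the paper's self-curation setup; the data and function name below are illustrative:

```python
def curate_by_threshold(scored_pairs, threshold):
    """Keep only pairs whose self-assigned quality score meets the
    threshold; raising it shrinks the dataset but raises its quality."""
    return [pair for pair, score in scored_pairs if score >= threshold]

# Toy data: candidate pairs with model-assigned scores on a 5-point scale.
scored = [
    (("Summarise the article.", "The article argues ..."), 5),
    (("What is this?", "lorem ipsum"), 2),
    (("Explain the proof.", "The proof proceeds by induction ..."), 4),
]
```

Keeping only the top-scored subset mirrors the experimental finding: a smaller, strictly curated dataset trains a better model than a larger, noisier one.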
Evaluation
The model's instruction-following capability and general quality are assessed through both automated and human evaluations:
- AlpacaEval: The model achieves high win rates compared to other models, including proprietary ones.
- Human Evaluation: Pairwise comparisons indicate a preference for Humpback over other high-quality models such as LIMA and Claude.
Moreover, commonsense reasoning and massive multitask language understanding (MMLU) benchmarks show notable improvements, particularly in zero-shot settings, suggesting that Humpback has enhanced generalization capabilities.
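Pairwise evaluations of this kind reduce to a win rate over judged comparisons. A minimal computation follows; counting ties as half a win is one common convention, assumed here rather than taken from the benchmark's exact rules:

```python
def win_rate(judgements):
    """Fraction of pairwise comparisons won by a model; ties count as
    half a win (a common convention, assumed here)."""
    wins = sum(1 for j in judgements if j == "win")
    ties = sum(1 for j in judgements if j == "tie")
    return (wins + 0.5 * ties) / len(judgements)
```

A model that wins two of four comparisons and ties one would score 0.625 under this convention.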
Implications
The implications of this research span both practical and theoretical domains. Practically, the method enables the creation of high-quality instruction-following models without reliance on extensive human annotations or distillation from more powerful models, markedly reducing resource requirements. Theoretically, it underscores the efficacy of self-alignment in LLMs, potentially setting a new paradigm in model training.
Conclusion
This paper presents a compelling case for instruction backtranslation as a scalable method for finetuning LLMs, demonstrating substantial improvements in instruction-following performance. Future developments could explore scaling the method further by harnessing larger unlabelled corpora and integrating more advanced curation strategies to meet diverse application requirements. This path may well drive the next leap in the proficiency of autonomous AI systems at understanding and executing complex instructions.