Preference-based LLM Distillation with Pseudo-Preference Pairs
Overview
The paper *PL: Preference-based LLM Distillation with Pseudo-Preference Pairs* presents a novel approach for distilling knowledge from large language models (LLMs) into smaller, more practical student models without access to the LLM's internal states and without requiring extensive computational resources. The proposed framework, Preference-based LLM Distillation (PL), fine-tunes the student model on pseudo-preference pairs, guiding it by the relative quality of generated outputs.
Key Contributions
- Novel Framework: PL introduces a novel framework for LLM distillation using preference data. The framework capitalizes on the differences in capacity between teacher and student models to create pseudo-preference pairs.
- Calibration Objective: An explicit calibration objective aligns the student's sequence likelihood with output quality, addressing the miscalibration issue common in LLMs.
- Annotation-free Preference Pairs: The method constructs pseudo-preference pairs without human annotations, leveraging the inherent capacity gap between teacher and student models.
- Extensive Experiments: Comprehensive experiments across multiple tasks with various LLMs demonstrate the effectiveness and versatility of the PL framework.
Methodology
The PL framework begins by training both the teacher and the student model with supervised fine-tuning (SFT). The key innovation lies in generating pseudo-preference pairs by comparing the teacher's and student's outputs on an unlabeled distillation set, where the teacher's outputs are assumed to be of higher quality owing to the model's larger capacity. The student is then fine-tuned to calibrate its sequence-likelihood estimates on these pseudo-preference pairs using a ranking loss that requires no access to the teacher's internal states.
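Because the calibration step operates on the student's sequence likelihoods, it helps to see how such a likelihood can be computed. The snippet below is a minimal sketch assuming a Hugging Face causal LM stands in for the student; the length-normalized scoring and the choice of checkpoint are illustrative assumptions, not the paper's exact formulation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative assumption: any small causal LM stands in for the student.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
student = AutoModelForCausalLM.from_pretrained("gpt2")
student.eval()

@torch.no_grad()
def sequence_log_likelihood(prompt: str, output: str, length_normalize: bool = True) -> float:
    """Log-likelihood of `output` given `prompt` under the student model."""
    # Simplification: assumes the prompt tokenization is a prefix of the full tokenization.
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + output, return_tensors="pt").input_ids
    logits = student(full_ids).logits  # (1, seq_len, vocab)

    # Score only the output tokens: token t is predicted from position t-1.
    out_start = prompt_ids.shape[1]
    targets = full_ids[:, out_start:]
    pred_logits = logits[:, out_start - 1 : -1, :]
    log_probs = torch.log_softmax(pred_logits, dim=-1)
    token_ll = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)  # (1, out_len)

    total = token_ll.sum().item()
    return total / targets.shape[1] if length_normalize else total
```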
Pseudo-Preference Pairs Generation
To generate pseudo-preference pairs, outputs are sampled from both the teacher and the student model for each input in the distillation set. Because the teacher's output is assumed to be inherently better, each pair labels the teacher's output as preferred and the student's as dispreferred. This avoids the costly requirement of human-annotated preference data.
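A minimal sketch of this pairing step is given below, assuming Hugging Face causal LMs for both models; the checkpoints, sampling parameters, and single-sample-per-prompt setup are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative assumption: two public checkpoints stand in for teacher and student.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2-large").eval()
student = AutoModelForCausalLM.from_pretrained("gpt2").eval()

@torch.no_grad()
def sample_continuation(model, prompt: str, max_new_tokens: int = 64) -> str:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(
        **inputs,
        do_sample=True,
        top_p=0.95,
        max_new_tokens=max_new_tokens,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the newly generated tokens.
    return tokenizer.decode(out[0, inputs.input_ids.shape[1]:], skip_special_tokens=True)

def build_pseudo_preference_pairs(prompts):
    """For each unlabeled prompt, mark the teacher's sample as preferred."""
    pairs = []
    for prompt in prompts:
        preferred = sample_continuation(teacher, prompt)     # assumed higher quality
        dispreferred = sample_continuation(student, prompt)  # assumed lower quality
        pairs.append({"prompt": prompt, "chosen": preferred, "rejected": dispreferred})
    return pairs
```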
Distillation with Preference Pairs
The distillation process uses two types of calibration loss functions (a sketch of both follows this list):
- Ranking Calibration Loss: Encourages the student model to increase the relative likelihood of the preferred teacher's output over its own.
- Margin Calibration Loss: Further refines this process by incorporating a scoring function that adds a margin based on the quality of generated sequences.
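The sketch below illustrates both losses operating on per-sequence log-likelihoods such as those computed earlier. The pairwise hinge form, the `beta` hyperparameter, and the use of external quality scores for the margin are written from the description above and standard sequence-likelihood calibration practice, so treat the exact formulation as an assumption rather than the paper's precise objective.

```python
import torch

def ranking_calibration_loss(ll_teacher: torch.Tensor,
                             ll_student: torch.Tensor,
                             beta: float = 1.0) -> torch.Tensor:
    """Hinge-style ranking loss: push the preferred (teacher) sequence to have
    higher likelihood under the student than the student's own output."""
    return torch.clamp(beta - (ll_teacher - ll_student), min=0.0).mean()

def margin_calibration_loss(ll_teacher: torch.Tensor,
                            ll_student: torch.Tensor,
                            quality_teacher: torch.Tensor,
                            quality_student: torch.Tensor,
                            beta: float = 1.0) -> torch.Tensor:
    """Same ranking loss, but the required margin grows with the quality gap
    between the two sequences (quality scores come from any external scorer)."""
    margin = beta * (quality_teacher - quality_student)
    return torch.clamp(margin - (ll_teacher - ll_student), min=0.0).mean()

# Toy usage with a batch of 4 pseudo-preference pairs.
ll_t = torch.tensor([-1.2, -0.8, -1.5, -0.9])  # student's log-likelihood of teacher outputs
ll_s = torch.tensor([-1.0, -1.1, -1.4, -0.7])  # student's log-likelihood of its own outputs
q_t = torch.tensor([0.9, 0.8, 0.7, 0.95])
q_s = torch.tensor([0.5, 0.6, 0.65, 0.4])
print(ranking_calibration_loss(ll_t, ll_s))
print(margin_calibration_loss(ll_t, ll_s, q_t, q_s))
```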
Experimental Results
The effectiveness of the PL framework was validated through extensive experiments:
- Datasets: The framework was evaluated on the Anthropic-HH dialogue generation and Reddit TL;DR summarization tasks.
- Models: LLaMA-2 and GPT-Neo models served as the main LLM families, with additional experiments involving PaLM-2 and T5 models to demonstrate broad applicability.
- Metrics: Win rate and ROUGE scores served as evaluation criteria (a small evaluation sketch follows this list). Student models distilled with PL consistently outperformed those trained with traditional KD methods on win rate, in some cases matching or surpassing their teacher models.
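The sketch below shows how these two metrics can be computed once pairwise judgments and references are available. How wins are actually judged (e.g., by a reward model or an LLM judge) is left abstract here, and the `rouge-score` package is an illustrative choice, not necessarily the paper's exact evaluation pipeline.

```python
from rouge_score import rouge_scorer  # pip install rouge-score

def win_rate(judgments):
    """Fraction of comparisons won; `judgments` holds 1 (win), 0.5 (tie), 0 (loss)
    for the candidate model against its baseline, as decided by an external judge."""
    return sum(judgments) / len(judgments)

def mean_rouge_l(predictions, references):
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    scores = [scorer.score(ref, pred)["rougeL"].fmeasure
              for pred, ref in zip(predictions, references)]
    return sum(scores) / len(scores)

# Toy usage.
print(win_rate([1, 0, 1, 0.5, 1]))                        # 0.7
print(mean_rouge_l(["the cat sat"], ["the cat sat down"]))
```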
Implications and Future Directions
The implications of this research are significant for deploying high-performance LLMs in resource-constrained environments. By addressing the student model's calibration and expressivity limitations through pseudo-preference pairs, the proposed method ensures effective distillation without requiring extensive computational overhead or human annotations.
Future research could explore:
- Iterative Distillation: Further improving the student model by iteratively refining pseudo-preference pairs based on intermediate student outputs.
- Enhanced Calibration Techniques: Developing more sophisticated calibration objectives and loss functions to refine the student's output quality.
- Broader Application Scenarios: Expanding the framework's applicability to a wider array of tasks and model architectures, ensuring robust performance across diverse NLP tasks.
Conclusion
The PL framework offers a resource-efficient and scalable solution for LLM distillation. By leveraging pseudo-preference pairs and an explicit calibration objective, it effectively bridges the gap between large teacher models and their smaller, more practical student counterparts. This approach provides a powerful tool for the deployment of high-performance language technologies in settings with limited computational resources.