
Lion: Adversarial Distillation of Proprietary Large Language Models

Published 22 May 2023 in cs.CL (arXiv:2305.12870v2)

Abstract: The practice of transferring knowledge from a sophisticated, proprietary LLM to a compact, open-source LLM has garnered considerable attention. Previous works have focused on a unidirectional knowledge distillation way by aligning the responses of the student model with those of the teacher model to a set of instructions. Nevertheless, they overlooked the possibility of incorporating any reciprocal "feedback"--identifying challenging instructions where the student model's performance falls short--to boost the student model's proficiency iteratively. To this end, we propose a novel adversarial distillation framework for a more efficient knowledge transfer. Leveraging the versatile role adaptability of LLMs, we prompt the teacher model to identify "hard" instructions and generate new "hard" instructions for the student model, creating a three-stage adversarial loop of imitation, discrimination, and generation. By applying this adversarial framework, we successfully transfer knowledge from ChatGPT to a student model (named Lion), using a mere 70k training data. Our results show that Lion-13B not only achieves comparable open-ended generation capabilities to ChatGPT but surpasses conventional state-of-the-art (SOTA) instruction-tuned models like Vicuna-13B by 55.4% in challenging zero-shot reasoning benchmarks such as BIG-Bench Hard (BBH) and 16.7% on AGIEval. Code and model can be found at https://github.com/YJiangcm/Lion.


Summary

  • The paper details an adversarial tri-stage framework where the student model Lion learns from ChatGPT by focusing on challenging instructions.
  • It employs iterative imitation, discrimination, and generation stages to provide targeted feedback and refine the student model’s responses.
  • Experimental results show that Lion-13B approaches ChatGPT's open-ended generation capabilities and outperforms Vicuna-13B by 55.4% on BIG-Bench Hard and 16.7% on AGIEval.

Essay on "Lion: Adversarial Distillation of Proprietary LLMs"

The paper details an innovative approach to transferring knowledge from proprietary LLMs to a more efficient, open-source student model through adversarial distillation. The primary objective is to improve the student's proficiency by leveraging feedback loops that identify and focus on 'hard' instructions—those instances where the student model's performance diverges significantly from that of the teacher model.

Adversarial Distillation Framework

The novelty of the work lies in its three-stage adversarial framework, which iteratively fine-tunes a student model named Lion by incorporating "feedback" from its teacher model, ChatGPT. The process is structured as a closed loop with three stages:

  1. Imitation Stage: The student model learns to align with the teacher's responses to a given set of instructions.
  2. Discrimination Stage: The teacher model identifies 'hard' instructions where the student struggles, delivering feedback on performance discrepancies.
  3. Generation Stage: Leveraging the adaptive capabilities of LLMs, the teacher generates new 'hard' instructional data, challenging the student further.
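
The three stages above can be sketched in toy form. Everything here is an illustrative stand-in, not the paper's implementation: `ToyTeacher`, `ToyStudent`, the exact-match scoring rule, and the "capacity limit" are all hypothetical, whereas the paper realizes each teacher role (referee, generator) by prompting ChatGPT and fine-tunes a LLaMA-based student.

```python
# Toy sketch of one round of the imitation / discrimination / generation loop.
# All names and heuristics here are illustrative; the paper implements each
# teacher role by prompting ChatGPT.

class ToyTeacher:
    """Stand-in for the teacher LLM (ChatGPT in the paper)."""
    def respond(self, instruction):
        return instruction.upper()                  # "reference" answer
    def score(self, reference, answer):
        return 1.0 if reference == answer else 0.0  # referee: rate the student
    def generate_similar(self, instruction):
        return instruction + "!"                    # generator: new hard sample

class ToyStudent:
    """Stand-in for the student LLM (Lion). It can only 'learn' short
    instructions, so longer ones stay hard and feed the next round."""
    def __init__(self):
        self.memory = {}
    def finetune(self, pairs):
        for instruction, reference in pairs:
            if len(instruction) <= 4:               # toy capacity limit
                self.memory[instruction] = reference
    def respond(self, instruction):
        return self.memory.get(instruction, instruction)

def adversarial_distillation_round(instructions, student, teacher, threshold=0.5):
    # 1. Imitation: align the student with the teacher's responses.
    pairs = [(ins, teacher.respond(ins)) for ins in instructions]
    student.finetune(pairs)
    # 2. Discrimination: the teacher (as referee) flags "hard" instructions
    #    where the student's answer diverges from the reference.
    hard = [ins for ins, ref in pairs
            if teacher.score(ref, student.respond(ins)) < threshold]
    # 3. Generation: the teacher writes new instructions modeled on the hard
    #    ones, enlarging the training pool for the next iteration.
    return instructions + [teacher.generate_similar(ins) for ins in hard]

print(adversarial_distillation_round(["add", "explain this"],
                                     ToyStudent(), ToyTeacher()))
# -> ['add', 'explain this', 'explain this!']
```

The key structural point the sketch preserves is that the pool of instructions grows each round with material the student got wrong, so later rounds concentrate on the student's weak spots.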

This iterative cycle differs from traditional unidirectional knowledge distillation, which provides no corrective feedback and therefore cannot tailor training to the student model's weak points.

Experimental Results

The results attest to the effectiveness of the Lion model. It comes remarkably close to replicating the open-ended generation capabilities of ChatGPT while outperforming state-of-the-art instruction-tuned models such as Vicuna-13B. On challenging zero-shot reasoning benchmarks, Lion-13B shows relative improvements over Vicuna-13B of 55.4% on BIG-Bench Hard (BBH) and 16.7% on AGIEval. These outcomes highlight the value of concentrating training on difficult instances to drive significant performance gains.
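
To be explicit about how such percentages are read, they are relative rather than absolute gains over the baseline. The scores below are hypothetical placeholders chosen only to make the arithmetic concrete, not numbers from the paper.

```python
def relative_improvement(new_score, baseline_score):
    """Relative gain of new_score over baseline_score, as a percentage."""
    return 100.0 * (new_score - baseline_score) / baseline_score

# Hypothetical placeholder scores (NOT from the paper): a baseline of 20.0
# and a new score of 31.08 correspond to a 55.4% relative improvement.
print(round(relative_improvement(31.08, 20.0), 1))  # -> 55.4
```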

Implications and Future Directions

Practically, this research demonstrates a scalable mechanism for distilling a proprietary LLM's vast knowledge into a compact, open-source counterpart using fewer resources and less training data. This advancement supports the creation of transparent, accessible AI systems without the overhead of high-cost computational iterations.

Theoretically, the integration of adversarial workflows could inspire similar methodologies across various machine-learning applications, extending beyond LLMs to areas like image and speech processing. Future advancements may involve enhancing role adaptability within proprietary models to exploit more complex task generation strategies and potentially apply reinforcement learning from human feedback (RLHF) to guide the refinement of student models further.

In conclusion, this paper delivers substantial insights into the closed-loop training dynamics in LLMs and provides a robust foundation for future research in efficient model distillation. The success of Lion in closely emulating sophisticated models like ChatGPT underscores the promise of adversarial distillation as a transformative approach in the AI community.
