
Lion: Adversarial Distillation of Proprietary Large Language Models

Published 22 May 2023 in cs.CL (arXiv:2305.12870v2)

Abstract: The practice of transferring knowledge from a sophisticated, proprietary LLM to a compact, open-source LLM has garnered considerable attention. Previous works have focused on a unidirectional knowledge distillation way by aligning the responses of the student model with those of the teacher model to a set of instructions. Nevertheless, they overlooked the possibility of incorporating any reciprocal "feedback"--identifying challenging instructions where the student model's performance falls short--to boost the student model's proficiency iteratively. To this end, we propose a novel adversarial distillation framework for a more efficient knowledge transfer. Leveraging the versatile role adaptability of LLMs, we prompt the teacher model to identify "hard" instructions and generate new "hard" instructions for the student model, creating a three-stage adversarial loop of imitation, discrimination, and generation. By applying this adversarial framework, we successfully transfer knowledge from ChatGPT to a student model (named Lion), using a mere 70k training data. Our results show that Lion-13B not only achieves comparable open-ended generation capabilities to ChatGPT but surpasses conventional state-of-the-art (SOTA) instruction-tuned models like Vicuna-13B by 55.4% in challenging zero-shot reasoning benchmarks such as BIG-Bench Hard (BBH) and 16.7% on AGIEval. Code and model can be found at https://github.com/YJiangcm/Lion.


Summary

  • The paper details an adversarial tri-stage framework where the student model Lion learns from ChatGPT by focusing on challenging instructions.
  • It employs iterative imitation, discrimination, and generation stages to provide targeted feedback and refine the student model’s responses.
  • Experimental results show that Lion-13B approaches ChatGPT's open-ended generation capabilities and outperforms Vicuna-13B by 55.4% on BIG-Bench Hard and 16.7% on AGIEval.

Essay on "Lion: Adversarial Distillation of Proprietary LLMs"

The paper details an innovative approach to transferring knowledge from proprietary LLMs to a more efficient, open-source student model through adversarial distillation. The primary objective is to improve the student's proficiency by leveraging feedback loops that identify and focus on 'hard' instructions—those instances where the student model's performance diverges significantly from that of the teacher model.

Adversarial Distillation Framework

The novelty of the work lies in its three-stage adversarial framework, which iteratively fine-tunes a student model named Lion by incorporating "feedback" from its teacher model, ChatGPT. The process is structured as a closed loop with three stages:

  1. Imitation Stage: The student model learns to align with the teacher's responses to a given set of instructions.
  2. Discrimination Stage: The teacher model identifies 'hard' instructions where the student struggles, delivering feedback on performance discrepancies.
  3. Generation Stage: Leveraging the adaptive capabilities of LLMs, the teacher generates new 'hard' instructional data, challenging the student further.
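
The three stages above can be sketched in toy form. Everything here is an illustrative stand-in, not the paper's implementation: `ToyTeacher`, `ToyStudent`, the exact-match scoring rule, and the "capacity limit" are all hypothetical, whereas the paper realizes each teacher role (referee, generator) by prompting ChatGPT and fine-tunes a LLaMA-based student.

```python
# Toy sketch of one round of the imitation / discrimination / generation loop.
# All names and heuristics here are illustrative; the paper implements each
# teacher role by prompting ChatGPT.

class ToyTeacher:
    """Stand-in for the teacher LLM (ChatGPT in the paper)."""
    def respond(self, instruction):
        return instruction.upper()                  # "reference" answer
    def score(self, reference, answer):
        return 1.0 if reference == answer else 0.0  # referee: rate the student
    def generate_similar(self, instruction):
        return instruction + "!"                    # generator: new hard sample

class ToyStudent:
    """Stand-in for the student LLM (Lion). It can only 'learn' short
    instructions, so longer ones stay hard and feed the next round."""
    def __init__(self):
        self.memory = {}
    def finetune(self, pairs):
        for instruction, reference in pairs:
            if len(instruction) <= 4:               # toy capacity limit
                self.memory[instruction] = reference
    def respond(self, instruction):
        return self.memory.get(instruction, instruction)

def adversarial_distillation_round(instructions, student, teacher, threshold=0.5):
    # 1. Imitation: align the student with the teacher's responses.
    pairs = [(ins, teacher.respond(ins)) for ins in instructions]
    student.finetune(pairs)
    # 2. Discrimination: the teacher (as referee) flags "hard" instructions
    #    where the student's answer diverges from the reference.
    hard = [ins for ins, ref in pairs
            if teacher.score(ref, student.respond(ins)) < threshold]
    # 3. Generation: the teacher writes new instructions modeled on the hard
    #    ones, enlarging the training pool for the next iteration.
    return instructions + [teacher.generate_similar(ins) for ins in hard]

print(adversarial_distillation_round(["add", "explain this"],
                                     ToyStudent(), ToyTeacher()))
# -> ['add', 'explain this', 'explain this!']
```

The key structural point the sketch preserves is that the pool of instructions grows each round with material the student got wrong, so later rounds concentrate on the student's weak spots.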

This iterative cycle differs from traditional unidirectional knowledge distillation, which provides no corrective feedback and therefore cannot tailor training to the student model's weak points.

Experimental Results

The results attest to the effectiveness of the Lion model. It comes remarkably close to replicating the open-ended generation capabilities of ChatGPT while outperforming state-of-the-art instruction-tuned models such as Vicuna-13B. On challenging zero-shot reasoning benchmarks, Lion-13B shows relative improvements over Vicuna-13B of 55.4% on BIG-Bench Hard (BBH) and 16.7% on AGIEval. These outcomes highlight the value of concentrating training on difficult instances to drive significant performance gains.
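
To be explicit about how such percentages are read, they are relative rather than absolute gains over the baseline. The scores below are hypothetical placeholders chosen only to make the arithmetic concrete, not numbers from the paper.

```python
def relative_improvement(new_score, baseline_score):
    """Relative gain of new_score over baseline_score, as a percentage."""
    return 100.0 * (new_score - baseline_score) / baseline_score

# Hypothetical placeholder scores (NOT from the paper): a baseline of 20.0
# and a new score of 31.08 correspond to a 55.4% relative improvement.
print(round(relative_improvement(31.08, 20.0), 1))  # -> 55.4
```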

Implications and Future Directions

Practically, this research demonstrates a scalable mechanism for distilling a proprietary LLM's vast knowledge into a compact, open-source counterpart using fewer resources and less training data. This advancement supports the creation of transparent, accessible AI systems without the overhead of high-cost computational iterations.

Theoretically, the integration of adversarial workflows could inspire similar methodologies across various machine-learning applications, extending beyond LLMs to areas like image and speech processing. Future advancements may involve enhancing role adaptability within proprietary models to exploit more complex task generation strategies and potentially apply reinforcement learning from human feedback (RLHF) to guide the refinement of student models further.

In conclusion, this paper delivers substantial insights into the closed-loop training dynamics in LLMs and provides a robust foundation for future research in efficient model distillation. The success of Lion in closely emulating sophisticated models like ChatGPT underscores the promise of adversarial distillation as a transformative approach in the AI community.
