Purifying Large Language Models by Ensembling a Small Language Model (2402.14845v1)

Published 19 Feb 2024 in cs.CL, cs.AI, and cs.LG

Abstract: The emerging success of LLMs heavily relies on collecting abundant training data from external (untrusted) sources. Despite substantial efforts devoted to data cleaning and curation, well-constructed LLMs have been reported to suffer from copyright infringement, data poisoning, and/or privacy violations, which would impede practical deployment of LLMs. In this study, we propose a simple and easily implementable method for purifying LLMs from the negative effects caused by uncurated data, namely, through ensembling LLMs with benign and small language models (SLMs). Aside from theoretical guarantees, we perform comprehensive experiments to empirically confirm the efficacy of ensembling LLMs with SLMs, which can effectively preserve the performance of LLMs while mitigating issues such as copyright infringement, data poisoning, and privacy violations.


Summary

  • The paper introduces an ensemble method that fuses an LLM with an SLM to purify the LLM of the negative effects of uncurated data without altering either model's parameters.
  • It blends the models' output probabilities, building on the CP-Δ algorithm, to significantly reduce copyright infringement, data poisoning, and privacy leaks.
  • The study demonstrates a minimal performance trade-off and flexible integration with other enhancement techniques for real-world LLM applications.

Ensembling Large and Small Language Models to Mitigate the Effects of Uncurated Data

Introduction

The development and deployment of LLMs have progressed substantially, fueled predominantly by extensive web-collected training datasets. These datasets, however, often include uncurated content that raises legal, ethical, and privacy concerns such as copyright infringement, data poisoning, and leakage of personally identifiable information (PII). Traditional data curation is labor-intensive and therefore not entirely effective at mitigating these issues. In this context, the paper introduces a simpler, more feasible alternative: a model ensemble strategy that combines the capabilities of an untrusted LLM with a benign small language model (SLM) to purify the LLM of the adverse impacts of uncurated data.

The Ensemble Strategy

The core proposition of the paper is an ensemble method that blends the output probabilities of an LLM and an SLM. The blend aims to retain the high performance of the LLM while exploiting the benign, well-curated nature of the SLM to counteract the negative effects of untrusted data in the LLM. Because the ensemble operates on the models' output distributions, it is plug-and-play and requires no alterations to the original models' parameters. The method builds on the CP-Δ algorithm under the premise that both models satisfy a "sharded-safe" condition, which is crucial for its copyright-protection guarantees. Its flexibility is evident in the ability to adjust the ensemble weight dynamically, offering a spectrum of models with varying degrees of purification and performance.
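
A minimal sketch of this kind of token-level probability blending is shown below. It assumes two Hugging Face causal LMs that share a tokenizer (GPT-2 variants are used purely as placeholders for the untrusted LLM and the benign SLM) and uses a simple linear interpolation with weight beta; the exact combination rule and the CP-Δ-specific machinery of the paper are not reproduced here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model choices: any pair of causal LMs that share a tokenizer
# (e.g., a large and a small model from the same family) would work here.
LLM_NAME = "gpt2-large"   # stands in for the (untrusted) large model
SLM_NAME = "gpt2"         # stands in for the benign small model

tokenizer = AutoTokenizer.from_pretrained(LLM_NAME)
llm = AutoModelForCausalLM.from_pretrained(LLM_NAME).eval()
slm = AutoModelForCausalLM.from_pretrained(SLM_NAME).eval()

@torch.no_grad()
def ensemble_generate(prompt: str, max_new_tokens: int = 50, beta: float = 0.5) -> str:
    """Greedy decoding from a token-level blend of the two models' distributions.

    `beta` is the weight on the small model; the linear blend below is an
    illustrative choice, not necessarily the paper's exact combination rule.
    """
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        p_llm = torch.softmax(llm(input_ids).logits[:, -1, :], dim=-1)
        p_slm = torch.softmax(slm(input_ids).logits[:, -1, :], dim=-1)
        p_mix = (1.0 - beta) * p_llm + beta * p_slm   # blended next-token distribution
        next_id = p_mix.argmax(dim=-1, keepdim=True)  # greedy pick from the blend
        input_ids = torch.cat([input_ids, next_id], dim=-1)
    return tokenizer.decode(input_ids[0], skip_special_tokens=True)

print(ensemble_generate("The quick brown fox", beta=0.3))
```

Increasing beta shifts the ensemble toward the benign SLM; that single weight is the knob that trades raw LLM performance against the degree of purification.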

Evaluation and Findings

Evaluating this ensemble strategy across nine different LLMs on ten benchmark datasets demonstrates its efficacy in significantly reducing copyright infringement, data poisoning, and privacy violations. Notably, the method incurs only a minimal trade-off between purifying the model and retaining its standard performance, indicating the ensemble's potential as an efficient and flexible way to keep pace with changing standards and regulations, especially in the copyright domain. The experiments further show that the ensemble method can be seamlessly integrated with other model enhancement strategies, underscoring its broad applicability.
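
To make this trade-off concrete, the toy example below (illustrative numbers only, not drawn from the paper's experiments) interpolates two hypothetical next-token distributions and shows how increasing the SLM weight suppresses a token the untrusted LLM has effectively memorized, at the cost of moving the ensemble away from the LLM's own preferences.

```python
import numpy as np

# Toy next-token distributions over a 4-token vocabulary
# (illustrative numbers only, not taken from the paper).
vocab = ["benign_a", "benign_b", "memorized", "eos"]
p_llm = np.array([0.20, 0.10, 0.65, 0.05])  # untrusted LLM favors a memorized token
p_slm = np.array([0.45, 0.40, 0.05, 0.10])  # benign SLM assigns it little mass

for beta in [0.0, 0.25, 0.5, 0.75, 1.0]:
    p_mix = (1 - beta) * p_llm + beta * p_slm          # blended distribution
    top = vocab[int(p_mix.argmax())]                   # greedy pick from the blend
    print(f"beta={beta:.2f}  P(memorized)={p_mix[2]:.2f}  greedy pick: {top}")
```

As beta grows, the probability mass on the memorized token falls monotonically and the greedy choice eventually flips to a benign token, which mirrors the purification-versus-performance spectrum the evaluation reports.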

Practical Implications and Future Directions

From a practical standpoint, this ensemble approach presents a promising avenue for mitigating the risks associated with uncurated data in LLMs without compromising their performance. The ability to adjust the ensemble weight dynamically opens the door to real-time model tuning in response to varying operational requirements and regulations. The paper also points to further exploration of integrating the ensemble strategy with other model enhancement and acceleration techniques, potentially broadening its utility beyond data purification alone.

Conclusion

This research presents a novel and effective method for purifying LLMs of the repercussions of uncurated data through a strategic ensemble with SLMs. The strategy not only preserves the LLMs' performance but also mitigates legal and ethical risks, marking a significant stride toward the responsible use of LLMs. Given the ever-increasing reliance on LLMs across domains, the method holds considerable promise for improving the utility and acceptability of LLM applications in real-world scenarios and sets a precedent for future research in this area of AI ethics and governance.
