EmMark: Robust Watermarks for IP Protection of Embedded Quantized Large Language Models (2402.17938v1)

Published 27 Feb 2024 in cs.CR and cs.CL

Abstract: This paper introduces EmMark, a novel watermarking framework for protecting the intellectual property (IP) of embedded LLMs deployed on resource-constrained edge devices. To address the IP theft risks posed by malicious end-users, EmMark enables proprietors to authenticate ownership by querying the watermarked model weights and matching the inserted signatures. EmMark's novelty lies in its strategic selection of the weight parameters that carry the watermark, ensuring robustness while maintaining model quality. Extensive proof-of-concept evaluations on models from the OPT and LLaMA-2 families demonstrate EmMark's fidelity, achieving 100% watermark extraction success while preserving model performance. EmMark also shows resilience against watermark removal and forging attacks.
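
The abstract describes the workflow only at a high level: signatures are inserted into selected quantized weight parameters and later recovered by querying the watermarked weights. As a rough illustration of that general idea (not the paper's actual algorithm), the sketch below embeds a binary signature into an INT8 weight matrix by nudging a chosen subset of parameters by +/-1 and recovers it by re-reading those positions. The function names, the large-magnitude selection rule, and the +/-1 encoding are assumptions standing in for EmMark's quality- and robustness-aware parameter selection.

```python
import numpy as np

def insert_watermark(int8_weights: np.ndarray, signature: np.ndarray, seed: int = 0):
    """Embed a {0,1} signature into a copy of the quantized weights (toy sketch)."""
    rng = np.random.default_rng(seed)
    flat = int8_weights.flatten().astype(np.int16)  # widen so a +/-1 nudge cannot overflow
    # Only consider positions where a +/-1 change stays inside the INT8 range.
    valid = np.where((flat >= -127) & (flat <= 126))[0]
    # Hypothetical selection rule: prefer large-magnitude weights, a stand-in for
    # EmMark's quality- and robustness-aware parameter selection strategy.
    candidates = valid[np.argsort(np.abs(flat[valid]))][-10 * len(signature):]
    positions = rng.choice(candidates, size=len(signature), replace=False)
    flat[positions] += np.where(signature == 1, 1, -1)  # bit 1 -> +1, bit 0 -> -1
    watermarked = flat.astype(np.int8).reshape(int8_weights.shape)
    return watermarked, positions

def extract_watermark(watermarked: np.ndarray, original: np.ndarray, positions: np.ndarray):
    """Recover the signature by comparing watermarked and original weights."""
    diff = watermarked.flatten().astype(np.int16) - original.flatten().astype(np.int16)
    return (diff[positions] > 0).astype(np.int8)

# Toy usage on a random INT8 weight matrix with a 32-bit signature.
weights = np.random.randint(-128, 128, size=(256, 256), dtype=np.int8)
signature = np.random.randint(0, 2, size=32, dtype=np.int8)
wm_weights, positions = insert_watermark(weights, signature)
assert np.array_equal(extract_watermark(wm_weights, weights, positions), signature)
```

In practice the proprietor would keep the signature and the selected positions secret, and the paper's contribution lies precisely in choosing parameters so that the watermark survives removal and forging attacks while leaving model quality intact.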
