
SVIP: Towards Verifiable Inference of Open-source Large Language Models (2410.22307v2)

Published 29 Oct 2024 in cs.LG, cs.AI, cs.CL, and cs.CR

Abstract: The ever-increasing size of open-source LLMs renders local deployment impractical for individual users. Decentralized computing has emerged as a cost-effective alternative, allowing individuals and small companies to serve LLM inference to users with surplus computational power. However, a computing provider may stealthily substitute the requested LLM with a smaller, less capable model without the users' consent, thereby benefiting from cost savings. We introduce SVIP, a secret-based verifiable LLM inference protocol. Unlike existing solutions based on cryptographic or game-theoretic techniques, our method is computationally efficient and does not rely on strong assumptions. Our protocol requires the computing provider to return both the generated text and the processed hidden representations from the LLM. We then train a proxy task on these representations, effectively transforming them into a unique model identifier. With our protocol, users can reliably verify whether the computing provider is acting honestly. A carefully integrated secret mechanism further strengthens its security. We thoroughly analyze our protocol under multiple strong and adaptive adversarial scenarios. Our extensive experiments demonstrate that SVIP is accurate, generalizable, computationally efficient, and resistant to various attacks. Notably, SVIP achieves false negative rates below 5% and false positive rates below 3%, while requiring less than 0.01 seconds per prompt query for verification.
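The abstract's core idea, that a proxy task computed over returned hidden representations can act as a model identifier, can be sketched in a few lines. The sketch below is purely illustrative and assumes a toy linear proxy head, a Euclidean-distance check, and an arbitrary threshold; the paper's actual proxy-task architecture, secret mechanism, and calibration are not reproduced here, and all names (`proxy_label`, `verify`, `W`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN_DIM, LABEL_DIM = 16, 4  # toy dimensions, far smaller than a real LLM

# A proxy head mapping hidden states of the *requested* model to known
# proxy-task labels. In the real protocol this is trained offline; here
# we simply fabricate one for illustration.
W = rng.normal(size=(LABEL_DIM, HIDDEN_DIM))

def proxy_label(hidden):
    """Proxy-task output computed from a returned hidden representation."""
    return W @ hidden

def verify(hidden, expected_label, threshold=0.5):
    """Accept iff the proxy output is close to the expected label."""
    return float(np.linalg.norm(proxy_label(hidden) - expected_label)) < threshold

# Honest provider: the hidden state comes from the requested model, so the
# proxy output matches the expected label up to small numerical noise.
honest_hidden = rng.normal(size=HIDDEN_DIM)
expected = proxy_label(honest_hidden) + rng.normal(scale=0.01, size=LABEL_DIM)

# Dishonest provider: a substituted model produces an unrelated hidden
# state, so the proxy output lands far from the expected label.
substitute_hidden = rng.normal(size=HIDDEN_DIM)

print(verify(honest_hidden, expected))      # accepts
print(verify(substitute_hidden, expected))  # rejects with high probability
```

The check itself is a single matrix multiply plus a norm, which is consistent with the abstract's claim that verification costs well under 0.01 seconds per query; the security in the actual protocol comes from the secret mechanism, which this sketch omits.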

