Binary Linear Tree Commitment-based Ownership Protection for Distributed Machine Learning (2401.05895v1)
Abstract: Distributed machine learning enables the parallel training of extensive datasets by delegating computing tasks across multiple workers. Despite the cost-reduction benefits of distributed machine learning, the dissemination of final model weights often leads to conflicts over model ownership, as workers struggle to substantiate their involvement in the training computation. To address these ownership issues and to prevent accidental failures and malicious attacks, verifying the computational integrity and effectiveness of workers becomes particularly crucial in distributed machine learning. In this paper, we propose a novel binary linear tree commitment-based ownership protection model that ensures computational integrity with limited overhead and concise proofs. Because parameters are updated frequently during training, our commitment scheme introduces a maintainable tree structure to reduce the cost of updating proofs. In contrast to SNARK-based verifiable computation, our model achieves efficient proof aggregation by leveraging inner product arguments. Furthermore, proofs of model weights are watermarked with worker identity keys to prevent commitments from being forged or duplicated. Performance analysis and comparison with SNARK-based hash commitments validate the efficacy of our model in preserving computational integrity in distributed machine learning.
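The abstract does not spell out the construction, but the maintainable-tree idea can be illustrated concretely. The sketch below is a minimal, hypothetical Python rendering that substitutes SHA-256 hashing for the paper's linear (homomorphic) commitments and an HMAC for its identity-key watermark; all names (`BinaryTreeCommitment`, `watermark_proof`, and so on) are assumptions of this illustration, and the inner-product-argument aggregation is omitted entirely.

```python
import hashlib
import hmac

def h(*parts: bytes) -> bytes:
    """Hash helper: SHA-256 over the concatenation of its arguments."""
    digest = hashlib.sha256()
    for p in parts:
        digest.update(p)
    return digest.digest()

class BinaryTreeCommitment:
    """Binary tree commitment over a power-of-two list of weight chunks.

    Illustrative stand-in for the paper's scheme: a hash tree rather than
    a linear commitment. nodes[1] is the root; nodes[n + i] is leaf i.
    """

    def __init__(self, chunks: list[bytes]):
        n = len(chunks)
        assert n and n & (n - 1) == 0, "number of chunks must be a power of two"
        self.n = n
        self.nodes = [b""] * (2 * n)
        for i, c in enumerate(chunks):
            self.nodes[n + i] = h(c)
        for i in range(n - 1, 0, -1):
            self.nodes[i] = h(self.nodes[2 * i], self.nodes[2 * i + 1])

    def root(self) -> bytes:
        return self.nodes[1]

    def update(self, i: int, new_chunk: bytes) -> None:
        """Maintainable update: only the O(log n) root path is recomputed."""
        idx = self.n + i
        self.nodes[idx] = h(new_chunk)
        idx //= 2
        while idx >= 1:
            self.nodes[idx] = h(self.nodes[2 * idx], self.nodes[2 * idx + 1])
            idx //= 2

    def prove(self, i: int) -> list[bytes]:
        """Opening proof for chunk i: the sibling hashes along its root path."""
        idx, proof = self.n + i, []
        while idx > 1:
            proof.append(self.nodes[idx ^ 1])  # sibling of the current node
            idx //= 2
        return proof

def verify(root: bytes, i: int, chunk: bytes, proof: list[bytes]) -> bool:
    """Recompute the root from leaf i and its sibling path."""
    node = h(chunk)
    for sibling in proof:
        node = h(node, sibling) if i % 2 == 0 else h(sibling, node)
        i //= 2
    return node == root

def watermark_proof(identity_key: bytes, root: bytes, proof: list[bytes]) -> bytes:
    """Bind a proof to a worker identity key so others cannot replay it."""
    return hmac.new(identity_key, root + b"".join(proof), hashlib.sha256).digest()
```

In this toy version, updating one weight chunk touches only the O(log n) hashes on its root path, which is the maintainability property the abstract refers to; the paper's linear commitments additionally allow proofs to be updated and aggregated via inner product arguments, which a plain hash tree does not support.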