OVLA: Neural Network Ownership Verification using Latent Watermarks (2306.13215v2)
Abstract: Ownership verification for neural networks is important for protecting these models from illegal copying, free-riding, re-distribution, and other intellectual property misuse. We present a novel methodology for neural network ownership verification based on the notion of latent watermarks. Existing ownership verification methods either modify or introduce constraints on the network's parameters, which are accessible to an attacker in a white-box setting and can harm the network's normal operation, or train the network to respond to specific watermarks in the inputs, similar to data poisoning-based backdoor attacks, which are susceptible to backdoor removal techniques. In this paper, we address these problems by decoupling a network's normal operation from its responses to watermarked inputs during ownership verification. The key idea is to train the network such that the watermarks remain dormant unless the owner's secret key is applied to activate them. The secret key is realized as a specific perturbation to the network's parameters that is known only to the owner. We show that our approach offers strong defense against backdoor detection, backdoor removal, and surrogate model attacks. In addition, our method provides protection against ambiguity attacks, where the attacker either tries to guess the secret weight key or uses fine-tuning to embed their own watermarks with a different key into a pre-trained neural network. Experimental results demonstrate the advantages and effectiveness of the proposed approach.
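The following is a minimal sketch of the latent-watermark idea described in the abstract, not the authors' implementation: the network is trained so that a watermark trigger in the input has no effect under the publicly released weights, but maps to a predefined response once the owner's secret weight perturbation (the key) is added. The names `add_trigger`, `make_secret_key`, `TARGET_CLASS`, and `KEY_SCALE`, as well as the loss combination, are illustrative assumptions.

```python
# Illustrative sketch only; assumed trigger pattern, key construction, and losses.
import torch
import torch.nn.functional as F

TARGET_CLASS = 0     # hypothetical response expected on watermarked inputs under the key
KEY_SCALE = 0.05     # hypothetical magnitude of the secret weight perturbation

def add_trigger(x):
    """Stamp a small square pattern onto a batch of images (stand-in watermark)."""
    x = x.clone()
    x[:, :, -4:, -4:] = 1.0
    return x

def make_secret_key(model):
    """One random perturbation per parameter tensor; kept only by the owner."""
    return {n: KEY_SCALE * torch.randn_like(p) for n, p in model.named_parameters()}

def apply_key(model, key, sign=1.0):
    """Shift the weights by the secret key in place (sign=-1 undoes it)."""
    with torch.no_grad():
        for n, p in model.named_parameters():
            p.add_(sign * key[n])

def training_step(model, key, x, y, optimizer):
    optimizer.zero_grad()
    # (1) Normal operation: clean and triggered inputs both receive the true labels,
    #     so the watermark stays dormant without the key.
    loss_normal = F.cross_entropy(model(x), y) + F.cross_entropy(model(add_trigger(x)), y)
    loss_normal.backward()
    # (2) Verification behavior: with the key applied, triggered inputs should map
    #     to TARGET_CLASS. Gradients accumulate on top of (1).
    apply_key(model, key, +1.0)
    target = torch.full_like(y, TARGET_CLASS)
    loss_wm = F.cross_entropy(model(add_trigger(x)), target)
    loss_wm.backward()
    apply_key(model, key, -1.0)   # restore the public weights before the update
    optimizer.step()
    return loss_normal.item() + loss_wm.item()
```

At verification time the owner would add the key, query the model with watermarked inputs, and check for `TARGET_CLASS`; without the key, the released weights exhibit no backdoor-like behavior for the trigger.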
Authors: Feisi Fu, Wenchao Li