- The paper proposes a blind-watermark framework to securely embed intellectual property into Deep Neural Networks (DNNs) for ownership verification.
- The framework pairs an encoder with a discriminator so watermarks are embedded subtly, without degrading the model's accuracy on its primary task.
- Evaluation on MNIST and CIFAR-10 datasets demonstrated negligible accuracy loss and robust resistance against evasion attacks and fraudulent ownership claims.
Intellectual Property Protection in Deep Neural Networks via Blind Watermarking
The paper "How to Prove Your Model Belongs to You: A Blind-Watermark based Framework to Protect Intellectual Property of DNN" addresses the growing challenge of intellectual property protection (IPP) in deep neural networks (DNNs). Authors Zheng Li et al. propose a novel technique for securely embedding watermarks into DNNs, leveraging a blind-watermarking framework that has significant implications for IP protection in machine learning models.
Motivation and Background
Deep learning models are central to modern AI applications and require extensive computational and data resources to train. Once developed, these models constitute intellectual property that needs protection from theft and unauthorized use. Despite existing legal frameworks, technical means of authenticating ownership are critical because models can easily be copied and deployed elsewhere. Previous approaches to embedding watermarks in DNNs exhibited vulnerabilities, such as susceptibility to evasion attacks or an inability to conclusively associate a model with its creator.
Proposed Framework
The researchers propose a blind-watermark based IPP framework designed to meet several critical requirements: security, undetectability, robustness, and the ability to bind a model to its legitimate owner. The framework consists of three components (a minimal training sketch follows the list):
- Encoder: Generates watermarked samples indistinguishable from ordinary samples. Its design ensures the watermark is embedded subtly, avoiding detectable patterns that adversaries could exploit.
- Discriminator: Trained adversarially alongside the encoder, it pushes watermarked samples to be indistinguishable from clean ones, keeping the watermark undetectable and robust.
- Host DNN: The protected model learns the watermark behaviour during training, allowing owners to verify ownership via specific queries that elicit predefined outputs.
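To make the three-part design concrete, here is a minimal training sketch in PyTorch. It is an illustration under simplifying assumptions, not the paper's implementation: the toy networks (`Encoder`, `Discriminator`, `HostNet`), the loss terms and their weights, the 0.1 perturbation scale, the random stand-in data, and the fixed target label are all hypothetical choices.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Fuses the owner's logo into a clean image as a subtle perturbation."""
    def __init__(self):
        super().__init__()
        # Image and logo are stacked along the channel dimension (2 channels in).
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Tanh(),
        )
    def forward(self, x, logo):
        return x + 0.1 * self.net(torch.cat([x, logo], dim=1))

class Discriminator(nn.Module):
    """Scores samples: high for clean, low for watermarked."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 13 * 13, 1),
        )
    def forward(self, x):
        return self.net(x)

class HostNet(nn.Module):
    """The DNN whose IP is being protected; a toy classifier here."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(16 * 13 * 13, n_classes),
        )
    def forward(self, x):
        return self.net(x)

enc, disc, host = Encoder(), Discriminator(), HostNet()
opt_e = torch.optim.Adam(enc.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
opt_h = torch.optim.Adam(host.parameters(), lr=1e-3)
bce, ce, mse = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss(), nn.MSELoss()

logo = torch.rand(1, 1, 28, 28)            # the owner's secret logo
target = torch.zeros(8, dtype=torch.long)  # predefined label for watermarked queries

for step in range(100):                    # toy loop over random stand-in "data"
    x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
    wm = enc(x, logo.expand_as(x))

    # 1) Discriminator learns to separate clean from watermarked samples.
    d_loss = bce(disc(x), torch.ones(8, 1)) + bce(disc(wm.detach()), torch.zeros(8, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Encoder stays close to the clean image, fools the discriminator,
    #    and drives the host toward the predefined label on watermarked input.
    e_loss = mse(wm, x) + bce(disc(wm), torch.ones(8, 1)) + ce(host(wm), target)
    opt_e.zero_grad(); e_loss.backward(); opt_e.step()

    # 3) Host keeps its primary-task accuracy on clean data while also
    #    learning the watermark behaviour.
    wm = enc(x, logo.expand_as(x)).detach()
    h_loss = ce(host(x), y) + ce(host(wm), target)
    opt_h.zero_grad(); h_loss.backward(); opt_h.step()
```

The adversarial pairing is the key design choice: the discriminator's pressure is what keeps the encoder's output statistically close to clean samples, which is precisely the "blind" property the framework relies on.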
Evaluation
The proposed watermarking strategy was evaluated on two benchmark datasets, MNIST and CIFAR-10, across 15 DNN architectures. Embedding the watermark caused only negligible accuracy loss on the primary tasks, and the framework reliably verified model ownership while resisting several attack vectors, including evasion attacks and fraudulent ownership claims.
Security and Functionality Advantages
The framework exhibits notable security properties. Because watermarked queries are statistically close to ordinary inputs, evasion attacks, in which an adversary tries to detect and reject watermark queries so that ownership verification fails, achieve little. In addition, only the genuine owner holds the original encoder, so only they can produce valid watermarked samples, posing a significant obstacle to counterfeit ownership claims. A verification sketch follows.
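Verification itself reduces to a black-box query protocol: the owner watermarks a batch of clean inputs with the secret encoder and logo, submits them to the suspect model, and checks how often the predefined label comes back. A minimal sketch continuing the toy setup above; the `verify_ownership` helper and the 0.9 match-rate threshold are illustrative choices, not values from the paper.

```python
import torch

# Reuses `enc`, `host`, and `logo` from the training sketch above.
@torch.no_grad()
def verify_ownership(suspect, encoder, logo, clean_x, target=0, threshold=0.9):
    wm = encoder(clean_x, logo.expand_as(clean_x))         # watermarked queries
    preds = suspect(wm).argmax(dim=1)                      # suspect model's answers
    match_rate = (preds == target).float().mean().item()  # watermark accuracy
    return match_rate >= threshold, match_rate

owns, rate = verify_ownership(host, enc, logo, torch.rand(64, 1, 28, 28))
print(f"ownership claim holds: {owns} (match rate {rate:.2%})")
```

Note that the check needs only the suspect model's predictions, so it works even when the stolen model is exposed purely as a prediction API.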
Theoretical and Practical Implications
From a theoretical perspective, this work extends watermarking techniques to DNNs, whose structures are far more complex than conventional digital media. Practically, robust watermarking gives stakeholders a reliable mechanism for asserting proprietary control over their AI models, which is crucial in industrial settings where models represent competitive advantages and significant financial investments.
Future Directions and Conclusions
The research lays a foundation for further exploration of security mechanisms within machine learning. Future work might extend the framework to other architectures, such as recurrent neural networks, and probe its resilience in additional adversarial settings. Through such developments, IP protection for AI models can be continually strengthened, reinforcing trust and security in AI deployments.
In summary, the blind-watermark based framework presents a significant advancement in protecting the intellectual assets tied to DNNs, thereby addressing pressing concerns over model security and ownership verification.