Elevating Defenses: Bridging Adversarial Training and Watermarking for Model Resilience (2312.14260v2)

Published 21 Dec 2023 in cs.LG and cs.CR

Abstract: Machine learning models are being used in an increasing number of critical applications; thus, securing their integrity and ownership is critical. Recent studies observed that adversarial training and watermarking have a conflicting interaction. This work introduces a novel framework that integrates adversarial training with watermarking techniques to fortify models against evasion attacks and to provide confident model verification in case of intellectual property theft. We use adversarial training together with adversarial watermarks to train a robust watermarked model. The key intuition is to generate the adversarial watermarks with a higher perturbation budget than the budget used for adversarial training, thus avoiding conflict between the two. We evaluate the proposed technique on the MNIST and Fashion-MNIST datasets against various model stealing attacks. The results consistently outperform the existing baseline in robustness and further demonstrate the resilience of this defense against pruning and fine-tuning removal attacks.
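
The core mechanism described in the abstract — adversarial training at a small perturbation budget combined with watermark triggers crafted at a strictly larger budget — can be sketched in code. The following is a minimal PyTorch sketch, not the authors' implementation: the PGD attack routine, the frontier-stitching-style choice of keeping the triggers' original labels, the concrete budgets (eps_train=0.1 vs eps_wm=0.3), the trigger-crafting schedule, and the verification threshold are all illustrative assumptions.

```python
# Sketch: adversarial training at a small budget (eps_train) while embedding
# adversarial watermarks crafted at a larger budget (eps_wm > eps_train).
# Hyperparameters and scheduling are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F


def pgd_perturb(model, x, y, eps, alpha, steps):
    """Craft L-infinity PGD adversarial examples within an eps-ball around x."""
    was_training = model.training
    model.eval()  # freeze batch-norm statistics while crafting
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0).detach()    # stay in the valid pixel range
    if was_training:
        model.train()
    return x_adv


def train_robust_watermarked(model, train_loader, wm_x, wm_y,
                             eps_train=0.1, eps_wm=0.3, alpha=0.01,
                             steps=40, epochs=10, wm_weight=1.0, device="cpu"):
    """Adversarial training at eps_train, plus a watermark trigger set crafted
    at the larger budget eps_wm so the two objectives do not conflict."""
    model.to(device)
    wm_x, wm_y = wm_x.to(device), wm_y.to(device)
    # Craft the trigger set once: adversarial examples at the larger budget,
    # kept with their original labels (frontier-stitching style). Crafting
    # them up front from the current model is a simplification of this sketch.
    wm_trigger = pgd_perturb(model, wm_x, wm_y, eps_wm, alpha, steps)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    model.train()
    for _ in range(epochs):
        for x, y in train_loader:
            x, y = x.to(device), y.to(device)
            x_adv = pgd_perturb(model, x, y, eps_train, alpha, steps)
            opt.zero_grad()
            loss = (F.cross_entropy(model(x_adv), y)
                    + wm_weight * F.cross_entropy(model(wm_trigger), wm_y))
            loss.backward()
            opt.step()
    return model, wm_trigger


def verify_watermark(suspect_model, wm_trigger, wm_y, threshold=0.9):
    """Ownership check: a stolen model should still label the trigger set as
    the owner intended. The 0.9 threshold is an illustrative choice."""
    suspect_model.eval()
    with torch.no_grad():
        preds = suspect_model(wm_trigger).argmax(dim=1)
    return (preds == wm_y).float().mean().item() >= threshold
```

Verification then needs only black-box access: if a suspect model classifies the high-budget trigger set as intended above the threshold, the owner can claim it. Because the triggers lie outside the eps_train ball used for robustness training, the watermarking and adversarial-training objectives no longer pull on the same inputs, which is the conflict-avoidance intuition stated in the abstract.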

Authors (3)
  1. Janvi Thakkar (6 papers)
  2. Giulio Zizzo (25 papers)
  3. Sergio Maffeis (14 papers)
Citations (1)
