- The paper presents VLPMarker, a backdoor-based watermarking method that injects an orthogonal transformation into VLPs without altering model parameters.
- It employs a dual-verification mechanism combining trigger verification with embedding distribution assessment to defend against extraction and tampering attacks.
- The method leverages out-of-distribution triggers to enhance real-world applicability while maintaining robust performance in multi-modal tasks.
Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service
The proliferation of Vision-Language Pre-trained Models (VLPs), notably CLIP-based architectures, has driven significant advances in visual understanding and multi-modal tasks ranging from image classification to cross-modal retrieval. These models have enabled companies to offer multi-modal Embedding as a Service (EaaS), supporting diverse applications in sectors such as e-commerce and internet search. However, model extraction attacks pose a substantial risk: unauthorized parties can clone a served model through its EaaS interface, causing financial loss to VLP providers. A robust intellectual property protection strategy is therefore imperative to safeguard proprietary models.
The paper introduces VLPMarker, a backdoor-based watermarking method for VLPs that aims to protect them against unauthorized extraction while preserving their utility. Unlike conventional approaches that modify model parameters or embed triggers tailored to LLM-based EaaS, VLPMarker applies an orthogonal transformation to the embedding space, injecting backdoors into VLPs without any parameter modification. Because orthogonal maps preserve similarity relations both within and across modalities, the model's behaviour on downstream multi-modal tasks remains unaffected.
Technical Contributions:
- Backdoor-based Watermarking Method: VLPMarker injects a linear orthogonal transformation matrix into a pre-trained VLP, implanting verifiable triggers while leaving the original parameters untouched, a non-intrusive way to watermark the served embeddings (see the first sketch after this list).
- Collaborative Copyright Verification: The paper proposes a copyright verification mechanism that combines backdoor trigger verification with an assessment of the embedding distribution. This dual-verification strategy makes the watermark resilient to a spectrum of adversarial attacks, including model extraction and tampering (see the second sketch below).
- Out-of-Distribution Trigger Selection: To improve real-world applicability, the trigger set is drawn from out-of-distribution (OoD) samples, which removes the dependence on the model's training data. This choice both strengthens watermark robustness and alleviates privacy concerns (see the third sketch below).
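A minimal sketch of the orthogonal-transformation idea follows. The matrix sampling, the embedding dimension, and the function names are our own illustrative choices, not the paper's implementation; the point it demonstrates is that an orthogonal map leaves cross-modal cosine similarities intact, so the watermarked embeddings behave identically in retrieval.

```python
import numpy as np

def random_orthogonal(dim: int, seed: int = 0) -> np.ndarray:
    """Sample a random orthogonal matrix via QR of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))  # column-sign fix for a well-spread sample

def watermark(emb: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Apply the secret orthogonal transformation to row-vector embeddings."""
    return emb @ w.T

def cosine(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    return (a * b).sum(-1) / (np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1))

d = 512                       # a common CLIP embedding width (assumed here)
w = random_orthogonal(d)      # the provider's secret watermark key
img = np.random.randn(8, d)   # stand-ins for image embeddings
txt = np.random.randn(8, d)   # stand-ins for text embeddings

# Orthogonal maps preserve inner products and norms, so image-text cosine
# similarities (and hence retrieval behaviour) are unchanged by the watermark.
assert np.allclose(cosine(img, txt), cosine(watermark(img, w), watermark(txt, w)))
```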
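The second sketch illustrates the dual-verification idea under stated assumptions: the similarity threshold, the two-sample KS test, and all function names are stand-ins of ours, not the paper's exact procedure. It checks (a) that a suspect service maps trigger inputs close to the watermarked targets and (b) that its embedding-similarity distribution matches a watermarked reference.

```python
import numpy as np
from scipy.stats import ks_2samp

def trigger_check(suspect: np.ndarray, expected: np.ndarray, tau: float = 0.9) -> bool:
    """Backdoor verification: the suspect service's embeddings of the trigger
    inputs should stay close to the provider's watermarked targets."""
    sims = (suspect * expected).sum(-1) / (
        np.linalg.norm(suspect, axis=-1) * np.linalg.norm(expected, axis=-1))
    return bool(sims.mean() > tau)  # tau is an assumed threshold

def pairwise_sims(emb: np.ndarray) -> np.ndarray:
    """Upper-triangular pairwise cosine similarities of a batch of embeddings."""
    e = emb / np.linalg.norm(emb, axis=-1, keepdims=True)
    s = e @ e.T
    return s[np.triu_indices_from(s, k=1)]

def distribution_check(suspect: np.ndarray, reference: np.ndarray,
                       alpha: float = 0.01) -> bool:
    """Distribution assessment: the suspect's similarity distribution should
    match the watermarked reference; a two-sample KS test stands in here."""
    _, p_value = ks_2samp(pairwise_sims(suspect), pairwise_sims(reference))
    return bool(p_value > alpha)

def verify_copyright(suspect, expected, reference) -> bool:
    # Both checks must pass before an infringement claim is raised.
    return trigger_check(suspect, expected) and distribution_check(suspect, reference)
```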
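The third sketch shows one plausible way to pick OoD triggers; the nearest-neighbour cosine heuristic is our assumption, chosen only to make the "far from the training distribution" criterion concrete.

```python
import numpy as np

def select_ood_triggers(candidates: np.ndarray, in_dist: np.ndarray,
                        k: int = 10) -> np.ndarray:
    """Return indices of the k candidates whose embeddings are least similar
    to any in-distribution embedding (a simple OoD selection heuristic)."""
    c = candidates / np.linalg.norm(candidates, axis=-1, keepdims=True)
    r = in_dist / np.linalg.norm(in_dist, axis=-1, keepdims=True)
    closeness = (c @ r.T).max(axis=1)  # nearest-neighbour cosine similarity
    return np.argsort(closeness)[:k]   # lowest closeness = most OoD
```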
Empirical Results and Observations:
Comprehensive experiments underscore the efficacy and safety of VLPMarker across various datasets. The watermark has minimal impact on the utility of VLPs, preserving their performance on image-text retrieval and classification tasks. Moreover, its resilience against extraction and similarity-invariant attacks was confirmed, with detection performance superior to existing single-modal methods. Notably, VLPMarker remains effective even under complex, similarity-preserving transformations, ensuring reliable detection of copyright infringement.
Theoretical and Practical Implications:
VLPMarker's framework offers the dual advantage of safeguarding model ownership and maintaining performance. It provides a scalable protection scheme for embedding services that can adapt as the underlying models evolve. Practically, deploying such watermarks could strengthen the security posture of AI services, helping providers verify ownership and protect proprietary assets in competitive markets.
Future Directions:
Future research may refine the efficiency of the transformation-based watermark, reducing its computational overhead and exploring its adaptability to a broader range of downstream applications. EaaS security could further benefit from combining watermarking with other AI safety mechanisms, potentially yielding a comprehensive defence suite for dynamic threat environments.
In sum, VLPMarker makes a significant contribution to AI model protection, providing robust mechanisms to keep vision-language models proprietary in the increasingly vulnerable landscape of multi-modal services.