SIF: Semantically In-Distribution Fingerprints for Large Vision-Language Models

Published 18 Apr 2026 in cs.CV | (2604.17041v1)

Abstract: The public accessibility of large vision-language models (LVLMs) raises serious concerns about unauthorized model reuse and intellectual property infringement. Existing ownership verification methods often rely on semantically abnormal queries or out-of-distribution responses as fingerprints, which can be easily detected and removed by adversaries. We expose this vulnerability through a Semantic Divergence Attack (SDA), which identifies and filters fingerprint queries by measuring semantic divergence between a suspect model and a reference model, showing that existing fingerprints are not semantic-preserving and are therefore easy to detect and bypass. To address these limitations, we propose SIF (Semantically In-Distribution Fingerprints), a non-intrusive ownership verification framework that requires no parameter modification. SIF introduces Semantic-Aligned Fingerprint Distillation (SAFD), which transfers text watermarking signals into the visual modality to produce semantically coherent yet fingerprinted responses. In addition, Robust-Fingerprint Optimization (RFO) enhances robustness by simulating worst-case representation perturbations, making the fingerprints resilient to model modifications such as fine-tuning and quantization. Extensive experiments on LLaVA-1.5 and Qwen2.5-VL demonstrate that SIF achieves strong stealthiness and robustness, providing a practical solution for LVLM copyright protection. Code is available at https://github.com/UCF-ML-Research/SIF-VLM-Fingerprint


Authors (3)

Summary

The paper introduces a non-intrusive fingerprinting framework that embeds semantically coherent watermarks to verify LVLM ownership.
It employs Semantic-Aligned Fingerprint Distillation, which adds imperceptible visual perturbations to natural images so that the model's responses carry a text watermark while staying semantically coherent.
Robust-Fingerprint Optimization simulates worst-case representation perturbations so that watermark detection remains stable under model modifications such as quantization and fine-tuning.

SIF: Semantically In-Distribution Fingerprints for Large Vision-Language Models

Introduction

The proliferation of publicly accessible large vision-language models (LVLMs) has amplified concerns about unauthorized model reuse and intellectual property (IP) violations. Existing model fingerprinting and ownership verification techniques for open-source LVLMs typically rely on semantically anomalous or out-of-distribution triggers and responses. This reliance is a weakness: adversaries can efficiently detect and filter such fingerprints, rendering them ineffective under realistic attack settings. The paper "SIF: Semantically In-Distribution Fingerprints for Large Vision-Language Models" (2604.17041) analyzes these vulnerabilities and introduces SIF, a non-intrusive fingerprinting framework designed for both stealthiness and robustness, with the goal of reliable LVLM copyright verification.

Figure 1: Comparison between existing LVLM fingerprinting methods and SIF. SIF produces in-distribution, semantically coherent fingerprints, while prior approaches use semantically abnormal triggers and responses.

Analysis of Fingerprinting Vulnerabilities

The paper first formalizes the threat model: an adversary can obtain an open-source LVLM, deploy it as a black-box API, monitor incoming queries, alter outgoing responses, and fine-tune or quantize the model. Within this setting, existing fingerprinting strategies (e.g., Instruction Fingerprint (IF), Proflingo, PLA) are shown to introduce semantic irregularities, either through unnatural trigger inputs or through outputs that are not aligned with the input semantics. To expose this flaw systematically, the paper proposes the Semantic Divergence Attack (SDA). SDA uses a reference model to detect queries with abnormal perplexity and responses that diverge semantically from what the reference model would produce, then filters or replaces them to remove the fingerprint signal. Under this attack, current state-of-the-art approaches prove fragile.

Figure 2: SDA leverages a reference LVLM to detect suspicious fingerprint queries by measuring input perplexity and output divergence for robust fingerprint removal.
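
The filtering idea behind SDA can be illustrated with a small sketch. This is not the paper's implementation: the function names, the thresholds `ppl_max` and `div_max`, and the use of token-set Jaccard distance as a stand-in for a real semantic divergence measure are all assumptions made for illustration. In practice the per-token log-probabilities and the reference response would come from the reference LVLM.

```python
import math

def perplexity(token_logprobs):
    """Perplexity of a query from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def token_divergence(resp_a, resp_b):
    """Crude divergence proxy: Jaccard distance between lowercase token sets.
    A real attack would likely use a learned semantic similarity instead."""
    a, b = set(resp_a.lower().split()), set(resp_b.lower().split())
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def is_suspicious(query_logprobs, suspect_resp, reference_resp,
                  ppl_max=200.0, div_max=0.8):
    """Flag an interaction if the query looks unnatural to the reference model
    or the suspect's response diverges strongly from the reference response.
    Both thresholds are hypothetical values chosen for this example."""
    return (perplexity(query_logprobs) > ppl_max
            or token_divergence(suspect_resp, reference_resp) > div_max)
```

An adversary running such a filter would drop or rewrite flagged interactions, which is why fingerprints that rely on unnatural triggers or off-topic responses are easy to remove.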

SIF: Methodology Overview

SIF is designed as a non-intrusive black-box fingerprinting and ownership verification scheme for LVLMs, specifically targeting the deficiencies in stealth and robustness found in previous schemes.

Semantic-Aligned Fingerprint Distillation (SAFD)

SIF adapts decoding-based text watermarking techniques, traditionally used for content provenance, to the vision modality. In SAFD, imperceptible visual perturbations are added to natural images, producing trigger images that induce the model to generate text responses containing a statistically verifiable watermark. Crucially, these responses remain semantically coherent, preserving naturalness and in-distribution characteristics. The optimization objective balances watermark strength against cross-entropy with the teacher response, so that the fingerprint is both detectable and semantically faithful.
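
To make the trade-off in that objective concrete, here is a minimal sketch of a loss with the two terms described above. The exact form used in the paper is not given in this summary, so everything here is an assumption: the name `safd_style_loss`, the weight `lam`, and the choice to measure watermark strength as the average probability mass placed on a "green" (watermark) token set.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one vocabulary-sized logit vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def safd_style_loss(step_logits, teacher_ids, green, lam=1.0):
    """Cross-entropy to the teacher response minus a weighted watermark term.

    step_logits: one logit vector per generated position (from the trigger image).
    teacher_ids: token ids of the teacher response at those positions.
    green:       set of watermark token ids (hypothetical key-derived set).
    lam:         hypothetical weight trading fidelity against watermark strength.
    """
    ce, green_mass = 0.0, 0.0
    for logits, target in zip(step_logits, teacher_ids):
        probs = softmax(logits)
        ce += -math.log(probs[target])                # semantic fidelity term
        green_mass += sum(probs[t] for t in green)    # watermark strength term
    n = len(teacher_ids)
    return ce / n - lam * green_mass / n
```

Lowering this loss with respect to the image perturbation pushes the model toward the teacher's wording while shifting probability toward watermark tokens; a larger `lam` favors detectability over fidelity.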

Figure 3: (a) Fingerprint construction with SAFD and RFO. (b) Copyright verification: a trigger image embedded with a watermark induces detectable fingerprint responses in the suspect LVLM, verifying ownership.

Robust-Fingerprint Optimization (RFO)

SIF introduces RFO to counteract post-release model modifications such as quantization and fine-tuning, which typically cause representation drift and deteriorate fingerprint reliability. RFO simulates worst-case embedding perturbations during the optimization process, producing trigger images whose ownership watermark remains detectable despite representation space shifts.
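
One common way to simulate a worst-case perturbation of this kind is a single first-order step: shift the representation along the loss gradient, scaled to a fixed radius, and evaluate the loss at the shifted point. The sketch below shows that generic pattern on plain Python lists. It is an assumption that RFO resembles this; the summary does not specify the perturbation norm, the radius `rho`, or the number of inner steps.

```python
import math

def worst_case_shift(grad_h, rho):
    """First-order worst-case shift inside an L2 ball of radius rho:
    move along the loss gradient with respect to the embedding."""
    norm = math.sqrt(sum(g * g for g in grad_h))
    if norm == 0.0:
        return [0.0] * len(grad_h)
    return [rho * g / norm for g in grad_h]

def robust_loss(loss_fn, grad_fn, h, rho):
    """Loss evaluated at the perturbed embedding h + xi.
    loss_fn and grad_fn are caller-supplied (hypothetical) callables."""
    xi = worst_case_shift(grad_fn(h), rho)
    return loss_fn([a + b for a, b in zip(h, xi)])

# Toy usage: quadratic loss around the origin, whose gradient is 2h.
toy_loss = lambda h: sum(a * a for a in h)
toy_grad = lambda h: [2.0 * a for a in h]
perturbed = robust_loss(toy_loss, toy_grad, [3.0, 4.0], rho=0.5)
```

Optimizing the trigger image against the perturbed loss rather than the clean loss is what would make the fingerprint tolerate later drift in the model's representations.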

Experimental Results

Comprehensive experiments are conducted on LLaVA-1.5 and Qwen2.5-VL, encompassing quantization, full fine-tuning, input/output perturbations, and black-box adversarial settings. SIF is benchmarked against major prior art (e.g., IF, Proflingo, PLA).

Robustness to Model Modifications

SIF achieves a higher Fingerprint Matching Rate (FMR) than the baselines under quantization and fine-tuning. For LLaVA-1.5-7B, SIF attains an FMR of $0.49$ and $0.89$ under 4-bit and 8-bit quantization, respectively, and remains effective under aggressive fine-tuning (e.g., LlavaMix, TikZ, MathV). Qwen2.5-VL results show similar trends, with SIF maintaining FMR $\geq 0.88$ post-quantization and strong reliability across diverse downstream tasks. These results indicate that SIF’s fingerprints are robust to both parameter compression and heavy model adaptation.

Figure 4: SIF’s FMR across perturbation budgets and optimization steps, demonstrating stable robustness over Qwen2.5-VL-7B.

Stealthiness and Detection Resistance

SIF fingerprints are statistically indistinguishable from normal user queries by perplexity metrics and semantic similarity. Under the SDA framework with a reference LVLM, SIF achieves the highest FMR (see Table 2 in the paper), while prior methods are effectively neutralized. Notably, only 4% of SIF interactions are flagged as suspicious by a strong semantic judge (GPT-4.1), versus 86% for adversarial triggers like PLA, confirming SIF’s stealthiness.

Figure 5: The SIF fingerprint query induces a semantically natural response in the suspect model carrying the watermark signal, whereas an unrelated model does not.

Reliability and Specificity

Evaluations on unrelated LVLMs from different model families confirm that SIF fingerprints do not falsely activate (FMR = 0). This produces a clear detection gap between derived and unrelated models and allays concerns over false-positive copyright claims.
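
This summary does not define FMR formally. A plausible reading, assumed here, is the fraction of fingerprint queries whose response passes the watermark detection test. The sketch below computes that fraction from per-response z-scores; the threshold `z_threshold=4.0` and the function name are hypothetical.

```python
def fingerprint_matching_rate(z_scores, z_threshold=4.0):
    """Fraction of fingerprint responses whose watermark z-score exceeds
    the detection threshold (threshold value is an assumption)."""
    if not z_scores:
        raise ValueError("need at least one fingerprint response")
    return sum(z > z_threshold for z in z_scores) / len(z_scores)
```

Under this reading, an unrelated model whose responses never cross the threshold yields an FMR of exactly 0, which matches the detection gap described above.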

Mechanistic and Implementation Insights

Watermark Construction

SIF leverages a uni-gram text watermarking key to construct a token-level watermark signal embedded by slight biases in generation logits, detectable by a z-score hypothesis test over the frequency of watermark tokens in output.
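
The detection side of a uni-gram watermark can be sketched in a few lines. In this kind of scheme a secret key fixes one "green" token set for all positions, and detection counts how many output tokens fall in it. The code below is an illustrative sketch, not the paper's code: the SHA-256 based key derivation, the green fraction `gamma=0.5`, and the function names are assumptions. The z-score itself is the standard one-proportion test statistic, $z = (g - \gamma T) / \sqrt{T\gamma(1-\gamma)}$, for $g$ green tokens among $T$.

```python
import hashlib
import math

def green_list(vocab_size, key, gamma=0.5):
    """Derive a fixed green-token set from a secret key by ranking token ids
    by a keyed hash (hypothetical derivation, for illustration only)."""
    ranked = sorted(
        range(vocab_size),
        key=lambda t: hashlib.sha256(f"{key}:{t}".encode()).digest(),
    )
    return set(ranked[: int(gamma * vocab_size)])

def watermark_z_score(token_ids, green, gamma=0.5):
    """z-score of the observed green-token count against the null hypothesis
    that each token is green with probability gamma."""
    n = len(token_ids)
    if n == 0:
        raise ValueError("need at least one token")
    hits = sum(1 for t in token_ids if t in green)
    return (hits - gamma * n) / math.sqrt(n * gamma * (1.0 - gamma))
```

Unwatermarked text should give a z-score near zero, while a response biased toward green tokens gives a large positive value that grows with response length.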

Trigger Image Generation

The trigger construction employs Projected Gradient Descent (PGD) with perturbation budgets (e.g., $\epsilon=16/255$), ensuring visual imperceptibility while consistently activating the semantic-aligned watermark under standard decoding.
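
As a reminder of what the PGD update looks like under an $L_\infty$ budget, here is a minimal sketch on a flat list of pixel values in $[0, 1]$. The step size `2/255`, the step count, and the caller-supplied `grad_fn` are assumptions; in the real setting the gradient would come from backpropagating the fingerprint loss through the LVLM.

```python
def pgd_linf(image, grad_fn, eps=16 / 255, step=2 / 255, n_steps=40):
    """Minimize a loss over a perturbed image with signed-gradient steps,
    projecting back into the eps-ball around the original after each step."""
    adv = list(image)
    for _ in range(n_steps):
        grad = grad_fn(adv)
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            x = adv[i] - step * sign                            # descent step
            x = min(max(x, image[i] - eps), image[i] + eps)     # project to eps-ball
            adv[i] = min(max(x, 0.0), 1.0)                      # keep a valid pixel
    return adv

# Toy usage: pull a 3-pixel "image" toward a target under the eps budget.
target = [1.0, 0.0, 0.5]
toy_grad = lambda x: [2.0 * (a - b) for a, b in zip(x, target)]
trigger = pgd_linf([0.5, 0.5, 0.5], toy_grad)
```

The projection step is what enforces imperceptibility: no pixel can move more than `eps` from its original value, regardless of how many steps are taken.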

Attack Resilience

SIF demonstrates resilience against state-of-the-art watermark removal attacks by virtue of query-level secret keys, minimal fingerprint queries, and inherent blending with user behavior, limiting adversaries' capacity to mount statistical removal or filtering attacks.

Implications and Future Directions

SIF's semantically in-distribution, non-intrusive fingerprinting mechanism narrows the gap toward reliable LVLM copyright verification for open-source distribution. Because it requires no access to model parameters and uses semantically coherent, robust trigger queries, SIF offers a practical tool for ownership screening in adversarial and commercial exploitation settings.

Potential future directions include extension to video and image generation models, which will require new strategies to encode temporally consistent watermarks. SIF also motivates further research into robust and undetectable fingerprinting for foundation models more generally. However, as acknowledged by the authors, SIF alone is not sufficient for legal proof of ownership and should be complemented by additional documentation if used in litigation or formal disputes.

Conclusion

SIF (Semantically In-Distribution Fingerprints) establishes a robust, stealthy, and non-intrusive framework for LVLM ownership verification. It outperforms prior fingerprinting schemes in resilience against adversarial removal, quantization, fine-tuning, and input/output perturbations, while maintaining undetectability to semantic divergence attacks and adversarial filtering. The methodology substantially advances the practical state of large-scale model copyright protection, with clear theoretical and applied research implications for the broader AI security and IP tracking landscape.
