- The paper introduces a non-intrusive fingerprinting framework that embeds semantically coherent watermarks to verify LVLM ownership.
- It employs Semantic-Aligned Fingerprint Distillation with imperceptible visual perturbations to withstand quantization and fine-tuning attacks.
- Robust-Fingerprint Optimization simulates worst-case scenarios, ensuring watermark detection remains stable across diverse model modifications.
SIF: Semantically In-Distribution Fingerprints for Large Vision-LLMs
Introduction
The proliferation of publicly accessible Large Vision-LLMs (LVLMs) has amplified concerns regarding unauthorized model reuse and intellectual property (IP) violations. Existing model fingerprinting and ownership verification techniques for open-source LVLMs typically rely on semantically anomalous or out-of-distribution triggers and responses. These approaches exhibit inherent vulnerabilities: adversaries can efficiently detect and filter such fingerprints, rendering them ineffective under realistic attack settings. The "SIF: Semantically In-Distribution Fingerprints for Large Vision-LLMs" (2604.17041) paper presents a comprehensive analysis of these vulnerabilities and introduces SIF, a non-intrusive fingerprinting framework that addresses stealthiness and robustness, enabling reliable LVLM copyright verification.
Figure 1: Comparison between existing LVLM fingerprinting methods and SIF. SIF produces in-distribution, semantically coherent fingerprints, while prior approaches use semantically abnormal triggers and responses.
Analysis of Fingerprinting Vulnerabilities
The paper first formalizes the threat posed by adversaries who can access and deploy open-source LVLMs as black-box APIs, monitor queries, alter responses, or fine-tune or quantize the model. Within this context, existing fingerprinting strategies (e.g., Instruction Fingerprint, Proflingo, PLA) are shown to inject semantic irregularities, either through unnatural trigger inputs or through outputs not aligned with the input semantics. To systematically expose this flaw, the Semantic Divergence Attack (SDA) is proposed. SDA utilizes a reference model to detect queries and responses that exhibit abnormal perplexity or semantic divergence compared to expected distributions, then filters or replaces them to remove fingerprinting signals. This renders current state-of-the-art approaches fragile in adversarial settings.
Figure 2: SDA leverages a reference LVLM to detect suspicious fingerprint queries by measuring input perplexity and output divergence for robust fingerprint removal.
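The filtering idea behind SDA can be illustrated with a minimal sketch. It assumes the adversary already has per-token log-probabilities from a reference model for both the incoming query and the outgoing response. The function names, the thresholds, and the use of mean negative log-likelihood as a stand-in for "semantic divergence" are all illustrative assumptions, not details taken from the paper.

```python
import math

def perplexity(token_logprobs):
    """Perplexity computed from per-token natural-log probabilities."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def is_suspicious(query_logprobs, response_logprobs,
                  ppl_threshold=50.0, nll_threshold=4.0):
    """Flag a query/response pair as a likely fingerprint probe.

    query_logprobs:    reference-model log-probs of the query tokens.
    response_logprobs: reference-model log-probs of the response tokens
                       that the deployed model actually produced.
    Both thresholds are hypothetical and would need calibration on
    ordinary user traffic.
    """
    query_ppl = perplexity(query_logprobs)
    # Mean negative log-likelihood of the response under the reference
    # model, used here as a simple proxy for output divergence.
    response_nll = -sum(response_logprobs) / len(response_logprobs)
    return query_ppl > ppl_threshold or response_nll > nll_threshold

# A natural-looking exchange versus one with an unusual trigger query.
natural = [-1.0] * 10
unusual = [-6.0] * 10
print(is_suspicious(natural, natural))
print(is_suspicious(unusual, natural))
```

A fingerprint that relies on out-of-distribution triggers or responses would tend to trip one of the two checks, which is the weakness SIF is designed to avoid.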
SIF: Methodology Overview
SIF is designed as a non-intrusive black-box fingerprinting and ownership verification scheme for LVLMs, specifically targeting the deficiencies in stealth and robustness found in previous schemes.
Semantic-Aligned Fingerprint Distillation (SAFD)
SIF adapts decoding-based text watermarking techniques, traditionally used for content provenance, to the vision modality. In SAFD, imperceptible visual perturbations are added to natural images, producing trigger images conditioned to induce the generation of text responses that contain a statistically verifiable watermark. Crucially, these responses remain semantically coherent, preserving naturalness and in-distribution characteristics. The optimization objective balances watermark strength and cross-entropy with the teacher response, ensuring both detectability and semantic fidelity.
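One way to picture the trade-off in the SAFD objective is the toy loss below. It is a sketch under stated assumptions, not the paper's formulation: the watermark term is modeled as the average probability mass placed on a hypothetical set of watermark ("green") tokens, and `lam` is an assumed weighting coefficient.

```python
import math

def safd_loss(step_probs, teacher_tokens, green_tokens, lam=0.5):
    """Toy objective: cross-entropy to the teacher response minus a
    weighted watermark-strength term.

    step_probs:     one probability distribution (list over the vocab)
                    per generated position.
    teacher_tokens: token ids of the teacher response.
    green_tokens:   set of token ids favored by the watermark (assumed).
    lam:            hypothetical trade-off weight.
    """
    ce = 0.0
    green_mass = 0.0
    for probs, target in zip(step_probs, teacher_tokens):
        ce -= math.log(probs[target])
        green_mass += sum(p for tok, p in enumerate(probs)
                          if tok in green_tokens)
    n = len(teacher_tokens)
    # Lower is better: stay close to the teacher, favor watermark tokens.
    return ce / n - lam * green_mass / n

probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
print(safd_loss(probs, teacher_tokens=[0, 1], green_tokens={0, 1}))
```

In the actual method the distributions would come from the LVLM conditioned on the perturbed image, and the loss would be minimized with respect to the image perturbation rather than evaluated on fixed numbers.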
Figure 3: (a) Fingerprint construction with SAFD and RFO. (b) Copyright verification: a trigger image embedded with a watermark induces detectable fingerprint responses in the suspect LVLM, verifying ownership.
Robust-Fingerprint Optimization (RFO)
SIF introduces RFO to counteract post-release model modifications such as quantization and fine-tuning, which typically cause representation drift and deteriorate fingerprint reliability. RFO simulates worst-case embedding perturbations during the optimization process, producing trigger images whose ownership watermark remains detectable despite representation space shifts.
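The worst-case simulation can be sketched as an inner maximization over a small perturbation of the embedding. The example below is an assumption-laden toy: it uses a one-step sign-gradient approximation within an L-infinity ball of radius `rho`, a numerical gradient, and a quadratic stand-in loss. The paper's actual inner procedure is not described in this summary.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def numerical_grad(f, x, h=1e-5):
    """Central-difference gradient of f at the point x (a list)."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def worst_case_loss(loss_fn, embedding, rho=0.1):
    """Approximate the maximum of loss_fn over an L-infinity ball of
    radius rho around the embedding, using one sign-gradient step."""
    g = numerical_grad(loss_fn, embedding)
    perturbed = [e + rho * sign(gi) for e, gi in zip(embedding, g)]
    return loss_fn(perturbed), perturbed

# Stand-in fingerprint loss: squared distance to a target embedding.
target = [1.0, 0.0]
loss = lambda e: sum((a - b) ** 2 for a, b in zip(e, target))

clean = loss([0.5, -0.2])
worst, _ = worst_case_loss(loss, [0.5, -0.2], rho=0.1)
print(clean, worst)
```

Optimizing the trigger against the perturbed loss rather than the clean one is what would make it tolerant of the representation drift caused by quantization or fine-tuning.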
Experimental Results
Comprehensive experiments are conducted on LLaVA-1.5 and Qwen2.5-VL, encompassing quantization, full fine-tuning, input/output perturbations, and black-box adversarial settings. SIF is benchmarked against major prior art (e.g., IF, Proflingo, PLA).
Robustness to Model Modifications
SIF achieves superior Fingerprint Matching Rate (FMR) under quantization and fine-tuning. For LLaVA-1.5-7B, SIF attains FMR {0.49, 0.89} under 4/8-bit quantization, and remains effective under aggressive fine-tuning (e.g., LlavaMix, TikZ, MathV). Qwen2.5-VL results show similar trends, with SIF maintaining FMR ≥0.88 post-quantization and strong reliability across diverse downstream tasks. These results confirm that SIF's fingerprints are robust to both parameter compression and heavy model adaptation.
Figure 4: SIF's FMR across perturbation budgets and optimization steps, demonstrating stable robustness on Qwen2.5-VL-7B.
Stealthiness and Detection Resistance
SIF fingerprints are statistically indistinguishable from normal user queries by perplexity metrics and semantic similarity. Under the SDA framework with a reference LVLM, SIF achieves the highest FMR (see Table 2 in the paper), while prior methods are effectively neutralized. Notably, only 4% of SIF interactions are flagged as suspicious by a strong semantic judge (GPT-4.1), versus 86% for adversarial triggers like PLA, confirming SIF's stealthiness.
Figure 5: The SIF fingerprint query induces a semantically natural response in the suspect model carrying the watermark signal, whereas an unrelated model does not.
Reliability and Specificity
Evaluations on unrelated LVLMs from different model families confirm that SIF fingerprints do not falsely activate (FMR = 0), producing a clear detection gap and allaying concerns over false positive copyright claims.
Mechanistic and Implementation Insights
Watermark Construction
SIF leverages a uni-gram text watermarking key to construct a token-level watermark signal embedded by slight biases in generation logits, detectable by a z-score hypothesis test over the frequency of watermark tokens in output.
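The detection side of a uni-gram watermark can be sketched as follows. The key-to-token hashing scheme, the green-list fraction `gamma`, and the key string are assumptions made for illustration. Only the general shape of the test, a z-score on the count of watermark tokens, follows the description above.

```python
import hashlib
import math

def green_list(vocab_size, key, gamma=0.5):
    """Fixed (uni-gram) green list: membership depends only on the
    secret key and the token id, not on the preceding context."""
    green = set()
    for tok in range(vocab_size):
        digest = hashlib.sha256(f"{key}:{tok}".encode()).digest()
        if int.from_bytes(digest[:8], "big") / 2**64 < gamma:
            green.add(tok)
    return green

def z_score(tokens, green, gamma=0.5):
    """One-proportion z-statistic for the number of green tokens.
    Under the null hypothesis (no watermark) each token is green
    with probability gamma."""
    n = len(tokens)
    hits = sum(tok in green for tok in tokens)
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))

green = green_list(vocab_size=1000, key="hypothetical-secret")
watermarked = sorted(green)[:100]   # a response made only of green tokens
print(z_score(watermarked, green))
```

A verifier would compare the statistic against a threshold chosen for a target false-positive rate. Unwatermarked text is expected to score near zero, which is consistent with the reported FMR of 0 on unrelated models.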
Trigger Image Generation
The trigger construction employs Projected Gradient Descent (PGD) with perturbation budgets (e.g., ε = 16/255), ensuring visual imperceptibility while consistently activating the semantic-aligned watermark under standard decoding.
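A minimal sketch of the PGD loop, assuming an L-infinity budget and pixel values in [0, 1], looks like this. The step size, the number of steps, and the toy gradient function are illustrative assumptions. In practice the gradient would come from backpropagating the fingerprint loss through the LVLM.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def pgd(image, grad_fn, eps=16 / 255, step=2 / 255, n_steps=10):
    """Projected gradient descent on a flat list of pixel values.

    grad_fn returns the gradient of the loss to minimize with respect
    to the current pixels. After every step the result is projected
    back into the eps-ball around the original image and clipped to
    the valid pixel range.
    """
    adv = list(image)
    for _ in range(n_steps):
        g = grad_fn(adv)
        adv = [a - step * sign(gi) for a, gi in zip(adv, g)]
        adv = [min(max(a, x - eps), x + eps) for a, x in zip(adv, image)]
        adv = [min(max(a, 0.0), 1.0) for a in adv]
    return adv

# Toy loss: squared distance to a target pattern, so grad = 2 * (p - t).
target = [1.0, 0.0, 0.5]
grad_fn = lambda pixels: [2 * (p - t) for p, t in zip(pixels, target)]

image = [0.5, 0.5, 0.5]
trigger = pgd(image, grad_fn)
print(max(abs(a - x) for a, x in zip(trigger, image)))
```

The projection step is what keeps the perturbation within the stated budget regardless of how many optimization steps are run.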
Attack Resilience
SIF demonstrates resilience against state-of-the-art watermark removal attacks by virtue of query-level secret keys, minimal fingerprint queries, and inherent blending with user behavior, limiting adversaries' capacity to mount statistical removal or filtering attacks.
Implications and Future Directions
SIF's semantically in-distribution, non-intrusive fingerprinting mechanism significantly narrows the gap in reliable LVLM copyright verification for open-source distribution contexts. By removing the reliance on model parameter access and using semantically coherent, robust trigger queries, SIF offers a practical tool for ownership screening in adversarial and commercial exploitation settings.
Potential future directions include extension to video and image generation models, which will require new strategies to encode temporally consistent watermarks. SIF also motivates further research into defensive mechanisms for generalized robust, undetectable fingerprinting in foundation models. However, as acknowledged by the authors, SIF alone is not sufficient for legal proof of ownership, and should be complemented by additional documentation if used in litigation or formal disputes.
Conclusion
SIF (Semantically In-Distribution Fingerprints) establishes a robust, stealthy, and non-intrusive framework for LVLM ownership verification. It outperforms prior fingerprinting schemes in resilience against adversarial removal, quantization, fine-tuning, and input/output perturbations, while maintaining undetectability to semantic divergence attacks and adversarial filtering. The methodology substantially advances the practical state of large-scale model copyright protection, with clear theoretical and applied research implications for the broader AI security and IP tracking landscape.