
SLImE: Semantic Leakage from Image Embeddings

Updated 6 February 2026
  • SLImE is a phenomenon where compressed image embeddings inadvertently retain rich semantic structure, allowing inference of tags, captions, and scene graphs without full image recovery.
  • It utilizes a two-stage pipeline that aligns victim embeddings to an attack space via a linear mapping, enabling effective retrieval of semantic neighborhoods even with minimal alignment samples.
  • Empirical evaluations reveal high semantic recovery metrics across multiple models, prompting research into mitigation strategies like differential privacy, watermarking, and embedding sanitization.

Semantic Leakage from Image Embeddings (SLImE) designates the phenomenon whereby compressed image embeddings, absent access to the original images or encoder, still expose substantial semantic structure identifiable via standalone analysis. SLImE formalizes an attack scenario in which alignment and retrieval operations on image embeddings enable the inference of objects, relationships, and even grammatically coherent descriptions. The critical vulnerability lies in the preservation of semantic neighborhoods under linear or nonlinear mappings, facilitating the propagation of semantic content through sequences of lossy transformations. This mechanism renders image embeddings intrinsically susceptible to privacy risks regardless of pixel-level invertibility or downstream task specialization (Chen et al., 30 Jan 2026).

1. Formalization of Semantic Leakage

Consider an image encoding scheme $f_V: X \to \mathbb{R}^m$, mapping images $x \in X$ to $m$-dimensional, L2-normalized embeddings in a "victim" space $V$. Let $f_A$ denote an "attack" encoder producing $n$-dimensional attack-space embeddings. Semantic leakage is defined as the ability to reconstruct semantic content (e.g., tags or captions) from $f_V(x)$ by mapping into $A$ and using retrieval or generation methods, without inverting the embedding to recover $x$ itself.

Core Definitions

  • Linear Alignment: The attacker fits a linear mapping $W \in \mathbb{R}^{m \times n}$ such that, for any $x \in X$,

$$f_V(x)\, W \approx f_A(x),$$

with $W$ given by the Moore–Penrose pseudoinverse solution

$$W = E_V^{+} E_A$$

over a small set of aligned pairs, where the rows of $E_V$ and $E_A$ are the victim and attack embeddings of those pairs.

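  • Semantic Neighborhoods: For a tag vocabulary $T$ with tag embeddings $e_t$, $t \in T$, the neighborhood of a tag $t$ at a given scale is the set of tags in $T$ whose embeddings are most similar to $e_t$, where similarity $\mathrm{sim}(\cdot, \cdot)$ is cosine similarity and the scale parameter sets how far the neighborhood extends.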

  • Semantic Neighborhood Preservation: After alignment, for each image $x$, the set of Top-$k$ tags retrieved from the aligned embedding $f_V(x) W$ is said to preserve neighborhoods at a given scale if every tag in that set falls within the neighborhood, at that scale, of a reference tag (one of the Top-$k$ tags retrieved from the attack embedding $f_A(x)$).
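
The linear alignment step can be illustrated with a minimal NumPy sketch. It assumes aligned pairs are stacked as rows and uses purely synthetic embeddings; the names `fit_alignment` and `l2_normalize`, the dimensions, and the shared-latent data generator are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def fit_alignment(victim_emb, attack_emb):
    """Least-squares W (m x n) with victim_emb @ W ~= attack_emb, via the pseudoinverse."""
    return np.linalg.pinv(victim_emb) @ attack_emb

rng = np.random.default_rng(0)
m, n, num_pairs = 32, 24, 200

# Synthetic stand-in for real encoders: both spaces are views of one latent space.
latent = rng.normal(size=(num_pairs, 16))
victim = l2_normalize(latent @ rng.normal(size=(16, m)))
attack = l2_normalize(latent @ rng.normal(size=(16, n)))

W = fit_alignment(victim, attack)
mapped = l2_normalize(victim @ W)

# Mean cosine similarity between mapped victim embeddings and attack embeddings.
mean_cos = float(np.mean(np.sum(mapped * attack, axis=1)))
print(W.shape, round(mean_cos, 3))
```

With real encoders the two embedding matrices would come from running both models on the same small set of images; the pseudoinverse then gives the minimum-norm least-squares solution even when there are fewer pairs than dimensions.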

Leakage Proposition

The intrinsic vulnerability arises when local semantic neighborhood structure is preserved under the alignment mapping $W$; this alone suffices to reconstruct meaningful high-level semantics even when exact image or label recovery is impossible (Chen et al., 30 Jan 2026).
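
A preservation check of this kind might be sketched as follows. The sketch assumes a neighborhood is the set of `scale` most cosine-similar tags (a similarity threshold would serve equally well); the function names and the toy tag embeddings are hypothetical.

```python
import numpy as np

def cosine_to_all(tag_emb, tag_idx):
    """Cosine similarity between one tag and every tag in the vocabulary."""
    unit = tag_emb / np.linalg.norm(tag_emb, axis=1, keepdims=True)
    return unit @ unit[tag_idx]

def neighborhood(tag_emb, tag_idx, scale):
    """Assumed definition: the `scale` tags most similar to tag_idx (itself included)."""
    sims = cosine_to_all(tag_emb, tag_idx)
    return set(np.argsort(-sims)[:scale].tolist())

def preserves_neighborhoods(retrieved, reference, tag_emb, scale):
    """True if every retrieved tag lies in the neighborhood of some reference tag."""
    allowed = set()
    for ref in reference:
        allowed |= neighborhood(tag_emb, ref, scale)
    return all(tag in allowed for tag in retrieved)

# Toy vocabulary: tags 0 and 1 are near each other, as are tags 2 and 3.
tag_emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(preserves_neighborhoods([1], [0], tag_emb, scale=2))  # near-miss tag accepted
print(preserves_neighborhoods([3], [0], tag_emb, scale=2))  # unrelated tag rejected
```

The point of the relaxed criterion is that retrieving a near-synonym of the reference tag still counts as leakage, even though it would fail an exact-match test.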

2. The Few-TEI Inference Framework

The Few-TEI framework operationalizes semantic leakage via a two-stage pipeline:

Stage 1 – Training a Local Retriever:

  1. Parse captions into structured tags (relational and attribute tuples) using a public (image, caption) corpus.
  2. Contrastively align image and tag embeddings with a loss that pulls matched image and tag embeddings together and pushes mismatched ones apart.
  3. Train a ranking module (DCN v2) on interaction features to promote hard negative discrimination.
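
The exact loss is not reproduced here. As an illustration of contrastive image-tag alignment in general, the following sketch computes a standard image-to-tag InfoNCE loss; the choice of InfoNCE, the temperature value, and the function name `info_nce` are assumptions rather than the paper's formulation.

```python
import numpy as np

def info_nce(image_emb, tag_emb, temperature=0.07):
    """Mean image-to-tag InfoNCE loss; row i of each array is a matched pair."""
    img = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    tag = tag_emb / np.linalg.norm(tag_emb, axis=1, keepdims=True)
    logits = img @ tag.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The diagonal holds the log-probability assigned to each correct pairing.
    return float(-np.mean(np.diag(log_probs)))

aligned = info_nce(np.eye(4), np.eye(4))
shuffled = info_nce(np.eye(4), np.roll(np.eye(4), 1, axis=0))
print(aligned < shuffled)
```

Other rows in the batch act as negatives, so the loss is small only when each image embedding is closer to its own tag embedding than to the others.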

Stage 2 – Inference and Attacks:

  • Align victim embeddings to the attack space via the linear mapping $W$.
  • For each victim embedding $f_V(x)$, compute the aligned embedding $f_V(x) W$.
  • Retrieve the Top-$k$ tags for the aligned embedding.
  • Feed tags to an LLM or VLM to generate grammatical captions or structured scene graphs.
  • Optionally, pass the recovered semantic content to a diffusion model to synthesize a low-fidelity image.
  • Apply adaptive vision-language attacks by extracting detected objects, relations, and scene graphs from LLM/VLM outputs.

All steps operate solely on the standalone embeddings, without task-specific decoders or direct access to original pixels.
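
The align-then-retrieve portion of Stage 2 could look roughly like the sketch below. It assumes tag embeddings already live in the attack space and uses plain cosine ranking in place of the trained retriever and ranking module; `retrieve_tags` and the toy vocabulary are hypothetical, and the LLM/VLM and diffusion steps are omitted.

```python
import numpy as np

def retrieve_tags(victim_emb, W, tag_emb, tag_names, k=3):
    """Map victim embeddings into the attack space and return the Top-k tags per row."""
    mapped = victim_emb @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    tags = tag_emb / np.linalg.norm(tag_emb, axis=1, keepdims=True)
    scores = mapped @ tags.T                       # cosine similarity to every tag
    order = np.argsort(-scores, axis=1)[:, :k]     # best-scoring tags first
    return [[tag_names[j] for j in row] for row in order]

# Toy setup: identity alignment and one-hot tag embeddings.
tag_names = ["dog", "ball", "grass"]
W = np.eye(3)
victim = np.array([[0.9, 0.4, 0.1]])
print(retrieve_tags(victim, W, np.eye(3), tag_names, k=2))
```

The returned tag lists are what would be handed to a language model for caption or scene-graph generation.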

3. Empirical Evaluation and Observed Leakage

SLImE has been validated across widely used embedding models, both proprietary (Gemini, Cohere) and open-source (Nomic, CLIP), and across diverse data domains (COCO, nocaps).

Key Result Metrics

  • Tag Retrieval: Exact-match F1 scores are typically low, but semantic neighborhood F1 rises markedly with 10k alignment samples; even a single alignment sample produces nontrivial leakage as measured by ROUGE-L.
  • Text Reconstruction: With 10k alignment samples, reconstructed text is scored with ROUGE-L against LLM captions generated from the reference tags and reaches ROUGE-L of roughly 10–30 against human captions.
  • Adaptive Attacks: Scene-graph F1 and object/relation F1 are measured against LLM/VLM extraction outputs when using low-fidelity reconstructed images and tag sets.
  • Cross-Domain: BLEU-4 and ROUGE-L degrade moderately in the out-of-domain setting but remain significantly above trivial baselines.
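
The gap between exact-match and neighborhood-level scoring can be illustrated with a small set-based sketch. The relaxed matching rule and the hand-written `neighbors` table are assumptions made for illustration; the paper's metric may be defined differently.

```python
def exact_f1(predicted, reference):
    """Standard set-based F1 over exactly matching tags."""
    pred, ref = set(predicted), set(reference)
    hits = len(pred & ref)
    if hits == 0:
        return 0.0
    precision, recall = hits / len(pred), hits / len(ref)
    return 2 * precision * recall / (precision + recall)

def neighborhood_f1(predicted, reference, neighbors):
    """F1 where a tag also matches any tag listed as its semantic neighbor."""
    pred, ref = set(predicted), set(reference)
    if not pred or not ref:
        return 0.0

    def close(a, b):
        return a == b or b in neighbors.get(a, set()) or a in neighbors.get(b, set())

    precision = sum(any(close(p, r) for r in ref) for p in pred) / len(pred)
    recall = sum(any(close(r, p) for p in pred) for r in ref) / len(ref)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

predicted = ["puppy", "ball", "sky"]
reference = ["dog", "ball", "grass"]
neighbors = {"dog": {"puppy", "hound"}}  # hypothetical neighbor table
print(exact_f1(predicted, reference), neighborhood_f1(predicted, reference, neighbors))
```

In this toy case "puppy" is a miss under exact matching but a hit under the relaxed rule, which is why neighborhood-level scores can be much higher than exact-match scores for the same retrieval output.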

Notably, increasing alignment sample size increases cosine similarity between attack and mapped embeddings and improves all downstream semantic recovery metrics smoothly.
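
The dependence on alignment sample size can be explored on synthetic data with a sketch like the one below. The data generator (a random ground-truth linear relation plus noise), the dimensions, and the name `heldout_cosine` are assumptions; the sketch does not reproduce the reported experiments.

```python
import numpy as np

def unit_rows(x):
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def heldout_cosine(num_align, m=32, n=24, num_test=500, noise=0.05, seed=0):
    """Fit W on num_align pairs, then report mean cosine on held-out pairs."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=(m, n))  # hypothetical relation between the two spaces

    def sample(count):
        victim = unit_rows(rng.normal(size=(count, m)))
        attack = unit_rows(victim @ w_true + noise * rng.normal(size=(count, n)))
        return victim, attack

    v_fit, a_fit = sample(num_align)
    v_test, a_test = sample(num_test)
    w_hat = np.linalg.pinv(v_fit) @ a_fit
    mapped = unit_rows(v_test @ w_hat)
    return float(np.mean(np.sum(mapped * a_test, axis=1)))

results = {k: heldout_cosine(k) for k in (1, 10, 100, 1000)}
for k, cos in results.items():
    print(f"{k:>5} alignment pairs -> mean held-out cosine {cos:.3f}")
```

Evaluating on held-out pairs matters here: with fewer pairs than dimensions the fit is exact on the alignment set itself, so only held-out similarity reflects how well the mapping generalizes.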

4. Theoretical Insights, Security Implications, and Countermeasures

Semantic leakage persists under severe compression, with alignment from as few as one seed sample, and without decoder access. The core risk emerges from the deliberate optimization of image embeddings for retrieval, which enforces local neighborhood preservation by design. This property enables the recovery of semantic content by "neighborhood hopping" in the aligned space.

Mitigation Directions

  • Semantic-level Differential Privacy: Modifying the embedding distribution to disrupt the correspondence of neighborhoods and private tags, thereby degrading inference without total utility loss.
  • Watermarking/Adversarial Perturbation: Introducing structured, targeted distortions that selectively impair alignment while (ideally) preserving task-relevant semantics.
  • Embedding-space Sanitization: Analogous to recent developments in text embedding privacy, the challenge is to adjust visual embeddings post-hoc or at training time to counter attacks, without erasing all downstream value.

An open problem is formulating a quantitative trade-off between semantic utility and privacy leakage in retrieval-oriented representations.

5. Relation to Broader Semantic Leakage and Dense Representation Risks

The SLImE framework generalizes prior concerns about semantic leakage in visual semantic embedding models for zero-shot learning (ZSL). In that context, semantic leakage referred to label/word-embedding information being inadvertently "baked in" during encoder training, as measured by the mutual information between the learned features and the label or word embeddings exceeding zero (Jiao et al., 2021). Recent work has shown that this risk persists even when supervision is ostensibly absent, owing to information-rich geometric alignment between features and distributed word spaces.

Further distinctions appear in settings such as attribute leakage in text-to-image editing, where cross-object correlations and attention bleed-through have prompted sophisticated architectural interventions (e.g., ORE, RGB-CAM, BB) to spatially disentangle semantics (Mun et al., 2024). However, SLImE demonstrates that manifold preservation in compressed embedding spaces—absent explicit label or token access—remains a core privacy vulnerability distinct from instance-level pixel recovery or attribute drift.

6. Open Problems and Future Directions

Current research underscores the challenge of defending against semantic-level inference attacks given the default emphasis on retrieval efficacy and neighborhood geometry. There is no evidence that restricting API access or limiting pixelwise reconstruction is sufficient to impede SLImE-type attacks. The practical and theoretical boundaries of utility-preserving privacy in multimodal embedding spaces remain unresolved. Achieving robust semantic privacy likely requires fundamentally new representation learning paradigms capable of blunting neighborhood preservation relative to sensitive semantic attributes, without compromising retrieval or transfer for permissible tasks (Chen et al., 30 Jan 2026).

References (3)
