Déjà Vu in SSL: Memorization & Privacy Risks
- Déjà Vu is defined as image-specific memorization in SSL, where models retain unique details allowing reconstruction beyond general correlations.
- Experiments reveal that SSL models, especially with VICReg, can achieve high memorization scores (e.g., DV rates up to 33%), highlighting significant privacy vulnerabilities.
- Mitigation strategies such as hyperparameter tuning, early stopping, and guillotine regularization are crucial for reducing memorization leakage and protecting sensitive information.
Déjà Vu refers to a measurable and consequential phenomenon in self-supervised learning (SSL) where neural networks memorize image-specific details instead of learning purely semantic or distributional associations. Unlike classical memorization or correlation, Déjà Vu memorization means the model embeds enough information about individual training images to allow recovery of unique aspects of those images, even from input regions that do not visually contain those aspects. This effect is not detected by conventional SSL evaluation metrics; it raises new privacy risks and necessitates more comprehensive evaluation and mitigation protocols (Meehan et al., 2023).
1. Formal Definition and Operationalization
Déjà Vu memorization in SSL models is defined as the retention of information so specific to an individual training image that, given only a crop containing the background, a model can accurately infer or reconstruct foreground content purely due to exposure to that particular image during training, beyond what would be inferred from statistical correlation alone.
Given a distribution over images, let p(y | crop(x)) quantify the base-rate probability that a background patch crop(x) implies foreground class y. Déjà Vu memorization is present when a model trained on a set A containing x can predict y from crop(x) substantially better than a reference model trained on disjoint data (never exposed to x), indicating that the retained information stems not from generalization but from sample-specific memorization.
To quantify this, two models, SSL_A and SSL_B, are trained with identical protocols on disjoint sets A and B. For each background crop crop(A_i) of an image A_i in A, each model's embedding is used to perform k-nearest-neighbor (KNN) retrieval from a large public set X, and success is judged by whether the correct class label is recovered.
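The retrieval step can be sketched as follows; this is a minimal NumPy sketch in which `knn_label_inference`, the cosine-similarity retrieval, and the majority vote are illustrative stand-ins, not the authors' released pipeline.

```python
import numpy as np

def knn_label_inference(crop_embs, public_embs, public_labels, k=5):
    """For each background-crop embedding, retrieve the k nearest
    public-set embeddings (by cosine similarity) and predict the
    majority label among those neighbors."""
    # L2-normalize so the dot product equals cosine similarity
    crop_embs = crop_embs / np.linalg.norm(crop_embs, axis=1, keepdims=True)
    public_embs = public_embs / np.linalg.norm(public_embs, axis=1, keepdims=True)
    sims = crop_embs @ public_embs.T                  # (n_crops, n_public)
    nn_idx = np.argsort(-sims, axis=1)[:, :k]         # top-k neighbor indices
    preds = []
    for row in public_labels[nn_idx]:                 # (k,) neighbor labels per crop
        vals, counts = np.unique(row, return_counts=True)
        preds.append(vals[np.argmax(counts)])         # majority vote
    return np.array(preds)
```

Running this once per model (SSL_A and SSL_B) on the same crops yields the two sets of predictions that the Déjà Vu comparison is built on.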
Key metric:

DV(p) = Acc_A(p) − Acc_B(p)

where, at a chosen confidence percentile p (typically 20%), Acc_A(p) is the KNN accuracy on the p% of crops about which SSL_A is most confident, and Acc_B(p) is the accuracy of SSL_B on the same images under the same protocol. The score thus isolates the portion of retrieval success due to memorization, controlling for natural correlation structure.
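Given per-example confidences and correctness flags for both models, the score reduces to an accuracy gap on the most confident subset. The sketch below assumes these arrays exist; `dejavu_score` is an illustrative name, and the choice of confidence measure (e.g., negative entropy of the KNN neighbor-label distribution) is an assumption, not prescribed here.

```python
import numpy as np

def dejavu_score(conf_a, correct_a, correct_b, pct=20):
    """DV(p) = Acc_A(p) - Acc_B(p): accuracy gap between the target
    model SSL_A and reference model SSL_B, evaluated on the pct% of
    examples about which SSL_A is most confident."""
    n_top = max(1, int(len(conf_a) * pct / 100))
    top = np.argsort(-conf_a)[:n_top]     # indices of most confident examples
    acc_a = correct_a[top].mean()         # target-model KNN accuracy
    acc_b = correct_b[top].mean()         # reference-model KNN accuracy on same images
    return acc_a - acc_b
```

A score near zero means SSL_A does no better than a model that never saw these images; a large positive score indicates sample-specific memorization.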
2. Experimental Characterization
2.1 Experimental Setup
- ImageNet-1K (face-blurred), ~456K annotated images.
- Dataset splits: A (target, 75K–150K images), B (reference, disjoint, balanced), X (public set, ~500K images, no bounding-box overlap).
- Backbones: ResNet (18/50/101), ViT (Tiny/Small/Base/Large).
- SSL objectives: SimCLR (InfoNCE loss), VICReg, Barlow Twins, BYOL, DINO, and a supervised baseline.
2.2 Label Inference and Memorization Scores
Models were trained for up to 1,000 epochs. Under the VICReg criterion, the target model's KNN accuracy on the top 1% most confident crops far exceeded the reference model's, yielding DV gaps of up to 33%. SimCLR and BYOL showed substantially less memorization (gaps of 10% and 8%, respectively), with supervised baselines around 9%.
2.3 Visual Reconstructions
A representation-conditioned diffusion model (RCDM) was trained on public-set embeddings to invert backbone representations. Feeding only background crops of A through SSL_A and decoding with the RCDM enabled high-fidelity recovery of object regions uniquely characteristic of specific training samples (e.g., a particular black swan, a specific dam, or an aircraft-carrier variant), an ability not present in SSL_B. This establishes that memorization in SSL can encode fine-grained, image-level details.
2.4 Sample-Level Analysis
Each example is classified as:
- Memorized: KNN correct under SSL_A only
- Correlated: KNN correct by both
- Misrepresented: KNN correct under SSL_B only
- Unassociated: KNN incorrect by both
As training proceeds (up to 1,000 epochs), the memorized fraction for VICReg approaches 15%, whereas the supervised baseline remains near 4%.
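The four-way partition above can be computed directly from the two models' per-example correctness; `partition_examples` is an illustrative helper, not part of the released code.

```python
import numpy as np

def partition_examples(correct_a, correct_b):
    """Split examples into the four Déjà Vu categories based on whether
    the target (SSL_A) and reference (SSL_B) KNN predictions are correct.
    Returns the fraction of examples in each category."""
    correct_a = np.asarray(correct_a, dtype=bool)
    correct_b = np.asarray(correct_b, dtype=bool)
    return {
        "memorized":      np.mean(correct_a & ~correct_b),   # A only
        "correlated":     np.mean(correct_a & correct_b),    # both
        "misrepresented": np.mean(~correct_a & correct_b),   # B only
        "unassociated":   np.mean(~correct_a & ~correct_b),  # neither
    }
```

Tracking the "memorized" fraction across training checkpoints gives the epoch-wise curves described above.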
3. Impact of Design Choices
3.1 Training Regimen and Model Scale
- Longer training (epochs) sharply increases memorization, even as linear-probe accuracy on downstream tasks changes minimally.
- Larger model capacity (ResNet/ViT size) exacerbates memorization: DV grows with parameter count.
- Larger datasets do not necessarily diminish memorization: DV is stable from 100K to 500K training items, while linear-probe gap halves.
3.2 Algorithm and Hyperparameters
- VICReg's invariance weight and SimCLR's temperature show tight "memorization bands": moving these hyperparameters outside a narrow range collapses DV to near baseline, often with less than a 2% impact on probe accuracy.
- Barlow Twins and DINO also leak, though with different sensitivity profiles.
3.3 Architectural Choices
- Guillotine regularization (removing the projection head and releasing only backbone features) reduces DV nearly to zero for VICReg.
- Shallower projectors (as in SimCLR) can still leak through the backbone.
- Fine-tuning briefly reduces DV but then it regrows with continued optimization.
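Guillotine regularization amounts to cutting the network at the backbone/projector boundary at release time. The toy architecture below is a minimal PyTorch sketch under that assumption, not the paper's actual models.

```python
import torch
import torch.nn as nn

# Illustrative SSL network: a backbone followed by a projector head.
# During training, the loss is applied to the projector's output.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU())
projector = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 128))
model = nn.Sequential(backbone, projector)

# Guillotine regularization at release time: discard the projector and
# expose only backbone features, which carry less image-specific detail.
released = backbone  # what gets shipped / used for downstream KNN evaluation

x = torch.randn(4, 3, 32, 32)
feats = released(x)  # 512-d backbone features, not 128-d projector outputs
```

The full `model` is still what gets optimized; only the truncated `released` module leaves the training environment.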
4. Privacy Risks and Mitigations
4.1 Identified Risks
- Non-trivial risk of foreground recognition, or even pixel-level visual reconstruction, by adversaries holding only background regions of training images.
- Sub-class private attributes (such as species or object pose), not just coarse class, are at risk.
- Standard SSL quality metrics and linear-probe performance do not correlate with memorization risk.
- Membership and attribute inference attacks are greatly enabled for memorized images.
4.2 Mitigation Strategies
- Hyperparameter tuning: Move loss/invariance weights or temperature out of high-memorization regions.
- Early stopping: Avoid extended training into high-DV regimes.
- Guillotine regularization: Release only shallow backbone representations or truncate projector depth.
- Formal DP training: Use differentially private optimization or noise injection.
- Monitoring: Periodically run A/B/KNN tests to detect memorization during model development.
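As one concrete instance of the DP-training mitigation, a DP-SGD-style step clips each per-example gradient to bound any single image's influence, then adds Gaussian noise before updating. `dp_sgd_step` is a simplified single-tensor sketch; a real deployment would use a library such as Opacus and a privacy accountant to track the actual (ε, δ) guarantee.

```python
import torch

def dp_sgd_step(param, per_sample_grads, lr=0.1, clip_norm=1.0, noise_std=0.5):
    """One DP-SGD-style update on a single parameter tensor: clip each
    per-example gradient, average, add Gaussian noise, and step.
    (A formal DP guarantee also requires privacy accounting.)"""
    clipped = []
    for g in per_sample_grads:
        # Scale down any gradient whose norm exceeds clip_norm
        scale = min(1.0, clip_norm / (g.norm().item() + 1e-12))
        clipped.append(g * scale)
    avg = torch.stack(clipped).mean(dim=0)
    # Noise is calibrated to the clipping bound and batch size
    noise = noise_std * clip_norm / len(per_sample_grads) * torch.randn_like(avg)
    return param - lr * (avg + noise)
```

The clipping bound is what directly limits how much a single memorizable image can shape the representation; the noise then masks the residual per-sample signal.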
5. Code and Reproducibility
All scripts, splits, and evaluation pipelines are open-source (https://github.com/facebookresearch/DejaVu). The evaluation pipeline builds on PyTorch and FFCV-SSL and provides routines for SSL model training, KNN evaluation, and reconstruction. Recommendations include regular monitoring via the DV score, aggressive hyperparameter tuning for privacy, and prudent release of backbone features rather than projector outputs.
6. Broader Implications and Recommendations
Déjà Vu memorization reveals a fundamental tension in SSL: striving for contrastive invariance, or pulling together augmentations, can lead the network to encode idiosyncratic image details, producing sample-specific overfitting beyond the known risks of supervised learning. Empirically, strong SSL objectives (VICReg, Barlow Twins) can expose a double-digit percentage of training samples to potential leakage, far above supervised baselines, despite similar downstream benchmark performance.
For SSL practitioners in privacy-critical domains, it is essential to:
- Treat linear-probe accuracy and standard test-set metrics as insufficient for privacy guarantees.
- Incorporate explicit Déjà Vu memorization diagnostics (target/reference KNN) in validation.
- Take a conservative approach to model and parameter selection, prioritizing both utility and privacy.
Déjà Vu analysis thus motivates a necessary extension of quality assurance in SSL, highlighting new axes of model evaluation and procedures for safe deployment in sensitive applications (Meehan et al., 2023).