Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning (2304.13850v3)
Abstract: Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another. However, when taken to the extreme, SSL models can unintentionally memorize specific parts of individual training samples rather than learning semantically meaningful associations. In this work, we perform a systematic study of the unintended memorization of image-specific information in SSL models -- which we refer to as déjà vu memorization. Concretely, we show that given the trained model and a crop of a training image containing only the background (e.g., water, sky, grass), it is possible to infer the foreground object with high accuracy or even visually reconstruct it. Furthermore, we show that déjà vu memorization is common to different SSL algorithms, is exacerbated by certain design choices, and cannot be detected by conventional techniques for evaluating representation quality. Our study of déjà vu memorization reveals previously unknown privacy risks in SSL models and suggests potential practical mitigation strategies. Code is available at https://github.com/facebookresearch/DejaVu.
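To make the label-inference attack described above concrete, here is a minimal sketch of one way such a test could be run: embed the background-only crop with the frozen SSL encoder, find its nearest neighbors in a labeled "public" image set, and take their majority label as the guess for the hidden foreground object. All names (`backbone`, `crop`, `public_images`, `public_labels`, `k`) are illustrative assumptions rather than the paper's exact interface; the official protocol lives in the linked repository.

```python
# Hedged sketch of a deja-vu style label-inference test (not the paper's exact code).
import torch
import torch.nn.functional as F


@torch.no_grad()
def infer_foreground_label(backbone, crop, public_images, public_labels, k=100):
    """Guess the foreground class of a training image from a background-only crop.

    backbone:      frozen SSL encoder mapping images to embeddings (nn.Module)
    crop:          background-only crop of one training image, shape (3, H, W)
    public_images: images the adversary can label, shape (N, 3, H, W)
    public_labels: their class labels, shape (N,)
    """
    backbone.eval()
    # Embed the crop and the public set; L2-normalize so dot products are cosine similarities.
    query = F.normalize(backbone(crop.unsqueeze(0)), dim=-1)   # (1, D)
    refs = F.normalize(backbone(public_images), dim=-1)        # (N, D)
    # k nearest public neighbors of the background crop in embedding space.
    sims = query @ refs.T                                      # (1, N)
    knn_idx = sims.topk(k, dim=-1).indices.squeeze(0)          # (k,)
    # Majority vote over neighbor labels = inferred foreground object.
    return torch.mode(public_labels[knn_idx]).values.item()
```

If this guess matches the true foreground class far more often than generic background-object correlation could explain (e.g., compared with a model that never saw the target image), that gap is evidence of déjà vu memorization of the specific training sample.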