Variational Self-Supervised Contrastive Learning Using Beta Divergence (2312.00824v3)
Abstract: Learning a discriminative semantic space from unlabelled and noisy data remains unaddressed in the multi-label setting. We present a contrastive self-supervised learning method that is robust to data noise and grounded in variational methods. The method, VCL, uses variational contrastive learning with beta-divergence to learn robustly from unlabelled datasets, including uncurated and noisy ones. We demonstrate its effectiveness through rigorous experiments covering linear evaluation and fine-tuning scenarios on multi-label datasets in the face understanding domain. In almost all tested scenarios, VCL outperforms state-of-the-art self-supervised methods in accuracy.
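As a rough illustration of the beta-divergence named in the abstract, the sketch below evaluates one common discrete form, the density power divergence, and compares it with the KL divergence. The abstract does not state the exact loss used by VCL, so the function names, the discrete setting, and the example distributions here are all assumptions made for illustration only.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as sequences of probabilities."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def beta_divergence(p, q, beta):
    """Discrete density power (beta) divergence between p and q, for beta > 0.

    d_beta(p, q) = sum_i [ q_i^(1+beta)
                           - (1 + 1/beta) * p_i * q_i^beta
                           + (1/beta) * p_i^(1+beta) ]

    This is an assumed textbook form, not necessarily the paper's loss.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    return sum(
        qi ** (1 + beta)
        - (1 + 1 / beta) * pi * qi ** beta
        + (1 / beta) * pi ** (1 + beta)
        for pi, qi in zip(p, q)
    )


# Hypothetical example distributions (not taken from the paper).
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

print(beta_divergence(p, q, 1.0))   # squared Euclidean distance at beta = 1
print(beta_divergence(p, q, 1e-6))  # approaches KL(p || q) as beta -> 0
print(kl_divergence(p, q))
```

In this form the divergence is zero when the two distributions coincide, equals the squared Euclidean distance at beta = 1, and tends to the KL divergence as beta approaches 0. Larger beta values give less weight to low-probability points, which is the usual motivation for using the divergence when the data contain outliers or noise.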