Manifold Contrastive Learning with Variational Lie Group Operators (2306.13544v1)
Abstract: Self-supervised learning of deep neural networks has become a prevalent paradigm for learning representations that transfer to a variety of downstream tasks. As in proposed models of the ventral stream of biological vision, these networks are observed to separate category manifolds in the representations of the penultimate layer. Although this observation matches the manifold hypothesis of representation learning, current self-supervised approaches are limited in their ability to explicitly model this manifold. Indeed, current approaches often only apply augmentations from a pre-specified set of "positive pairs" during learning. In this work, we propose a contrastive learning approach that directly models the latent manifold using Lie group operators parameterized by coefficients with a sparsity-promoting prior. A variational distribution over these coefficients provides a generative model of the manifold, whose samples provide feature augmentations applicable during both contrastive training and downstream tasks. Additionally, the learned coefficient distributions quantify which transformations are most likely at each point on the manifold while preserving identity. We demonstrate benefits in self-supervised benchmarks for image datasets, as well as a downstream semi-supervised task. In the former case, we demonstrate that the proposed methods can effectively apply manifold feature augmentations and improve learning both with and without a projection head. In the latter case, we demonstrate that feature augmentations sampled from learned Lie group operators can improve classification performance when using few labels.
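The feature augmentation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the Laplace distribution, the soft-thresholding step, the skew-symmetric random generators, and all names (`expm`, `soft_threshold`, `sample_coefficients`, `augment`) are assumptions chosen for illustration. The sketch draws sparse coefficients c, forms the operator exp(sum_m c_m A_m) from a set of generators A_m, and applies it to a latent feature vector z.

```python
import numpy as np


def expm(M, terms=18):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    norm = np.linalg.norm(M, ord=1)
    # Scale M down so the Taylor series converges quickly.
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    A = M / (2 ** s)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        result = result + term
    # Undo the scaling by repeated squaring.
    for _ in range(s):
        result = result @ result
    return result


def soft_threshold(c, lam):
    """Shrink coefficients toward zero; entries with |c| <= lam become exactly 0."""
    return np.sign(c) * np.maximum(np.abs(c) - lam, 0.0)


def sample_coefficients(rng, num_ops, scale=0.5, lam=0.3):
    """Assumed sparsity-promoting sampler: Laplace draws followed by soft-thresholding."""
    raw = rng.laplace(loc=0.0, scale=scale, size=num_ops)
    return soft_threshold(raw, lam)


def augment(z, generators, coeffs):
    """Apply the Lie group operator exp(sum_m c_m A_m) to the feature vector z."""
    A = np.tensordot(coeffs, generators, axes=1)  # (M,) with (M, d, d) -> (d, d)
    return expm(A) @ z


rng = np.random.default_rng(0)
d, num_ops = 4, 3

# Hypothetical generators: skew-symmetric, so each operator is a rotation.
raw = rng.normal(size=(num_ops, d, d))
generators = raw - raw.transpose(0, 2, 1)

z = rng.normal(size=d)                 # stand-in for an encoder feature
c = sample_coefficients(rng, num_ops)  # sparse coefficients
z_aug = augment(z, generators, c)      # augmented feature
```

With all coefficients at zero the operator is the identity, so `z` is returned unchanged. The skew-symmetric generators are only a convenience that makes every sampled operator norm-preserving; in the approach the abstract describes, the operators and the coefficient distribution are learned.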