Structuring Representation Geometry with Rotationally Equivariant Contrastive Learning (2306.13924v1)

Published 24 Jun 2023 in cs.LG and cs.CV

Abstract: Self-supervised learning converts raw perceptual data such as images to a compact space where simple Euclidean distances measure meaningful variations in data. In this paper, we extend this formulation by adding further geometric structure to the embedding space, enforcing transformations of input space to correspond to simple (i.e., linear) transformations of embedding space. Specifically, in the contrastive learning setting, we introduce an equivariance objective and theoretically prove that its minima force augmentations on input space to correspond to rotations on the spherical embedding space. We show that merely combining our equivariant loss with a non-collapse term results in non-trivial representations, without requiring invariance to data augmentations. Optimal performance is achieved by also encouraging approximate invariance, where input augmentations correspond to small rotations. Our method, CARE: Contrastive Augmentation-induced Rotational Equivariance, leads to improved performance on downstream tasks, and ensures sensitivity in embedding space to important variations in data (e.g., color) that standard contrastive methods do not achieve. Code is available at https://github.com/Sharut/CARE.
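
For intuition, the sketch below illustrates the kind of objective the abstract describes: an equivariance term that asks a single rotation of the spherical embedding space to account for the effect of an augmentation across a batch, combined with a SimCLR-style InfoNCE term acting as the non-collapse / approximate-invariance component. This is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the Procrustes-based rotation fit, the function names, and the weighting `lam` are illustrative assumptions (see the linked repository for the actual CARE losses).

```python
# Hypothetical sketch of a rotational-equivariance contrastive objective.
# Assumes unit-norm embeddings of two augmented views and fits the best
# batch-level rotation by orthogonal Procrustes (Kabsch-style) alignment.
import torch
import torch.nn.functional as F


def best_rotation(z1, z2):
    """Rotation R minimizing ||z1 @ R - z2||_F for (batch, dim) embeddings."""
    m = z1.T @ z2                          # (dim, dim) cross-covariance
    u, _, vt = torch.linalg.svd(m)
    # Project onto SO(dim): flip the last direction if the determinant is -1.
    d = torch.sign(torch.linalg.det(u @ vt))
    s = torch.ones(z1.shape[1], device=z1.device)
    s[-1] = d
    return u @ torch.diag(s) @ vt          # (dim, dim) proper rotation


def care_style_loss(z1, z2, lam=0.1, temperature=0.5):
    """Equivariance residual under the best-fitting rotation, plus InfoNCE
    between the two views (non-collapse and approximate invariance)."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)

    # Equivariance term: how well a single rotation of the sphere explains
    # the augmentation's effect on this batch of embeddings.
    r = best_rotation(z1, z2)
    equiv = ((z1 @ r - z2) ** 2).sum(dim=1).mean()

    # SimCLR-style InfoNCE over the 2N embeddings; keeps the representation
    # from collapsing and encourages the induced rotations to stay small.
    n = z1.shape[0]
    z = torch.cat([z1, z2], dim=0)                      # (2N, dim)
    sim = z @ z.T / temperature
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))          # exclude self-pairs
    targets = torch.cat([torch.arange(n, 2 * n),
                         torch.arange(0, n)]).to(z.device)
    info_nce = F.cross_entropy(sim, targets)

    return info_nce + lam * equiv
```

In this sketch the rotation is re-estimated per batch in closed form via the SVD of the cross-covariance, so the equivariance penalty only has to push each pair of view embeddings toward being related by that shared rotation; the relative weight `lam` is an assumed hyperparameter.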
