Self-supervised learning of Split Invariant Equivariant representations (2302.10283v2)

Published 14 Feb 2023 in cs.CV, cs.AI, and cs.LG

Abstract: Recent progress has been made towards learning invariant or equivariant representations with self-supervised learning. While invariant methods are evaluated on large-scale datasets, equivariant ones are evaluated in smaller, more controlled settings. We aim to bridge the gap between the two in order to learn more diverse representations that are suitable for a wide range of tasks. We start by introducing a dataset called 3DIEBench, consisting of renderings from 3D models over 55 classes and more than 2.5 million images, where we have full control over the transformations applied to the objects. We further introduce a predictor architecture based on hypernetworks to learn equivariant representations with no possible collapse to invariance. We introduce SIE (Split Invariant-Equivariant), which combines the hypernetwork-based predictor with representations split in two parts, one invariant, the other equivariant, to learn richer representations. We demonstrate significant performance gains over existing methods on equivariance-related tasks, both qualitatively and quantitatively. We further analyze the introduced predictor and show how it steers the learned latent space. We hope that both our dataset and our approach will enable learning richer representations without supervision in more complex scenarios. Code and data are available at https://github.com/facebookresearch/SIE.
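
To make the architecture concrete, below is a minimal PyTorch sketch of the two ideas the abstract describes: an embedding split into an invariant and an equivariant half, and a hypernetwork that maps the transformation parameters to the weights of the predictor acting on the equivariant half. All dimensions, layer sizes, and the choice of a 4-dimensional transformation parameter (e.g. a rotation quaternion) are illustrative assumptions, not the paper's exact configuration; see the official repository for the authors' implementation.

```python
import torch
import torch.nn as nn

class SplitPredictor(nn.Module):
    """Sketch of a hypernetwork-based predictor over a split embedding.

    The embedding z = [z_inv, z_equi] is split in two: z_inv should be
    invariant to the applied transformation, while z_equi should vary
    with it. A small MLP (the hypernetwork) generates the weights of a
    linear map from the transformation parameters g, so the predictor's
    action depends on g and cannot collapse to a constant map.
    """

    def __init__(self, dim_inv: int = 256, dim_equi: int = 256, dim_g: int = 4):
        super().__init__()
        self.dim_inv, self.dim_equi = dim_inv, dim_equi
        # Hypernetwork: transformation parameters -> predictor weights.
        self.hypernet = nn.Sequential(
            nn.Linear(dim_g, 512),
            nn.ReLU(),
            nn.Linear(512, dim_equi * dim_equi),
        )

    def forward(self, z: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        # Split the embedding into its invariant and equivariant parts.
        z_inv, z_equi = z.split([self.dim_inv, self.dim_equi], dim=-1)
        # Generate a per-sample weight matrix from g and apply it to z_equi.
        w = self.hypernet(g).view(-1, self.dim_equi, self.dim_equi)
        z_equi_pred = torch.bmm(w, z_equi.unsqueeze(-1)).squeeze(-1)
        # Only the equivariant half is transformed; the invariant half
        # passes through unchanged.
        return torch.cat([z_inv, z_equi_pred], dim=-1)
```

In a training loop, one would encode two views of the same object, feed the first view's embedding and the relative transformation parameters through this predictor, and match the result against the second view's embedding with a self-supervised criterion; the exact loss and regularization used by SIE are described in the paper itself.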
