
Self-Supervised Representation Learning with Meta Comprehensive Regularization (2403.01549v1)

Published 3 Mar 2024 in cs.CV

Abstract: Self-Supervised Learning (SSL) methods harness the concept of semantic invariance by utilizing data augmentation strategies to produce similar representations for different deformations of the same input. Essentially, the model captures the information shared among multiple augmented views of a sample while disregarding the non-shared information that may be beneficial for downstream tasks. To address this issue, we introduce a module called CompMod with Meta Comprehensive Regularization (MCR), embedded into existing self-supervised frameworks, to make the learned representations more comprehensive. Specifically, we update our proposed model through a bi-level optimization mechanism, enabling it to capture comprehensive features. Additionally, guided by the constrained extraction of features using maximum entropy coding, the self-supervised learning model learns more comprehensive features on top of learning consistent features. We also provide theoretical support for our proposed method from information-theoretic and causal counterfactual perspectives. Experimental results show that our method achieves significant improvements in classification, object detection, and instance segmentation tasks on multiple benchmark datasets.
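The abstract names two concrete ingredients: a maximum-entropy-coding constraint on the extracted features and a bi-level (meta) optimization of the auxiliary CompMod module. The PyTorch sketch below illustrates how such a setup could be wired together under stated assumptions; the names `max_entropy_coding_loss`, `CompModSketch`, and `training_step`, the first-order alternating inner/outer update, the cosine alignment loss, and the weight `alpha` are all illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def max_entropy_coding_loss(z, eps=0.06):
    """Log-det coding-rate regularizer in the spirit of maximum entropy coding:
    minimizing the negated rate pushes the normalized feature matrix to spread
    across more dimensions. Constants are illustrative, not the paper's settings."""
    n, d = z.shape
    z = F.normalize(z, dim=1)                        # unit-norm features
    lam = d / (n * eps ** 2)                         # scaling from a rate-distortion bound
    cov = lam * (z.T @ z)                            # d x d scaled feature covariance
    rate = torch.linalg.slogdet(torch.eye(d, device=z.device) + cov).logabsdet
    return -0.5 * rate                               # maximizing the rate = minimizing its negation


class CompModSketch(nn.Module):
    """Hypothetical stand-in for the CompMod module: a small MLP head meant to
    recover information that the invariance objective alone would discard."""
    def __init__(self, dim=128, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, z):
        return self.net(z)


def training_step(encoder, comp_mod, opt_enc, opt_comp, view1, view2, alpha=0.1):
    """One first-order, alternating approximation of a bi-level update:
      inner step - adapt comp_mod on the entropy-coding objective with the encoder frozen;
      outer step - update the encoder with an alignment (invariance) loss plus the
                   comprehensiveness regularizer from the adapted comp_mod.
    Schematic only; the authors' exact meta-update and loss weighting may differ."""
    # Inner step: tune the auxiliary module on detached encoder features.
    inner_loss = max_entropy_coding_loss(comp_mod(encoder(view1).detach()))
    opt_comp.zero_grad()
    inner_loss.backward()
    opt_comp.step()

    # Outer step: invariance between views plus the comprehensiveness term.
    z1, z2 = encoder(view1), encoder(view2)
    invariance = -F.cosine_similarity(z1, z2, dim=1).mean()   # SimSiam/BYOL-style alignment
    outer_loss = invariance + alpha * max_entropy_coding_loss(comp_mod(z1))
    opt_enc.zero_grad()
    outer_loss.backward()
    opt_enc.step()
    return outer_loss.item()
```

In an actual run, `opt_enc` and `opt_comp` would be separate optimizers over the encoder and CompMod parameters, and `encoder` would be any backbone whose output dimension matches the module's input (128 in this sketch).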

Authors (6)
  1. Huijie Guo (6 papers)
  2. Ying Ba (2 papers)
  3. Jie Hu (187 papers)
  4. Lingyu Si (23 papers)
  5. Wenwen Qiang (55 papers)
  6. Lei Shi (262 papers)
Citations (5)