Can We Break Free from Strong Data Augmentations in Self-Supervised Learning? (2404.09752v1)

Published 15 Apr 2024 in cs.CV, cs.AI, and cs.LG

Abstract: Self-supervised learning (SSL) has emerged as a promising solution for addressing the challenge of limited labeled data in deep neural networks (DNNs), offering scalability potential. However, the impact of design dependencies within the SSL framework remains insufficiently investigated. In this study, we comprehensively explore SSL behavior across a spectrum of augmentations, revealing their crucial role in shaping SSL model performance and learning mechanisms. Leveraging these insights, we propose a novel learning approach that integrates prior knowledge, with the aim of curtailing the need for extensive data augmentations and thereby amplifying the efficacy of learned representations. Notably, our findings underscore that SSL models imbued with prior knowledge exhibit reduced texture bias, diminished reliance on shortcuts and augmentations, and improved robustness against both natural and adversarial corruptions. These findings not only illuminate a new direction in SSL research, but also pave the way for enhancing DNN performance while concurrently alleviating the imperative for intensive data augmentation, thereby improving scalability and real-world problem-solving capabilities.
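
For context, the "strong data augmentations" at issue are typically the SimCLR/BYOL-style multi-view recipe: aggressive random crops, flips, color jitter, grayscale, and blur applied twice per image to produce two views for a joint-embedding loss. The sketch below is not code from the paper; it is a minimal, illustrative torchvision pipeline contrasting that strong recipe with a pared-down crop-and-flip one of the kind a prior-knowledge-based approach might aim to get by with. All names (strong_aug, weak_aug, TwoViews) are illustrative.

```python
# Illustrative only: SimCLR/BYOL-style "strong" two-view augmentation
# recipe versus a minimal "weak" one. Not the paper's actual pipeline.
from torchvision import transforms

strong_aug = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),      # aggressive cropping
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([transforms.GaussianBlur(23, sigma=(0.1, 2.0))], p=0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

weak_aug = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),       # gentle cropping only
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])


class TwoViews:
    """Return two independently augmented views of the same image, as used by
    joint-embedding SSL methods (SimCLR, BYOL, Barlow Twins)."""

    def __init__(self, transform):
        self.transform = transform

    def __call__(self, img):
        return self.transform(img), self.transform(img)
```

A SimCLR-style trainer would wrap its dataset transform in TwoViews(strong_aug); the question the paper studies is how far that strong recipe can be pared back toward something like weak_aug once prior knowledge is injected into training.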

Authors (3)
  1. Shruthi Gowda (8 papers)
  2. Elahe Arani (59 papers)
  3. Bahram Zonooz (54 papers)
