ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations (2303.01092v2)

Published 2 Mar 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Self-Supervised Learning (SSL) is a paradigm that leverages unlabeled data for model training. Empirical studies show that SSL can achieve promising performance in distribution-shift scenarios, where the downstream and training distributions differ. However, the theoretical understanding of its transferability remains limited. In this paper, we develop a theoretical framework to analyze the transferability of self-supervised contrastive learning by investigating how data augmentation affects it. Our results reveal that the downstream performance of contrastive learning depends largely on the choice of data augmentation. Moreover, we show that contrastive learning fails to learn domain-invariant features, which limits its transferability. Based on these theoretical insights, we propose a novel method called Augmentation-robust Contrastive Learning (ArCL), which is guaranteed to learn domain-invariant features and can be easily integrated with existing contrastive learning algorithms. We conduct experiments on several datasets and show that ArCL significantly improves the transferability of contrastive learning.
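
The abstract does not spell out how ArCL is constructed, so the sketch below is only a hedged illustration of one plausible reading: keep a standard InfoNCE contrastive loss and add an alignment term that penalizes the worst-case (largest) distance among several augmented views of the same image, rather than the average. All names here (`info_nce`, `worst_case_alignment`, `arcl_style_loss`, the weight `lam`) are hypothetical and not taken from the paper's code.

```python
# Hypothetical PyTorch-style sketch, not the authors' implementation.
# Assumption: augmentation robustness is approximated by aligning the
# hardest (most distant) pair among M augmented views of each sample.

import torch
import torch.nn.functional as F


def info_nce(z1, z2, temperature=0.5):
    """Standard InfoNCE loss between two batches of embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature              # (B, B) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)


def worst_case_alignment(views):
    """views: list of (B, D) embeddings, one per augmentation of the same batch.

    Returns the batch mean of the maximum squared distance between any two
    augmented views of the same sample (a worst-case alignment term).
    """
    z = torch.stack([F.normalize(v, dim=1) for v in views])    # (M, B, D)
    d = ((z.unsqueeze(1) - z.unsqueeze(0)) ** 2).sum(-1)       # (M, M, B)
    return d.amax(dim=(0, 1)).mean()


def arcl_style_loss(encoder, aug_batches, lam=1.0):
    """Base contrastive loss plus the worst-case alignment term.

    aug_batches: list of M augmented versions of the same image batch.
    lam: illustrative weight balancing the two terms.
    """
    views = [encoder(x) for x in aug_batches]
    loss = info_nce(views[0], views[1])             # standard contrastive objective
    return loss + lam * worst_case_alignment(views)
```

Under this reading, one would generate M > 2 augmentations per image and plug `arcl_style_loss` into an existing SimCLR- or MoCo-style training loop; the worst-case term is what is meant to push representations toward augmentation invariance, and thereby toward the domain invariance the paper argues standard contrastive learning lacks.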

Authors (5)
  1. Xuyang Zhao (13 papers)
  2. Tianqi Du (8 papers)
  3. Yisen Wang (120 papers)
  4. Jun Yao (36 papers)
  5. Weiran Huang (54 papers)
Citations (13)
