Enhancing Contrastive Learning with Efficient Combinatorial Positive Pairing (2401.05730v1)

Published 11 Jan 2024 in cs.CV and cs.AI

Abstract: In the past few years, contrastive learning has played a central role in the success of visual unsupervised representation learning. Around the same time, high-performance non-contrastive learning methods have been developed as well. While most works utilize only two views, we carefully review the existing multi-view methods and propose a general multi-view strategy that can improve the learning speed and performance of any contrastive or non-contrastive method. We first analyze CMC's full-graph paradigm and empirically show that the learning speed of $K$-views can be increased by ${}_{K}\mathrm{C}_{2}$ times for small learning rates and early training. Then, we upgrade CMC's full-graph by mixing views created by a crop-only augmentation, adopting small-size views as in SwAV multi-crop, and modifying the negative sampling. The resulting multi-view strategy is called ECPP (Efficient Combinatorial Positive Pairing). We investigate the effectiveness of ECPP by applying it to SimCLR and assessing the linear evaluation performance on CIFAR-10 and ImageNet-100. For each benchmark, we achieve state-of-the-art performance. In the case of ImageNet-100, ECPP-boosted SimCLR outperforms supervised learning.
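
The combinatorial pairing idea described above — treating every pair among the $K$ augmented views of an image as a positive pair, so that each sample contributes ${}_{K}\mathrm{C}_{2}$ loss terms — can be illustrated with a short sketch. The snippet below is a minimal, hedged illustration in PyTorch, not the authors' ECPP implementation: it only shows a SimCLR-style NT-Xent loss averaged over all view combinations. The function names (`nt_xent`, `combinatorial_nt_xent`) and the temperature value are illustrative assumptions, and ECPP's crop-only view mixing, small-size views, and modified negative sampling are not reproduced here.

```python
import itertools
import torch
import torch.nn.functional as F

def nt_xent(z_a, z_b, temperature=0.5):
    # SimCLR-style NT-Xent loss for one pair of views, each of shape [N, D].
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    z = torch.cat([z_a, z_b], dim=0)                   # [2N, D]
    sim = z @ z.t() / temperature                      # cosine-similarity logits
    n = z_a.size(0)
    self_mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float('-inf'))    # exclude self-similarity
    # Row i of the first view has its positive at row n + i, and vice versa.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

def combinatorial_nt_xent(views, temperature=0.5):
    # Average the pairwise loss over all C(K, 2) combinations of the K views.
    losses = [nt_xent(a, b, temperature)
              for a, b in itertools.combinations(views, 2)]
    return torch.stack(losses).mean()

# Toy usage: K = 4 views of a batch of 8 samples, 128-d projector outputs.
views = [torch.randn(8, 128) for _ in range(4)]
loss = combinatorial_nt_xent(views)   # averages over all 6 view pairings
```

With $K = 4$ views, the loop covers all six pairings of views, which is the source of the learning-speed gain the abstract attributes to full-graph positive pairing.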

References (53)
  1. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644, 2016.
  2. VICReg: Variance-invariance-covariance regularization for self-supervised learning. arXiv preprint arXiv:2105.04906, 2021.
  3. Unsupervised learning of visual features by contrasting cluster assignments. Advances in Neural Information Processing Systems, 33:9912–9924, 2020.
  4. A simple framework for contrastive learning of visual representations. In International conference on machine learning, pages 1597–1607. PMLR, 2020a.
  5. Big self-supervised models are strong semi-supervised learners. Advances in neural information processing systems, 33:22243–22255, 2020b.
  6. Intriguing properties of contrastive losses. Advances in Neural Information Processing Systems, 34, 2021.
  7. Exploring simple siamese representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 15750–15758, 2021.
  8. Improved baselines with momentum contrastive learning. arXiv preprint arXiv:2003.04297, 2020c.
  9. Debiased contrastive learning. Advances in neural information processing systems, 33:8765–8775, 2020.
  10. Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE international conference on computer vision, pages 1422–1430, 2015.
  11. Unsupervised representation learning by predicting image rotations. arXiv preprint arXiv:1803.07728, 2018.
  12. Accurate, large minibatch SGD: Training ImageNet in 1 hour. arXiv preprint arXiv:1706.02677, 2017.
  13. Bootstrap your own latent-a new approach to self-supervised learning. Advances in Neural Information Processing Systems, 33:21271–21284, 2020.
  14. Smart mining for deep metric learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2821–2829, 2017.
  15. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016.
  16. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020.
  17. On feature decorrelation in self-supervised learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9598–9608, 2021.
  18. Boosting contrastive self-supervised learning with false negative cancellation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2785–2795, 2022.
  19. Mining on manifolds: Metric learning without labels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7642–7651, 2018.
  20. i-RevNet: Deep invertible networks. arXiv preprint arXiv:1802.07088, 2018.
  21. Hard negative mixing for contrastive learning. Advances in Neural Information Processing Systems, 33:21798–21809, 2020.
  22. Alex Krizhevsky. One weird trick for parallelizing convolutional neural networks. arXiv preprint arXiv:1404.5997, 2014.
  23. Learning multiple layers of features from tiny images. 2009.
  24. PyTorch distributed: Experiences on accelerating data parallel training. arXiv preprint arXiv:2006.15704, 2020.
  25. SGDR: Stochastic gradient descent with warm restarts. arXiv preprint arXiv:1608.03983, 2016.
  26. Revisiting small batch training for deep neural networks. arXiv preprint arXiv:1804.07612, 2018.
  27. Mixed precision training. arXiv preprint arXiv:1710.03740, 2017.
  28. Ishan Misra and Laurens van der Maaten. Self-supervised learning of pretext-invariant representations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6707–6717, 2020.
  29. DeepUSPS: Deep robust unsupervised saliency prediction via self-supervision. Advances in Neural Information Processing Systems, 32, 2019.
  30. Unsupervised learning of visual representations by solving jigsaw puzzles. In European conference on computer vision, pages 69–84. Springer, 2016.
  31. Deep metric learning via lifted structured feature embedding. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 4004–4012, 2016.
  32. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2536–2544, 2016.
  33. Contrastive learning with hard negative samples. arXiv preprint arXiv:2010.04592, 2020.
  34. ImageNet large scale visual recognition challenge. International journal of computer vision, 115(3):211–252, 2015.
  35. Leslie N Smith. Cyclical learning rates for training neural networks. In 2017 IEEE winter conference on applications of computer vision (WACV), pages 464–472. IEEE, 2017.
  36. Super-convergence: Very fast training of neural networks using large learning rates. In Artificial intelligence and machine learning for multi-domain operations applications, volume 11006, page 1100612. International Society for Optics and Photonics, 2019.
  37. Stochastic class-based hard example mining for deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 7251–7259, 2019.
  38. Contrastive multiview coding. In European conference on computer vision, pages 776–794. Springer, 2020.
  39. Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet? arXiv preprint arXiv:2201.05119, 2022.
  40. Fixing the train-test resolution discrepancy. Advances in neural information processing systems, 32, 2019.
  41. On mutual information maximization for representation learning. arXiv preprint arXiv:1907.13625, 2019.
  42. Understanding the behaviour of contrastive loss. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2495–2504, 2021.
  43. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In International Conference on Machine Learning, pages 9929–9939. PMLR, 2020.
  44. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2840–2848, 2017.
  45. Conditional negative sampling for contrastive learning of visual representations. arXiv preprint arXiv:2010.02037, 2020.
  46. Group normalization. In Proceedings of the European conference on computer vision (ECCV), pages 3–19, 2018.
  47. What should not be contrastive in contrastive learning. arXiv preprint arXiv:2008.05659, 2020.
  48. Hard negative examples are hard, but useful. In European Conference on Computer Vision, pages 126–142. Springer, 2020.
  49. Large batch training of convolutional networks. arXiv preprint arXiv:1708.03888, 2017.
  50. Barlow twins: Self-supervised learning via redundancy reduction. In International Conference on Machine Learning, pages 12310–12320. PMLR, 2021.
  51. Context encoding for semantic segmentation. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pages 7151–7160, 2018.
  52. Contrastive attraction and contrastive repulsion for representation learning. arXiv preprint arXiv:2105.03746, 2021.
  53. Hardness-aware deep metric learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 72–81, 2019.
