Contrastive Learning with Negative Sampling Correction (2401.08690v1)

Published 13 Jan 2024 in cs.LG

Abstract: As one of the most effective self-supervised representation learning methods, contrastive learning (CL) relies on multiple negative pairs to contrast against each positive pair. In the standard practice of contrastive learning, data augmentation methods are utilized to generate both positive and negative pairs. While existing works have been focusing on improving the positive sampling, the negative sampling process is often overlooked. In fact, the generated negative samples are often polluted by positive samples, which leads to a biased loss and performance degradation. To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL). PUCL treats the generated negative samples as unlabeled samples and uses information from positive samples to correct bias in contrastive loss. We prove that the corrected loss used in PUCL only incurs a negligible bias compared to the unbiased contrastive loss. PUCL can be applied to general contrastive learning problems and outperforms state-of-the-art methods on various image and graph classification tasks. The code of PUCL is in the supplementary file.

Introduction to Positive-Unlabeled Contrastive Learning (PUCL)

Contrastive Learning (CL) has become a prominent self-supervised learning approach, producing representations that transfer well to a wide range of downstream tasks. CL works by forming positive and negative pairs from the data: positive pairs are typically augmented views of the same data point, while negative pairs are drawn from different data points. Although considerable effort has gone into improving positive sampling, the process for generating negative pairs remains comparatively unexplored, and the sampled negatives often contain positives. This contamination, known as negative sampling bias, can degrade the learned representations.
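
To make the pair construction concrete, the sketch below shows a standard InfoNCE-style (NT-Xent) objective in which the two augmented views of each sample form the positive pair and every other in-batch sample is treated as a negative. The function and variable names are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    """Standard InfoNCE / NT-Xent loss.
    z1, z2: L2-normalized embeddings of two augmented views, shape [N, d].
    Every other sample in the 2N-sized batch is treated as a negative."""
    n = z1.size(0)
    z = torch.cat([z1, z2], dim=0)                      # [2N, d]
    sim = z @ z.t() / temperature                       # pairwise similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))          # exclude self-pairs
    # The positive for view i is view i + N (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Note that every non-matching sample in the batch enters the denominator as a negative, which is exactly where the bias discussed next arises.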

The Issue of Negative Sampling Bias

Negative sampling bias presents a serious challenge. When negative pairs are formed by sampling other data points, some of those pairs are in fact positive, a problem observed in both image and graph classification tasks. Standard CL typically ignores this issue and treats all unlabeled samples as negatives. This assumption biases the loss estimate and degrades model performance.
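
In positive-unlabeled terms, the sampled "negatives" are really draws from the unlabeled data distribution, which decomposes by the standard class-prior identity (a general PU-learning fact, not a formula quoted from the paper):

$$
p_{\mathrm{unlabeled}}(x) \;=\; \pi\, p^{+}(x) \;+\; (1-\pi)\, p^{-}(x),
$$

where $\pi$ is the (unknown) probability that an unlabeled sample shares the anchor's class. Sampling negatives from $p_{\mathrm{unlabeled}}$ rather than from $p^{-}$ is precisely what biases the contrastive loss.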

Introducing PUCL: A New Approach to Contrastive Learning

To address this bias, the paper proposes Positive-Unlabeled Contrastive Learning (PUCL). PUCL treats the generated negative samples as unlabeled and uses information from the positive samples to correct the bias in the contrastive loss. The authors prove that the corrected loss incurs only a negligible bias relative to the unbiased contrastive loss. The approach applies to general contrastive learning problems and outperforms state-of-the-art methods on a range of image and graph classification tasks.
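
As a rough illustration of how positive information can correct the negative term, the sketch below follows the positive-unlabeled correction used in debiased contrastive learning; PUCL's exact estimator may differ, and the class prior `tau_plus` as well as all names are illustrative assumptions rather than the paper's code.

```python
import math
import torch

def pu_corrected_contrastive_loss(pos_sim, neg_sim, tau_plus, temperature=0.5):
    """PU-style corrected contrastive loss (in the spirit of debiased
    contrastive learning; PUCL's exact estimator may differ).
    pos_sim: [N] similarities between each anchor and its positive view.
    neg_sim: [N, M] similarities between each anchor and M "unlabeled" samples.
    tau_plus: assumed class prior, i.e. the fraction of unlabeled samples
              that actually share the anchor's class."""
    pos = torch.exp(pos_sim / temperature)          # [N]
    neg = torch.exp(neg_sim / temperature)          # [N, M]
    m = neg.size(1)
    # Remove the expected contribution of false negatives from the
    # negative term, then clamp to its theoretical minimum for stability.
    corrected = (neg.sum(dim=1) - m * tau_plus * pos) / (1.0 - tau_plus)
    corrected = torch.clamp(corrected, min=m * math.exp(-1.0 / temperature))
    return -torch.log(pos / (pos + corrected)).mean()
```

The clamp keeps the corrected negative term at its theoretical lower bound, so the estimate stays valid even when the contamination correction overshoots; in practice the class prior is a hyperparameter or must be estimated from data.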

Evaluation of PUCL's Performance

Empirical results position PUCL favorably across classification tasks. PUCL is benchmarked against several state-of-the-art approaches in both image and graph classification settings. It outperforms these baselines, shows robustness to hyperparameter changes, and improves the base contrastive models it is applied to, demonstrating its practical utility.

Conclusion and Future Work

PUCL marks a step forward in contrastive learning, correcting the negative sampling bias that hampers training. Its broad applicability and consistent improvements over existing baselines across datasets underscore its potential as a practical default for the field. Future work may include methods that automatically determine suitable hyperparameters, avoiding the exhaustive search currently required, and extending PUCL to learn corrections for the positive sample distribution in addition to its current correction of the negative distribution.

Authors (11)
  1. Lu Wang (329 papers)
  2. Chao Du (83 papers)
  3. Pu Zhao (82 papers)
  4. Chuan Luo (19 papers)
  5. Zhangchi Zhu (6 papers)
  6. Bo Qiao (18 papers)
  7. Wei Zhang (1489 papers)
  8. Qingwei Lin (81 papers)
  9. Saravan Rajmohan (85 papers)
  10. Dongmei Zhang (193 papers)
  11. Qi Zhang (785 papers)