Binary Classification with Confidence Difference (2310.05632v1)

Published 9 Oct 2023 in cs.LG

Abstract: Recently, learning with soft labels has been shown to achieve better performance than learning with hard labels in terms of model generalization, calibration, and robustness. However, collecting pointwise labeling confidence for all training examples can be challenging and time-consuming in real-world scenarios. This paper delves into a novel weakly supervised binary classification problem called confidence-difference (ConfDiff) classification. Instead of pointwise labeling confidence, we are given only unlabeled data pairs with confidence difference that specifies the difference in the probabilities of being positive. We propose a risk-consistent approach to tackle this problem and show that the estimation error bound achieves the optimal convergence rate. We also introduce a risk correction approach to mitigate overfitting problems, whose consistency and convergence rate are also proven. Extensive experiments on benchmark data sets and a real-world recommender system data set validate the effectiveness of our proposed approaches in exploiting the supervision information of the confidence difference.
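
To make the setting concrete, below is a minimal sketch of the ConfDiff setup built only from what the abstract states: each training example is an unlabeled pair (x, x') annotated with a confidence difference c = p(y=+1 | x') − p(y=+1 | x). The sketch additionally assumes the class prior π_+ = p(y=+1) is known (weakly supervised estimators of this kind typically require it) and that the two points of a pair are drawn independently; the logistic loss and all names are illustrative choices, not the paper's implementation.

```python
import numpy as np

def logistic_loss(margin):
    # ell(z) = log(1 + exp(-z)), computed stably
    return np.logaddexp(0.0, -margin)

def confdiff_risk(f_x, f_xp, c, prior_pos):
    """Empirical risk from n unlabeled pairs with confidence differences.

    f_x, f_xp : arrays of scores f(x_i) and f(x'_i) for each pair
    c         : confidence differences c_i = p(+1 | x'_i) - p(+1 | x_i)
    prior_pos : class prior pi_+ = p(y = +1), assumed known

    Substituting p(+1 | x_i) = p(+1 | x'_i) - c_i (and vice versa) and
    using the independence of the two points in a pair turns the
    unobservable pointwise confidences into the observable weights
    below, each of which is correct in expectation.
    """
    prior_neg = 1.0 - prior_pos
    risk = (
        (prior_pos - c) * logistic_loss(f_x)      # x weighted as positive
        + (prior_neg + c) * logistic_loss(-f_x)   # x weighted as negative
        + (prior_pos + c) * logistic_loss(f_xp)   # x' weighted as positive
        + (prior_neg - c) * logistic_loss(-f_xp)  # x' weighted as negative
    )
    return 0.5 * np.mean(risk)

def confdiff_risk_corrected(f_x, f_xp, c, prior_pos):
    # The weights above can be negative, so a flexible model can push the
    # empirical risk below zero and overfit. Clamping each partial risk at
    # zero is one standard correction device, in the spirit of (though not
    # necessarily identical to) the paper's risk correction approach.
    prior_neg = 1.0 - prior_pos
    parts = (
        np.mean((prior_pos - c) * logistic_loss(f_x)),
        np.mean((prior_neg + c) * logistic_loss(-f_x)),
        np.mean((prior_pos + c) * logistic_loss(f_xp)),
        np.mean((prior_neg - c) * logistic_loss(-f_xp)),
    )
    return 0.5 * sum(max(p, 0.0) for p in parts)
```

The key step is that, for independently drawn pair members, E[p(+1 | x') · ℓ(f(x), +1)] factorizes as π_+ · E[ℓ(f(x), +1)], so every term involving a hidden pointwise confidence can be rewritten using only the observable c and π_+.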
