
Transferring Annotator- and Instance-dependent Transition Matrix for Learning from Crowds (2306.03116v3)

Published 5 Jun 2023 in cs.HC, cs.AI, and cs.LG

Abstract: In learning from crowds, the annotations of training data are obtained through crowd-sourcing services: multiple annotators each label a small portion of the data, and annotator-dependent labeling mistakes occur frequently. Modeling the label-noise generation process with a noise transition matrix is a powerful tool for tackling such label noise. In real-world crowd-sourcing scenarios, noise transition matrices are both annotator- and instance-dependent. However, the high complexity of annotator- and instance-dependent transition matrices (AIDTM), combined with annotation sparsity (each annotator labels only a small fraction of instances), makes modeling AIDTM very challenging. Prior works simplify the problem by assuming the transition matrix is instance-independent or by using simple parametric forms, which sacrifices modeling generality. Motivated by this, we target the more realistic problem of estimating general AIDTM in practice. Without losing modeling generality, we parameterize AIDTM with deep neural networks. To alleviate the modeling challenge, we assume that every annotator shares its noise pattern with similar annotators, and we estimate AIDTM via knowledge transfer: we first model the mixture of noise patterns across all annotators, and then transfer this model to individual annotators. Furthermore, since transferring from the mixture of noise patterns to individuals may cause two annotators with highly different noise generation to perturb each other, we employ knowledge transfer between identified neighboring annotators to calibrate the modeling. Theoretical analyses show that both the knowledge transfer from the global mixture to individuals and the knowledge transfer between neighboring individuals help model general AIDTM. Experiments confirm the superiority of the proposed approach on synthetic and real-world crowd-sourcing data.
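The core modeling idea described in the abstract — parameterizing an instance-dependent transition matrix with a deep network, fitting it on the pooled (mixture) annotations of all annotators, and then fine-tuning a copy per annotator — can be sketched as below. This is a minimal illustration in PyTorch, not the authors' implementation: the network architecture, the `feats` input (instance features), the supplied clean-label posteriors, and the two-stage training loop are all assumptions made for exposition.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10  # assumed number of classes

class TransitionNet(nn.Module):
    """Maps an instance's features to a C x C transition matrix T(x),
    where row y is P(noisy label | true label = y, instance x)."""
    def __init__(self, feat_dim, num_classes=NUM_CLASSES):
        super().__init__()
        self.num_classes = num_classes
        self.body = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(),
            nn.Linear(128, num_classes * num_classes),
        )

    def forward(self, feats):
        logits = self.body(feats).view(-1, self.num_classes, self.num_classes)
        return F.softmax(logits, dim=-1)  # normalize each row to a distribution

def noisy_posterior(transition, clean_probs):
    # P(noisy label | x) = P(clean label | x)^T  T(x), computed per instance
    return torch.bmm(clean_probs.unsqueeze(1), transition).squeeze(1)

def transfer_to_annotators(global_net, per_annotator_data, epochs=5):
    """Stage 2 of the global-to-individual knowledge transfer:
    warm-start each annotator's network from the global network (already
    fitted on the pooled annotations in stage 1), then fine-tune it on
    that annotator's own sparse labels."""
    nets = {}
    for ann_id, (feats, clean_probs, noisy_labels) in per_annotator_data.items():
        net = copy.deepcopy(global_net)  # transfer: global mixture -> individual
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(epochs):
            post = noisy_posterior(net(feats), clean_probs)
            loss = F.nll_loss(torch.log(post + 1e-8), noisy_labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
        nets[ann_id] = net
    return nets
```

The paper's further calibration step — identifying neighboring annotators with similar noise patterns and transferring knowledge between them — would sit on top of this sketch, e.g., by additionally regularizing each fine-tuned network toward the networks of its identified neighbors rather than relying on the global mixture alone.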


