Dirichlet-Based Prediction Calibration for Learning with Noisy Labels (2401.07062v1)

Published 13 Jan 2024 in cs.LG and cs.AI

Abstract: Learning with noisy labels can significantly hinder the generalization performance of deep neural networks (DNNs). Existing approaches address this issue through loss correction or example selection methods. However, these methods often rely on the model's predictions obtained from the softmax function, which can be over-confident and unreliable. In this study, we identify the translation invariance of the softmax function as the underlying cause of this problem and propose the Dirichlet-based Prediction Calibration (DPC) method as a solution. Our method introduces a calibrated softmax function that breaks the translation invariance by incorporating a suitable constant in the exponent term, enabling more reliable model predictions. To ensure stable model training, we leverage a Dirichlet distribution to assign probabilities to predicted labels and introduce a novel evidential deep learning (EDL) loss. The proposed loss function encourages positive and sufficiently large logits for the given label, while penalizing negative and small logits for other labels, leading to more distinct logits and facilitating better example selection based on a large-margin criterion. Through extensive experiments on diverse benchmark datasets, we demonstrate that DPC achieves state-of-the-art performance. The code is available at https://github.com/chenchenzong/DPC.
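
The abstract describes the calibration mechanism only at a high level, so a brief sketch may help. The snippet below is a minimal illustration under stated assumptions, not the authors' implementation: the calibrated softmax is assumed to take the Dirichlet-mean form (exp(z_k) + c) / sum_j (exp(z_j) + c), which adds a constant to the normalization and thereby breaks translation invariance, and the margin-based rule is a hypothetical stand-in for the paper's large-margin selection criterion. The constant c, the function names, and the threshold are illustrative choices only; see the linked repository for the actual DPC code.

```python
import numpy as np

def softmax(z):
    # Standard softmax: adding the same constant to every logit leaves
    # the output unchanged (translation invariance).
    z = z - z.max()          # for numerical stability
    e = np.exp(z)
    return e / e.sum()

def calibrated_softmax(z, c=1.0):
    # Assumed calibrated form: an additive constant in the normalization,
    # analogous to the Dirichlet mean alpha_k / sum(alpha) with
    # alpha_k = exp(z_k) + c, so a uniform shift of the logits now
    # changes the output.
    e = np.exp(z) + c
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.5])
shifted = logits + 5.0       # same logits, uniformly shifted

print(softmax(logits), softmax(shifted))                        # identical
print(calibrated_softmax(logits), calibrated_softmax(shifted))  # different

def looks_clean(z, given_label, margin=1.0):
    # Hypothetical large-margin selection rule: keep an example when the
    # logit of its given label beats the best competing logit by a margin.
    others = np.delete(z, given_label)
    return z[given_label] - others.max() >= margin

print(looks_clean(logits, given_label=0))   # True: label 0 wins by a clear margin
```

Under a uniform shift of all logits, the standard softmax output is unchanged, while the calibrated version reacts to the shift; this is the property the abstract attributes to the added constant and uses to obtain more reliable predictions for example selection.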
