ChiMera: Learning with noisy labels by contrasting mixed-up augmentations (2310.05183v1)
Abstract: Learning with noisy labels has been studied to address incorrect label annotations in real-world applications. In this paper, we present ChiMera, a two-stage semi-supervised framework for learning from noisy labels, built on a novel contrastive learning technique, MixCLR. The key idea of MixCLR is to learn and refine the representations of augmentations mixed from two different images, which better resists label noise. Via MixCLR, ChiMera jointly learns the representations of the original data distribution and the mixed-up data distribution, introducing many additional augmented samples that fill the gaps between classes. This yields a smoother representation space learned by contrastive learning, with better alignment and a more robust decision boundary. By exploiting MixCLR, ChiMera also improves the label diffusion process in the semi-supervised noise-recovery stage, further boosting its ability to diffuse correct label information. We evaluated ChiMera on seven real-world datasets and obtained state-of-the-art performance under both symmetric and asymmetric noise. Our method opens up new avenues for applying contrastive learning to learning with noisy labels, and we envision MixCLR being broadly applicable to other tasks.
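The abstract's core idea, contrasting a mixed-up augmentation against its two source images, can be illustrated with a minimal sketch. The exact MixCLR objective is defined in the paper; the version below only assumes an i-mix/MixCo-style soft InfoNCE loss, in which the embedding of a mixup of images `i` and `j` (mixing weight `lam`) is treated as a `lam`-weighted positive of `i` and a `(1 - lam)`-weighted positive of `j`. All function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def normalize(z):
    """L2-normalize rows so dot products are cosine similarities."""
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def mixup_contrastive_loss(z_mix, bank, i, j, lam, tau=0.5):
    """Soft InfoNCE over a bank of embeddings: the mixed view is a
    lam-weighted positive of source i and a (1 - lam)-weighted
    positive of source j (an i-mix-style objective, used here as a
    stand-in for MixCLR)."""
    z_mix = z_mix / np.linalg.norm(z_mix)
    bank = normalize(bank)
    logits = bank @ z_mix / tau                       # similarities / temperature
    logp = logits - np.log(np.sum(np.exp(logits)))    # log-softmax over the bank
    return -(lam * logp[i] + (1.0 - lam) * logp[j])   # soft cross-entropy

# Toy usage: an embedding that really is a convex combination of two
# bank entries incurs a lower loss for its true sources than for a
# mismatched pair.
rng = np.random.default_rng(0)
bank = rng.normal(size=(8, 16))          # 8 hypothetical image embeddings
lam = 0.7
z_mix = lam * bank[2] + (1 - lam) * bank[5]
loss = mixup_contrastive_loss(z_mix, bank, 2, 5, lam)
```

In a full training loop, the mixing would happen in input space (mixup of two images before the encoder) rather than on embeddings as in this toy; the loss shape is the same either way.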