Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition (2307.00880v1)

Published 3 Jul 2023 in cs.CV

Abstract: In real-world scenarios, collected and annotated data often exhibit the characteristics of multiple classes and long-tailed distribution. Additionally, label noise is inevitable in large-scale annotations and hinders the applications of learning-based models. Although many deep learning based methods have been proposed for handling long-tailed multi-label recognition or label noise respectively, learning with noisy labels in long-tailed multi-label visual data has not been well-studied because of the complexity of long-tailed distribution entangled with multi-label correlation. To tackle such a critical yet thorny problem, this paper focuses on reducing noise based on some inherent properties of multi-label classification and long-tailed learning under noisy cases. In detail, we propose a Stitch-Up augmentation to synthesize a cleaner sample, which directly reduces multi-label noise by stitching up multiple noisy training samples. Equipped with Stitch-Up, a Heterogeneous Co-Learning framework is further designed to leverage the inconsistency between long-tailed and balanced distributions, yielding cleaner labels for more robust representation learning with noisy long-tailed data. To validate our method, we build two challenging benchmarks, named VOC-MLT-Noise and COCO-MLT-Noise, respectively. Extensive experiments are conducted to demonstrate the effectiveness of our proposed method. Compared to a variety of baselines, our method achieves superior results.
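
The abstract describes Stitch-Up only at a high level. As a rough illustration (not the authors' implementation), the sketch below assumes a minimal variant in which several noisy samples sharing an annotated label are tiled into one image and their label vectors are merged by element-wise union; the function name `stitch_up`, the tiling choice, and all tensor shapes are hypothetical.

```python
import torch

def stitch_up(images, labels):
    """Hypothetical Stitch-Up sketch: merge several noisy multi-label
    samples into one synthetic, cleaner sample.

    images: tensor of shape (k, C, H, W) -- k noisy samples assumed to
            share at least one annotated label.
    labels: tensor of shape (k, num_classes) with 0/1 multi-label targets.

    The stitched image tiles the inputs side by side, and the stitched
    label is the union of the inputs. A class wrongly annotated in only
    one of the k inputs is then less likely to be wrong for the merged
    sample as a whole, which is the noise-reduction intuition.
    """
    stitched_image = torch.cat(list(images), dim=-1)   # tile along width
    stitched_label = labels.max(dim=0).values          # element-wise OR (union)
    return stitched_image, stitched_label

# Toy usage: two 3x64x64 samples with 20 candidate classes
imgs = torch.rand(2, 3, 64, 64)
lbls = torch.tensor([[1, 0, 1] + [0] * 17,
                     [1, 1, 0] + [0] * 17], dtype=torch.float)
img, lbl = stitch_up(imgs, lbls)
print(img.shape, lbl[:3])  # torch.Size([3, 64, 128]) tensor([1., 1., 1.])
```

In the paper, stitched samples of this kind feed a Heterogeneous Co-Learning framework whose two branches see long-tailed and re-balanced distributions and exchange predictions to select cleaner labels; the exact stitching and selection rules are specified there, not in this sketch.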
