Category-wise Fine-Tuning: Resisting Incorrect Pseudo-Labels in Multi-Label Image Classification with Partial Labels (2401.16991v1)

Published 30 Jan 2024 in cs.CV

Abstract: Large-scale image datasets are often partially labeled, where only a few categories' labels are known for each image. Assigning pseudo-labels to unknown labels to gain additional training signals has become prevalent for training deep classification models. However, some pseudo-labels are inevitably incorrect, leading to a notable decline in the model classification performance. In this paper, we propose a novel method called Category-wise Fine-Tuning (CFT), aiming to reduce model inaccuracies caused by the wrong pseudo-labels. In particular, CFT employs known labels without pseudo-labels to fine-tune the logistic regressions of trained models individually to calibrate each category's model predictions. Genetic Algorithm, seldom used for training deep models, is also utilized in CFT to maximize the classification performance directly. CFT is applied to well-trained models, unlike most existing methods that train models from scratch. Hence, CFT is general and compatible with models trained with different methods and schemes, as demonstrated through extensive experiments. CFT requires only a few seconds for each category for calibration with consumer-grade GPUs. We achieve state-of-the-art results on three benchmarking datasets, including the CheXpert chest X-ray competition dataset (ensemble mAUC 93.33%, single model 91.82%), partially labeled MS-COCO (average mAP 83.69%), and Open Images V3 (mAP 85.31%), outperforming the previous bests by 0.28%, 2.21%, 2.50%, and 0.91%, respectively. The single model on CheXpert has been officially evaluated by the competition server, endorsing the correctness of the result. The outstanding results and generalizability indicate that CFT could be substantial and prevalent for classification model development. Code is available at: https://github.com/maxium0526/category-wise-fine-tuning.
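
To make the abstract's mechanism concrete, here is a minimal, self-contained sketch of the idea: calibrating one category's logistic regression (the final-layer weights for that category) using only its known labels, with a simple genetic algorithm whose fitness is the evaluation metric itself rather than a differentiable surrogate. This is not the authors' implementation; the function name `ga_calibrate_category`, the mutation-only GA, and the choice of AUC as the fitness are assumptions for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def ga_calibrate_category(features, labels, generations=50, pop_size=32,
                          sigma=0.05, seed=0):
    """Hypothetical sketch, loosely following the abstract.

    features: (N, D) backbone outputs of an already-trained model.
    labels:   (N,) per-category labels: 1 (positive), 0 (negative),
              -1 (unknown). Only known labels are used; no pseudo-labels.
    Assumes both classes appear among the known labels.
    Returns the best (weights, bias) vector of length D + 1.
    """
    rng = np.random.default_rng(seed)
    known = labels != -1                      # drop unknown (pseudo-labeled) entries
    X, y = features[known], labels[known]

    D = X.shape[1]
    # Population of candidate (weights, bias) vectors.
    pop = rng.normal(scale=0.1, size=(pop_size, D + 1))

    def fitness(individual):
        w, b = individual[:-1], individual[-1]
        scores = sigmoid(X @ w + b)
        return roc_auc_score(y, scores)       # maximize the metric directly

    for _ in range(generations):
        fits = np.array([fitness(ind) for ind in pop])
        elite = pop[np.argsort(fits)[-pop_size // 2:]]   # keep the top half
        children = elite + rng.normal(scale=sigma, size=elite.shape)  # mutate
        pop = np.vstack([elite, children])

    fits = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(fits)]
```

Applied category by category, e.g. `[ga_calibrate_category(feats, Y[:, c]) for c in range(Y.shape[1])]`, each classifier is calibrated independently on its own known labels, which is consistent with the abstract's claims that CFT runs in seconds per category and can be layered on top of models trained with different methods and schemes.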
