Scalable Label Distribution Learning for Multi-Label Classification (2311.16556v2)

Published 28 Nov 2023 in cs.LG

Abstract: Multi-label classification (MLC) refers to the problem of tagging a given instance with a set of relevant labels. Most existing MLC methods assume that the correlation between the two labels in each label pair is symmetric, an assumption violated in many real-world scenarios. Moreover, most existing methods design learning processes whose cost grows with the number of labels, making computational complexity a bottleneck when scaling up to large output spaces. To tackle these issues, we propose a novel method named Scalable Label Distribution Learning (SLDL) for multi-label classification, which describes different labels as distributions in a latent space whose label correlation is asymmetric and whose dimension is independent of the number of labels. Specifically, SLDL first converts labels into continuous distributions within a low-dimensional latent space and leverages an asymmetric metric to establish the correlation between different labels. It then learns the mapping from the feature space to the latent space, so that the computational complexity no longer depends on the number of labels. Finally, SLDL uses a nearest-neighbor-based strategy to decode the latent representations and obtain the final predictions. Extensive experiments show that SLDL achieves very competitive classification performance at low computational cost.
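The pipeline the abstract describes (label-to-distribution embedding, asymmetric correlation metric, feature-to-latent regression, nearest-neighbor decoding) can be sketched on toy data. This is not the authors' SLDL implementation: the Gaussian label embeddings, KL divergence as the asymmetric metric, the ridge-regression mapping, and majority-vote decoding are all illustrative assumptions standing in for the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n samples, d features, L labels, k-dim latent space (k << L).
n, d, L, k = 200, 10, 50, 4

X = rng.normal(size=(n, d))
Y = (rng.random(size=(n, L)) < 0.1).astype(float)  # sparse binary label matrix

# Step 1 (assumed form): embed each label as a diagonal Gaussian in latent space.
label_mu = rng.normal(size=(L, k))
label_sigma2 = rng.uniform(0.2, 1.0, size=(L, k))

def kl_gaussian(mu_a, s2_a, mu_b, s2_b):
    """KL(N_a || N_b) for diagonal Gaussians -- asymmetric by construction."""
    return 0.5 * np.sum(np.log(s2_b / s2_a) + (s2_a + (mu_a - mu_b) ** 2) / s2_b - 1.0)

# Directional label correlation: KL(a||b) and KL(b||a) differ in general.
kab = kl_gaussian(label_mu[0], label_sigma2[0], label_mu[1], label_sigma2[1])
kba = kl_gaussian(label_mu[1], label_sigma2[1], label_mu[0], label_sigma2[0])

# Step 2: give each instance a latent target (here, the mean of its relevant
# labels' centers), then fit a ridge regression from features to latent space.
# The solve involves d-by-d and d-by-k matrices only -- no dependence on L.
Z = (Y @ label_mu) / np.maximum(Y.sum(axis=1, keepdims=True), 1.0)
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Z)  # shape (d, k)

# Step 3: decode a test point via its nearest training neighbors in latent space.
def predict(x, top=5):
    z = x @ W
    nn = np.argsort(np.linalg.norm(Z - z, axis=1))[:top]
    return (Y[nn].mean(axis=0) >= 0.5).astype(int)  # majority vote over neighbors

pred = predict(X[0])  # binary vector of length L
```

The point of the sketch is the complexity argument: training touches only `d`-by-`k` quantities, so the cost of the learned mapping is independent of the number of labels, and only the cheap nearest-neighbor decoding step ever sees the full label matrix.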

Authors (4)
  1. Xingyu Zhao
  2. Yuexuan An
  3. Lei Qi
  4. Xin Geng
