Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification (2309.13543v1)
Abstract: Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and it has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or a set of well-defined constraints on the label space structure, such as hierarchical relations, which can be complicated to provide as the number of labels grows. In this paper, we study the MLTC problem in annotation-free and scarce-annotation settings, in which the amount of available supervision is linear in the number of labels. Our method follows three steps: (1) mapping the input text to a set of preliminary label likelihoods via natural language inference with a pre-trained language model, (2) computing a signed label dependency graph from the label descriptions, and (3) updating the preliminary label likelihoods by message passing along the label dependency graph, driven by a collective loss function that injects information about the expected label frequencies and the average multi-label cardinality of predictions. Experiments show that the proposed framework performs effectively under low-supervision settings, adding almost imperceptible computational and memory overhead to the use of the pre-trained language model while improving its initial performance by 70% in terms of example-based F1 score.
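The pipeline in the abstract can be illustrated with a minimal numpy sketch of steps (2) and (3). This is not the paper's implementation: step (1)'s NLI scoring is omitted (the preliminary likelihoods are taken as given), the signed graph is approximated by plain cosine similarity between label-description embeddings, the collective loss is a simple squared-error surrogate, and all function names are hypothetical.

```python
import numpy as np

def signed_label_graph(label_embs):
    """Signed dependency graph from label-description embeddings.

    Cosine similarity yields positive edges (labels that tend to
    co-occur) and negative edges (labels that tend to exclude each
    other). The diagonal is zeroed so a label does not message itself.
    """
    normed = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    A = normed @ normed.T
    np.fill_diagonal(A, 0.0)
    return A

def message_passing_update(p, A, alpha=0.5, steps=3):
    """Refine preliminary label likelihoods p (shape [n_labels]).

    Each step nudges a label's logit up or down according to the
    signed-graph-weighted evidence from the other labels.
    """
    logits = np.log(p / (1.0 - p))  # likelihoods -> logits
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-logits))
        logits = logits + alpha * A @ (probs - 0.5)
    return 1.0 / (1.0 + np.exp(-logits))

def collective_loss(P, label_freq, cardinality):
    """Surrogate collective loss over a batch of predictions P
    (shape [n_docs, n_labels]): penalize deviation from the expected
    per-label frequencies and the expected average label count.
    """
    freq_term = np.mean((P.mean(axis=0) - label_freq) ** 2)
    card_term = (P.sum(axis=1).mean() - cardinality) ** 2
    return freq_term + card_term
```

In this sketch the collective loss would be minimized with respect to the message-passing parameters (here only `alpha`), which is how weak, dataset-level statistics can substitute for per-example annotation.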
- Muberra Ozmen
- Joseph Cotnareanu
- Mark Coates