HiGen: Hierarchy-Aware Sequence Generation for Hierarchical Text Classification (2402.01696v2)
Abstract: Hierarchical text classification (HTC) is a complex subtask of multi-label text classification, characterized by a hierarchical label taxonomy and data imbalance. The best-performing models aim to learn a static representation by combining document and hierarchical label information. However, the relevance of document sections can vary with the hierarchy level, necessitating a dynamic document representation. To address this, we propose HiGen, a text-generation-based framework that uses language models to encode dynamic text representations. We introduce a level-guided loss function to capture the relationship between text and label-name semantics. Our approach incorporates a task-specific pretraining strategy that adapts the language model to in-domain knowledge and significantly improves performance on classes with few examples. Furthermore, we present a new and valuable dataset called ENZYME, designed for HTC, which comprises articles from PubMed annotated with Enzyme Commission (EC) numbers as prediction targets. Through extensive experiments on the ENZYME dataset and the widely used WOS and NYT datasets, our method demonstrates superior performance, surpassing existing approaches while handling data efficiently and mitigating class imbalance. The data and code will be released publicly.
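To make the sequence-generation framing concrete, here is a minimal sketch of the two ideas the abstract names: linearizing a root-to-leaf label path into a generation target, and weighting per-level losses in a level-guided way. The separator, the exponential level weighting, and all function names are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: HiGen's actual loss and target format may differ.

def linearize_label_path(path, sep=" > "):
    """Turn a root-to-leaf label path into a target sequence for a
    seq2seq model (labels are generated level by level)."""
    return sep.join(path)

def level_guided_loss(per_level_losses, decay=0.8):
    """Combine per-level losses with level-dependent weights.
    Here deeper levels get geometrically smaller weights -- one
    plausible scheme; the paper's exact weighting is not shown here."""
    return sum((decay ** level) * loss
               for level, loss in enumerate(per_level_losses))

# Hypothetical taxonomy path ending in an EC number, as in ENZYME.
target = linearize_label_path(["Science", "Biochemistry", "EC 1.1.1.1"])
total = level_guided_loss([0.9, 0.6, 0.3])
```

A decoder trained on such targets produces a different representation at each generation step, which is one way to realize the "dynamic document representation" the abstract argues for.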
- Vidit Jain
- Mukund Rungta
- Yuchen Zhuang
- Yue Yu
- Zeyu Wang
- Mu Gao
- Jeffrey Skolnick
- Chao Zhang