FLAME: Self-Supervised Low-Resource Taxonomy Expansion using Large Language Models (2402.13623v1)
Abstract: Taxonomies are tree-structured (arborescent) hierarchies that organize entities to convey knowledge within a specific domain. Each edge in the taxonomy signifies a hypernym-hyponym relationship. Taxonomies are used in various real-world applications, such as e-commerce search engines and recommendation systems, so they need to be extended over time. However, manually curating taxonomies with new data is difficult because of limited human resources and the exponential growth of data. It is therefore imperative to develop automatic taxonomy expansion methods. Traditional supervised taxonomy expansion approaches struggle in low-resource settings, primarily because existing taxonomies are small; this scarcity of training data often leads to overfitting. In this paper, we propose FLAME, a novel approach for taxonomy expansion in low-resource environments that harnesses LLMs trained on extensive real-world knowledge. LLMs help compensate for the scarcity of domain-specific knowledge. Specifically, FLAME uses few-shot prompting to extract the knowledge inherent in the LLMs and identify the hypernym entities within the taxonomy. It then fine-tunes the LLMs with reinforcement learning, resulting in more accurate predictions. Experiments on three real-world benchmark datasets demonstrate the effectiveness of FLAME, with improvements of 18.5% in accuracy and 12.3% in the Wu & Palmer metric over eight baselines. We further examine the strengths and weaknesses of FLAME through an extensive case study, error analysis, and ablation studies on the benchmarks.
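The Wu & Palmer metric mentioned in the abstract can be illustrated with a small sketch. It uses the commonly stated form of the similarity, 2 · depth(LCA(a, b)) / (depth(a) + depth(b)), applied to a predicted hypernym and the true hypernym. The toy taxonomy, the names `PARENT`, `path_to_root`, and `wu_palmer`, and the convention that the root has depth 1 are all assumptions made for illustration; the paper's exact evaluation code may differ.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy stored as child -> parent; the root maps to None.
PARENT: Dict[str, Optional[str]] = {
    "food": None,
    "fruit": "food",
    "vegetable": "food",
    "citrus": "fruit",
    "apple": "fruit",
    "orange": "citrus",
}


def path_to_root(node: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the nodes from `node` up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def wu_palmer(a: str, b: str, parent: Dict[str, Optional[str]]) -> float:
    """Wu & Palmer similarity between two nodes (assumed root depth = 1)."""
    path_a = path_to_root(a, parent)
    ancestors_b = set(path_to_root(b, parent))
    # The first node on a's upward path that is also an ancestor of b
    # (or b itself) is the lowest common ancestor.
    lca = next(n for n in path_a if n in ancestors_b)

    def depth(n: str) -> int:
        return len(path_to_root(n, parent))

    return 2 * depth(lca) / (depth(a) + depth(b))


if __name__ == "__main__":
    # True hypernym of "orange" is "citrus"; compare three predictions.
    for predicted in ("citrus", "fruit", "vegetable"):
        print(predicted, wu_palmer(predicted, "citrus", PARENT))
```

In this toy setup an exact match scores 1.0, predicting the grandparent `fruit` scores 0.8, and predicting the unrelated sibling branch `vegetable` scores 0.4, so the metric gives partial credit to near misses that plain accuracy would count as errors.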
- BoxE: A Box Embedding Model for Knowledge Base Completion. In Proceedings of the Thirty-Fourth Annual Conference on Advances in Neural Information Processing Systems (NeurIPS), Vol. 33. 9649–9661.
- A non-factoid question-answering taxonomy. In Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 1196–1207.
- SemEval-2016 Task 13: Taxonomy Extraction Evaluation (TExEval-2). In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). Association for Computational Linguistics, San Diego, California, 1081–1091. https://doi.org/10.18653/v1/S16-1168
- Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. Curran Associates, Inc., 1877–1901. https://proceedings.neurips.cc/paper_files/paper/2020/file/1457c0d6bfcb4967418bfb8ac142f64a-Paper.pdf
- Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers). Association for Computational Linguistics, New Orleans, Louisiana, 485–495. https://doi.org/10.18653/v1/N18-1045
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Association for Computational Linguistics, Minneapolis, Minnesota, 4171–4186. https://doi.org/10.18653/v1/N19-1423
- A survey for in-context learning. arXiv preprint arXiv:2301.00234 (2022).
- Sabina Elkins and Ekaterina Kochmar. 2024. How Teachers Can Use Large Language Models and Bloom’s Taxonomy to Create Educational Quizzes. https://synthical.com/article/c4704b31-6966-4d21-9d94-861e483d3367. arXiv:2401.05914 [cs.AI]
- Unsupervised learning of an extensive and usable taxonomy for DBpedia. In Proceedings of the 11th International Conference on Semantic Systems (Vienna, Austria) (SEMANTICS ’15). Association for Computing Machinery, New York, NY, USA, 177–184. https://doi.org/10.1145/2814864.2814881
- Learning Semantic Hierarchies via Word Embeddings. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Association for Computational Linguistics, Baltimore, Maryland, 1199–1209. https://doi.org/10.3115/v1/P14-1113
- Andrew Hale and David Borys. 2013. Working to rule, or working safely? Part 1: A state of the art review. Safety science 55 (2013), 207–221.
- Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th Conference on Computational Linguistics - Volume 2 (Nantes, France) (COLING ’92). Association for Computational Linguistics, USA, 539–545. https://doi.org/10.3115/992133.992154
- LoRA: Low-Rank Adaptation of Large Language Models. In International Conference on Learning Representations. https://openreview.net/forum?id=nZeVKeeFYf9
- TaxoEnrich: Self-Supervised Taxonomy Completion via Structure-Semantic Representations. In Proceedings of the ACM Web Conference 2022 (Virtual Event, Lyon, France) (WWW ’22). Association for Computing Machinery, New York, NY, USA, 925–934. https://doi.org/10.1145/3485447.3511935
- A Single Vector Is Not Enough: Taxonomy Expansion via Box Embeddings. In Proceedings of the ACM Web Conference 2023 (Austin, TX, USA) (WWW ’23). Association for Computing Machinery, New York, NY, USA, 2467–2476. https://doi.org/10.1145/3543507.3583310
- TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Online, 8489–8502. https://doi.org/10.18653/v1/2020.acl-main.751
- The Power of Scale for Parameter-Efficient Prompt Tuning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 3045–3059. https://doi.org/10.18653/v1/2021.emnlp-main.243
- Dekang Lin. 1998. An Information-Theoretic Definition of Similarity. In Proceedings of the Fifteenth International Conference on Machine Learning (ICML ’98). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 296–304.
- Carolyn Lipscomb. 2000. Medical subject headings (MeSH). Bulletin of the Medical Library Association 88 (08 2000), 265–6.
- Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing. ACM Comput. Surv. 55, 9, Article 195 (jan 2023), 35 pages. https://doi.org/10.1145/3560815
- Automatic taxonomy construction from keywords. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Beijing, China) (KDD ’12). Association for Computing Machinery, New York, NY, USA, 1433–1441. https://doi.org/10.1145/2339530.2339754
- TEMP: Taxonomy Expansion with Dynamic Margin Loss through Taxonomy-Paths. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 3854–3863. https://doi.org/10.18653/v1/2021.emnlp-main.313
- AliCoCo: Alibaba E-commerce Cognitive Concept Net. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data (Portland, OR, USA) (SIGMOD ’20). Association for Computing Machinery, New York, NY, USA, 313–327. https://doi.org/10.1145/3318464.3386132
- Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Jian Su, Kevin Duh, and Xavier Carreras (Eds.). Association for Computational Linguistics, Austin, Texas, 403–413. https://doi.org/10.18653/v1/D16-1039
- Producing Usable Taxonomies Cheaply and Rapidly at Pinterest Using Discovered Dynamic μ-Topics. ArXiv abs/2301.12520 (2023). https://api.semanticscholar.org/CorpusID:256389882
- Expanding Taxonomies with Implicit Edge Semantics. In Proceedings of The Web Conference 2020 (Taipei, Taiwan) (WWW ’20). Association for Computing Machinery, New York, NY, USA, 2044–2054. https://doi.org/10.1145/3366423.3380271
- Octet: Online Catalog Taxonomy Enrichment with Self-Supervision. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Virtual Event, CA, USA) (KDD ’20). Association for Computing Machinery, New York, NY, USA, 2247–2257. https://doi.org/10.1145/3394486.3403274
- PATTY: a taxonomy of relational patterns with semantic types. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (Jeju Island, Korea) (EMNLP-CoNLL ’12). Association for Computational Linguistics, USA, 1135–1145.
- TAXI at SemEval-2016 Task 13: a Taxonomy Induction Method based on Lexico-Syntactic Patterns, Substrings and Focused Crawling. In Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016). Association for Computational Linguistics, San Diego, California, 1320–1327. https://doi.org/10.18653/v1/S16-1206
- Emilian Pascalau and Clemens Rath. 2010. Managing business process variants at eBay. In Business Process Modeling Notation: Second International Workshop, BPMN 2010, Potsdam, Germany, October 13-14, 2010. Proceedings 2. Springer, 91–105.
- Stephen E Robertson and Steve Walker. 1994. Some simple effective approximations to the 2-poisson model for probabilistic weighted retrieval. In SIGIR’94: Proceedings of the Seventeenth Annual International ACM-SIGIR Conference on Research and Development in Information Retrieval, organised by Dublin City University. Springer, 232–241.
- Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Association for Computational Linguistics, Melbourne, Australia, 358–363. https://doi.org/10.18653/v1/P18-2057
- Comprehension Based Question Answering using Bloom’s Taxonomy. In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), Anna Rogers, Iacer Calixto, Ivan Vulić, Naomi Saphra, Nora Kassner, Oana-Maria Camburu, Trapit Bansal, and Vered Shwartz (Eds.). Association for Computational Linguistics, Online, 20–28. https://doi.org/10.18653/v1/2021.repl4nlp-1.3
- Trust Region Policy Optimization. In Proceedings of the 32nd International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 37), Francis Bach and David Blei (Eds.). PMLR, Lille, France, 1889–1897. https://proceedings.mlr.press/v37/schulman15.html
- Noam Shazeer and Mitchell Stern. 2018. Adafactor: Adaptive learning rates with sublinear memory cost. In International Conference on Machine Learning. PMLR, 4596–4604.
- TaxoExpan: Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network. In Proceedings of The Web Conference 2020 (Taipei, Taiwan) (WWW ’20). Association for Computing Machinery, New York, NY, USA, 486–497. https://doi.org/10.1145/3366423.3380132
- Learning Syntactic Patterns for Automatic Hypernym Discovery. In Advances in Neural Information Processing Systems, L. Saul, Y. Weiss, and L. Bottou (Eds.), Vol. 17. MIT Press. https://proceedings.neurips.cc/paper_files/paper/2004/file/358aee4cc897452c00244351e4d91f69-Paper.pdf
- Low-resource Taxonomy Enrichment with Pretrained Language Models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Marie-Francine Moens, Xuanjing Huang, Lucia Specia, and Scott Wen-tau Yih (Eds.). Association for Computational Linguistics, Online and Punta Cana, Dominican Republic, 2747–2758. https://doi.org/10.18653/v1/2021.emnlp-main.217
- Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023).
- Enriching Taxonomies With Functional Domain Knowledge. In The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (Ann Arbor, MI, USA) (SIGIR ’18). Association for Computing Machinery, New York, NY, USA, 745–754. https://doi.org/10.1145/3209978.3210000
- Graph Attention Networks. In International Conference on Learning Representations. https://openreview.net/forum?id=rJXMpikCZ
- Denny Vrandečić. 2012. Wikidata: a new platform for collaborative data collection. In Proceedings of the 21st International Conference on World Wide Web (Lyon, France) (WWW ’12 Companion). Association for Computing Machinery, New York, NY, USA, 1063–1064. https://doi.org/10.1145/2187980.2188242
- SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Smaranda Muresan, Preslav Nakov, and Aline Villavicencio (Eds.). Association for Computational Linguistics, Dublin, Ireland, 5039–5059. https://doi.org/10.18653/v1/2022.acl-long.346
- Brendan Wallace and Alastair Ross. 2016. Beyond human error: taxonomies and safety science. CRC Press.
- A phrase mining framework for recursive construction of a topical hierarchy. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Chicago, Illinois, USA) (KDD ’13). Association for Computing Machinery, New York, NY, USA, 437–445. https://doi.org/10.1145/2487575.2487631
- Enquire One’s Parent and Child Before Decision: Fully Exploit Hierarchical Structure for Self-Supervised Taxonomy Expansion. In Proceedings of the Web Conference 2021 (Ljubljana, Slovenia) (WWW ’21). Association for Computing Machinery, New York, NY, USA, 3291–3304. https://doi.org/10.1145/3442381.3449948
- Yue Wang and Shaofeng Zou. 2022. Policy Gradient Method For Robust Reinforcement Learning. In Proceedings of the 39th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 162), Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato (Eds.). PMLR, 23484–23526. https://proceedings.mlr.press/v162/wang22at.html
- Zhibiao Wu and Martha Palmer. 1994. Verb Semantics and Lexical Selection. In 32nd Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, Las Cruces, New Mexico, USA, 133–138. https://doi.org/10.3115/981732.981751
- TaxoPrompt: A Prompt-based Generation Method with Taxonomic Context for Self-Supervised Taxonomy Expansion. In Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Luc De Raedt (Ed.). International Joint Conferences on Artificial Intelligence Organization, 4432–4438. https://doi.org/10.24963/ijcai.2022/615
- STEAM: Self-Supervised Taxonomy Expansion with Mini-Paths. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (Virtual Event, CA, USA) (KDD ’20). Association for Computing Machinery, New York, NY, USA, 1026–1035. https://doi.org/10.1145/3394486.3403145
- TaxoGen: Unsupervised Topic Taxonomy Construction by Adaptive Term Embedding and Clustering. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (London, United Kingdom) (KDD ’18). Association for Computing Machinery, New York, NY, USA, 2701–2709. https://doi.org/10.1145/3219819.3220064
- Calibrate Before Use: Improving Few-shot Performance of Language Models. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 12697–12706. https://proceedings.mlr.press/v139/zhao21c.html
- Taxonomy-driven computation of product recommendations. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management (Washington, D.C., USA) (CIKM ’04). Association for Computing Machinery, New York, NY, USA, 406–415. https://doi.org/10.1145/1031171.1031252