KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge (2405.16412v3)
Abstract: Knowledge Graph Embedding (KGE) techniques are crucial in learning compact representations of entities and relations within a knowledge graph, facilitating efficient reasoning and knowledge discovery. While existing methods typically focus either on training KGE models solely based on graph structure or fine-tuning pre-trained LLMs with classification data in KG, KG-FIT leverages LLM-guided refinement to construct a semantically coherent hierarchical structure of entity clusters. By incorporating this hierarchical knowledge along with textual information during the fine-tuning process, KG-FIT effectively captures both global semantics from the LLM and local semantics from the KG. Extensive experiments on the benchmark datasets FB15K-237, YAGO3-10, and PrimeKG demonstrate the superiority of KG-FIT over state-of-the-art pre-trained LLM-based methods, achieving improvements of 14.4%, 13.5%, and 11.9% in the Hits@10 metric for the link prediction task, respectively. Furthermore, KG-FIT yields substantial performance gains of 12.6%, 6.7%, and 17.7% compared to the structure-based base models upon which it is built. These results highlight the effectiveness of KG-FIT in incorporating open-world knowledge from LLMs to significantly enhance the expressiveness and informativeness of KG embeddings.
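For readers unfamiliar with the Hits@10 metric cited above, the following is a minimal sketch of how Hits@k is commonly computed for link prediction. It is not taken from the KG-FIT code; the function names `rank_of_true` and `hits_at_k`, the "higher score is better" convention, and the optimistic handling of ties are assumptions made for illustration.

```python
from typing import Sequence


def rank_of_true(scores: Sequence[float], true_index: int) -> int:
    """Return the 1-based rank of the true candidate entity.

    Assumes a higher score means a more plausible triple. Ties are
    resolved optimistically: only strictly higher scores push the
    true candidate down the ranking.
    """
    true_score = scores[true_index]
    return 1 + sum(1 for s in scores if s > true_score)


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Fraction of test queries whose true entity is ranked within the top k."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for five test queries; three of them fall within the top 10.
example_ranks = [1, 3, 12, 50, 7]
print(hits_at_k(example_ranks, k=10))
```

In practice, benchmark evaluations usually apply a "filtered" setting that removes other known-true entities from the candidate list before ranking; the sketch omits that step.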