
LLM-empowered knowledge graph construction: A survey

Published 23 Oct 2025 in cs.AI | (2510.20345v1)

Abstract: Knowledge Graphs (KGs) have long served as a fundamental infrastructure for structured knowledge representation and reasoning. With the advent of LLMs, the construction of KGs has entered a new paradigm, shifting from rule-based and statistical pipelines to language-driven and generative frameworks. This survey provides a comprehensive overview of recent progress in LLM-empowered knowledge graph construction, systematically analyzing how LLMs reshape the classical three-layered pipeline of ontology engineering, knowledge extraction, and knowledge fusion. We first revisit traditional KG methodologies to establish conceptual foundations, and then review emerging LLM-driven approaches from two complementary perspectives: schema-based paradigms, which emphasize structure, normalization, and consistency; and schema-free paradigms, which highlight flexibility, adaptability, and open discovery. Across each stage, we synthesize representative frameworks, analyze their technical mechanisms, and identify their limitations. Finally, the survey outlines key trends and future research directions, including KG-based reasoning for LLMs, dynamic knowledge memory for agentic systems, and multimodal KG construction. Through this systematic review, we aim to clarify the evolving interplay between LLMs and knowledge graphs, bridging symbolic knowledge engineering and neural semantic understanding toward the development of adaptive, explainable, and intelligent knowledge systems.

Summary

  • The survey reviews methods that use LLMs to transform traditional KG construction, integrating dynamic ontology engineering and adaptive frameworks.
  • It details LLM-driven extraction techniques and semantic fusion methods that enhance entity alignment and overcome limitations of rule-based approaches.
  • The survey outlines future directions including dynamic knowledge memory, multimodal KG construction, and novel reasoning substrates for robust AI systems.

LLM-Empowered Knowledge Graph Construction: A Technical Survey

Introduction

The paper "LLM-empowered knowledge graph construction: A survey" (2510.20345) provides a detailed examination of how LLMs are reshaping the landscape of Knowledge Graph (KG) construction. KGs are pivotal in knowledge representation, serving as a backbone for many intelligent applications. The paper traces how the advent of LLMs has shifted KG construction from traditional rule-based methods to more dynamic, adaptive frameworks powered by natural language understanding (Figure 1).

Figure 1: Taxonomy of LLM for KGC

Traditional Knowledge Graph Construction Methodologies

Historically, KG construction followed a pipeline of ontology engineering, knowledge extraction, and knowledge fusion. Ontology engineering was heavily reliant on manual efforts, utilizing tools like Protégé for constructing domain ontologies. Knowledge extraction progressed from symbolic methods to leveraging neural architectures like BiLSTM-CRF for enhanced generalization. Knowledge fusion primarily focused on entity alignment through lexical and structural methods, though embedding-based approaches have evolved to address semantic heterogeneity and integration challenges.
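
As a point of contrast with the neural and LLM-based methods discussed below, the classical symbolic approach to extraction amounts to hand-written patterns, one per relation. The toy sketch below (relation names and patterns are illustrative assumptions, not taken from the survey) shows why such rules are precise but brittle: any paraphrase the pattern author did not anticipate is silently missed.

```python
import re

# Minimal sketch of a classical rule-based extractor: one hand-written
# pattern per relation. Precise on matching phrasings, blind to paraphrase.
PATTERNS = {
    "founded_by": re.compile(r"(?P<obj>[A-Z]\w+) founded (?P<subj>[A-Z]\w+)"),
    "located_in": re.compile(r"(?P<subj>[A-Z]\w+) is located in (?P<obj>[A-Z]\w+)"),
}

def extract_triples(text):
    """Return (subject, relation, object) triples matched by the rules."""
    triples = []
    for relation, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            triples.append((m.group("subj"), relation, m.group("obj")))
    return triples

triples = extract_triples("Acme is located in Berlin. Alice founded Acme.")
```

A sentence like "Acme, headquartered in Berlin, ..." would yield nothing, which is exactly the generalization gap that motivated the move to BiLSTM-CRF-style models and, later, LLMs.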

The Role of LLMs in Ontology Engineering

LLMs have introduced transformative approaches to ontology engineering. The paper categorizes these into top-down methodologies, where LLMs act as co-modelers aiding formal ontology construction, and bottom-up approaches, which use LLMs to induce ontological schemas from data, which can in turn enhance LLM reasoning capabilities.

  1. Top-Down Methods: LLMs facilitate competency-question-based ontology generation, as seen in frameworks like Ontogenia, which utilize metacognitive prompting for structured ontology creation. Natural language-based ontology construction leverages LLMs to induce ontologies directly from text, bypassing traditional manual processes.
  2. Bottom-Up Methods: Here, data-driven approaches lead to the automatic derivation of ontological structures. This method supports dynamic schema evolution, a necessity for adapting to new and evolving knowledge domains.
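
The top-down, competency-question-driven pattern can be sketched as a prompt-and-parse loop. Everything below is an illustrative assumption, not the design of Ontogenia or any specific framework: the prompt wording, the line-based output convention, and the canned response standing in for a real model call.

```python
# Hedged sketch of competency-question-driven ontology generation.
# The prompt shape and 'Class:'/'Property:' output format are assumed
# conventions; the LLM call is stubbed with a canned response.
def build_ontology_prompt(domain, competency_questions):
    cq_list = "\n".join(f"- {q}" for q in competency_questions)
    return (
        f"You are an ontology engineer for the domain: {domain}.\n"
        "Propose classes ('Class: <name>') and object properties\n"
        "('Property: <domain> <name> <range>'), one per line, that let a\n"
        "knowledge graph answer these competency questions:\n" + cq_list
    )

def parse_ontology(response):
    """Parse the line-oriented model output into classes and properties."""
    classes, properties = [], []
    for line in response.splitlines():
        line = line.strip()
        if line.startswith("Class:"):
            classes.append(line.split(":", 1)[1].strip())
        elif line.startswith("Property:"):
            properties.append(tuple(line.split(":", 1)[1].split()))
    return {"classes": classes, "properties": properties}

# Canned response standing in for a real chat-completion call:
response = "Class: Composer\nClass: Symphony\nProperty: Composer composed Symphony"
ontology = parse_ontology(response)
```

In a real system the parsed classes and properties would be validated against the competency questions and serialized to OWL or a comparable format rather than kept as Python tuples.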

LLM-Driven Knowledge Extraction

LLMs have enabled two major paradigms in knowledge extraction: schema-based and schema-free methods.

  • Schema-Based Extraction: Early approaches used fixed schemas for structured guidance. However, recent advancements advocate for dynamic schemas that evolve with data, as demonstrated by frameworks like AutoSchemaKG, which fosters scalable, open-domain knowledge extraction.
  • Schema-Free Extraction: This paradigm exploits LLMs' capabilities to derive knowledge without predefined schemas, emphasizing advanced prompt engineering and modular prompting to guide extraction processes.
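
The schema-free paradigm can be sketched as asking the model for open-vocabulary triples with no predefined relation inventory. The prompt text and the JSON output convention below are assumptions for illustration, and the model call itself is replaced by a canned response; the defensive parsing reflects the practical reality that model output is not guaranteed to be well-formed.

```python
import json

# Sketch of schema-free triple extraction: relation names are invented
# by the model rather than drawn from a fixed schema. Prompt wording and
# JSON convention are illustrative assumptions.
def extraction_prompt(text):
    return (
        "Extract all factual (subject, relation, object) triples from the "
        "text below. Invent relation names as needed and respond with a "
        f"JSON list of 3-element lists only.\n\nText: {text}"
    )

def parse_triples(raw_response):
    """Tolerantly parse model output, skipping malformed rows."""
    try:
        rows = json.loads(raw_response)
    except json.JSONDecodeError:
        return []
    return [tuple(r) for r in rows if isinstance(r, list) and len(r) == 3]

# Canned response standing in for the model:
triples = parse_triples('[["Marie Curie", "discovered", "polonium"], ["bad"]]')
```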

LLM-Powered Knowledge Fusion

Knowledge fusion at the schema level aims to unify the structural backbone of KGs, while at the instance level, it deals with entity alignment and integration. The survey discusses the evolution from rigid ontology-driven fusion to LLM-enabled canonicalization, which promotes automated and semantically precise fusion processes.

  • Schema-Level Fusion: LLMs help align heterogeneous schemas into a consistent framework, as seen in the EDC framework, which supports self-alignment and cross-schema mapping.
  • Instance-Level Fusion: Contemporary approaches utilize LLMs for contextual reasoning and semantic discrimination, enhancing entity alignment precision through methodologies like LLM-Align and EntGPT.
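
A common shape for instance-level fusion is a two-stage loop: cheap lexical blocking proposes candidate pairs, then an LLM judge confirms true matches. The sketch below is a generic illustration of that shape, not the actual design of LLM-Align or EntGPT; the similarity threshold, entity names, and the `same_entity` callback (standing in for an LLM verification call) are all assumptions.

```python
from difflib import SequenceMatcher

# Two-stage entity alignment sketch: lexical blocking, then an
# LLM-backed judge (stubbed here) for semantic discrimination.
def candidate_pairs(entities_a, entities_b, threshold=0.6):
    """Keep pairs whose string similarity clears an assumed threshold."""
    pairs = []
    for a in entities_a:
        for b in entities_b:
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
                pairs.append((a, b))
    return pairs

def align(entities_a, entities_b, same_entity):
    """Keep only the candidates the (LLM-backed) judge confirms."""
    return [(a, b) for a, b in candidate_pairs(entities_a, entities_b)
            if same_entity(a, b)]

# An always-yes judge stands in for a real LLM verification call:
matches = align(["Barack Obama", "Paris"], ["B. Obama", "London"],
                lambda a, b: True)
```

The blocking stage keeps the quadratic pair comparison cheap so that the expensive LLM judgment is invoked only on plausible candidates.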

Future Directions

The survey highlights several future research avenues:

  • Knowledge Graph-Based Reasoning for LLMs: There is a growing interest in leveraging KGs for enhancing LLM reasoning, enabling better interpretability and logical consistency.
  • Dynamic Knowledge Memory: KGs are envisioned as dynamic memory constructs within agentic systems, facilitating continuous learning and interaction.
  • Multimodal Knowledge Graph Construction: Efforts are directed toward integrating various data modalities into cohesive KGs, enhancing reasoning across different data inputs.
  • Beyond Retrieval-Augmented Generation (RAG): Future frameworks may explore KGs as interactive reasoning substrates, enhancing the robustness and explainability of generative models.
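
One concrete reading of "KGs as interactive reasoning substrates" is explicit multi-hop traversal whose resulting path is then verbalized as context for a generative model, rather than retrieving flat text chunks. The toy graph and BFS below are illustrative assumptions, not a framework described in the survey.

```python
from collections import deque

# Sketch: treat the KG as a traversable structure. A BFS recovers the
# relation path between two entities; the path could then be verbalized
# as grounded, inspectable LLM context. Toy data for illustration only.
def find_path(edges, start, goal):
    """BFS over (head, relation, tail) edges; returns the edge path or None."""
    adj = {}
    for h, r, t in edges:
        adj.setdefault(h, []).append((r, t))
    queue, seen = deque([(start, [])]), {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for r, t in adj.get(node, []):
            if t not in seen:
                seen.add(t)
                queue.append((t, path + [(node, r, t)]))
    return None

edges = [("Marie Curie", "born_in", "Warsaw"), ("Warsaw", "capital_of", "Poland")]
path = find_path(edges, "Marie Curie", "Poland")
```

Because the path itself is explicit, the model's answer can be audited hop by hop, which is the explainability advantage the survey attributes to KG-grounded reasoning.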

Conclusion

The paper effectively encapsulates the transition to LLM-driven frameworks that champion adaptability and integration of language understanding with structured reasoning. While significant strides have been made, challenges in scalability and reliability remain. Addressing these through innovative prompt design, multimodal integration, and advanced reasoning methodologies is crucial for the development of autonomous, explainable knowledge systems.

