- The paper introduces a workflow in which LLMs and human experts jointly complete knowledge graphs for personalized curriculum modeling.
- It outlines a detailed ontology with curriculum, domain, and user models that structure educational content and enhance semantic linking.
- Expert evaluation shows extraction F1 scores of about 0.98 for most categories, and the linked graph has higher degree centrality and lower modularity, which supports its use for personalized learning recommendations.
Use of LLMs in Knowledge Graph Completion for Personalized Education
Methodology Overview
The paper proposes a methodology using LLMs to enhance the personalization of higher education by completing Knowledge Graphs (KGs) for curriculum and domain modeling. The primary objective is to address the challenges in personalizing learning paths by constructing a detailed, interconnected model of university courses and domains. This involves a collaborative process in which human experts and LLMs work together to extract and classify educational content into a structured knowledge base.
Ontology Definition
The approach begins with defining a specialized ontology composed of three models: the curriculum model, the domain model, and the user model. The curriculum model breaks educational content down into topics and sub-topics for a granular representation, the domain model groups these topics into broader knowledge areas, and the user model captures individual learner profiles so that recommendations can be tailored to each learner's context.
Figure 1: Proposed ontological structure of the knowledge graph, with domain, curriculum, and user models.
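The three models described above can be pictured as a handful of linked record types. The following is only a minimal sketch, not the paper's schema: the class names, fields, and the helper functions `topics_by_domain` and `open_topics` are hypothetical, and the example topics are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SubTopic:
    name: str


@dataclass
class Topic:
    # Curriculum model: a topic with optional sub-topics.
    name: str
    # Domain model: the broader knowledge area this topic belongs to.
    domain: str
    sub_topics: list[SubTopic] = field(default_factory=list)


@dataclass
class Module:
    title: str
    topics: list[Topic] = field(default_factory=list)


@dataclass
class Learner:
    # User model: a (hypothetical) profile holding topics already covered.
    name: str
    completed_topics: set[str] = field(default_factory=set)


def topics_by_domain(modules: list[Module]) -> dict[str, list[str]]:
    """Group topic names under the knowledge area they are assigned to."""
    grouped: dict[str, list[str]] = {}
    for module in modules:
        for topic in module.topics:
            grouped.setdefault(topic.domain, []).append(topic.name)
    return grouped


def open_topics(learner: Learner, modules: list[Module]) -> list[str]:
    """List topics the learner has not completed yet, in curriculum order."""
    return [
        topic.name
        for module in modules
        for topic in module.topics
        if topic.name not in learner.completed_topics
    ]


# Invented example data, for illustration only.
module = Module(
    "Embedded Systems",
    [
        Topic("Interrupts", "Computer Architecture", [SubTopic("Interrupt vectors")]),
        Topic("Timers", "Computer Architecture"),
    ],
)
learner = Learner("alice", {"Interrupts"})
print(topics_by_domain([module]))
print(open_topics(learner, [module]))
```

Keeping the domain as a plain attribute of each topic is the simplest option for a sketch; in a real graph store it would more likely be its own node type connected to topics by an edge.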
Content Extraction and Human-AI Collaboration
A central component of the methodology is the LLM-driven pipeline for content extraction, which covers transcription, classification, and integration into the KG. The pipeline uses OpenAI's Whisper model to transcribe lecture materials and GPT-4 to classify the resulting text into topics. Human validation is built into the process: teachers review and refine the machine-generated results so that the output stays academically sound while still benefiting from the efficiency of automation.
Figure 2: Pipeline components for the transcription, extraction, classification, and KG construction, based on a human-AI collaborative approach.
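The flow of this pipeline, with the human review step acting as a gate before anything reaches the KG, could be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the transcription and classification steps are passed in as plain callables and replaced here by stand-in stubs, so no real Whisper or GPT-4 call is made, and all names (`Candidate`, `run_pipeline`, the stub functions, the file name) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Candidate:
    """A machine-proposed topic awaiting teacher review."""
    topic: str
    sub_topics: list[str] = field(default_factory=list)
    approved: bool = False


def run_pipeline(
    media_path: str,
    transcribe: Callable[[str], str],
    classify: Callable[[str], list[Candidate]],
    review: Callable[[Candidate], Optional[Candidate]],
) -> list[Candidate]:
    """Transcribe, classify, then keep only what the reviewer accepts."""
    transcript = transcribe(media_path)
    accepted: list[Candidate] = []
    for candidate in classify(transcript):
        reviewed = review(candidate)  # reviewer may edit or reject (None)
        if reviewed is None:
            continue
        reviewed.approved = True
        accepted.append(reviewed)
    return accepted


# Stand-in stubs: a real system would call a speech-to-text model and an LLM.
def fake_transcribe(media_path: str) -> str:
    return "interrupts; timers; unrelated chatter"


def fake_classify(transcript: str) -> list[Candidate]:
    return [Candidate(topic=part.strip().title()) for part in transcript.split(";")]


def teacher_review(candidate: Candidate) -> Optional[Candidate]:
    # The teacher rejects content that is not part of the curriculum.
    if candidate.topic == "Unrelated Chatter":
        return None
    return candidate


approved = run_pipeline("lecture_01.mp4", fake_transcribe, fake_classify, teacher_review)
print([c.topic for c in approved])
```

Passing the three stages as functions keeps the review gate explicit: nothing is marked approved unless the reviewer callable returns it.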
Semantic Linking and Knowledge Graph Construction
Once vetted by human reviewers, the educational content is incorporated into the KG. This involves creating nodes for topics and establishing semantic relations that mirror academic links across courses. The LLM also generates these relations, which increases the graph's connectivity while keeping the links thematically relevant. NLP-based semantic similarity measures complement this step by estimating how closely topics from different lectures are related.
Figure 3: The structure of the KG before (left) and after (right) connecting both evaluation modules through semantic Topic and Sub-Topic relations.
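To make the cross-module linking step concrete, here is a small sketch that proposes a relation between two topics when their descriptions are similar enough. The summary does not say which similarity measure the paper uses, so this example swaps in a simple bag-of-words cosine similarity as a stand-in; the threshold value, the topic descriptions, and the function names are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def link_topics(
    topics: dict[str, tuple[str, str]], threshold: float = 0.3
) -> list[tuple[str, str, float]]:
    """Propose links between topics of *different* modules.

    `topics` maps a topic name to (module title, description).
    """
    links = []
    for (name_a, (mod_a, desc_a)), (name_b, (mod_b, desc_b)) in combinations(
        topics.items(), 2
    ):
        if mod_a == mod_b:
            continue  # only cross-module relations are of interest here
        score = cosine(tokens(desc_a), tokens(desc_b))
        if score >= threshold:
            links.append((name_a, name_b, score))
    return links


# Invented descriptions, for illustration only.
topics = {
    "Interrupt handling": (
        "Embedded Systems",
        "interrupt service routines and timer interrupts on a microcontroller",
    ),
    "Timer peripherals": (
        "Embedded Systems",
        "hardware timer counters and timer interrupts",
    ),
    "FPGA timer design": (
        "Development of Embedded Systems Using FPGA",
        "implementing hardware timer counters in FPGA logic",
    ),
}
for a, b, score in link_topics(topics):
    print(f"{a} <-> {b}: {score:.2f}")
```

In keeping with the human-AI workflow, links proposed this way would be candidates for review rather than final edges.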
Evaluation and Results
The methodology was evaluated using content from two university modules: Embedded Systems and Development of Embedded Systems Using FPGA. Expert evaluation of the content extraction step, measured by precision and recall, yielded F1 scores close to 0.98 for most categories, which indicates robust extraction. Although the sample is small, the KG's structure improved once the two modules were linked: degree centrality increased and modularity decreased, suggesting tighter integration of content across modules and a better basis for learning path recommendations.
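For readers less familiar with these metrics, the sketch below shows how an F1 score is derived from precision and recall and how degree centrality responds when a cross-module link is added. The counts and the toy graph are invented for illustration and are not the paper's data; modularity is left out because it needs a community partition that the summary does not describe.

```python
from collections import defaultdict


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def degree_centrality(edges: list[tuple[str, str]]) -> dict[str, float]:
    """Degree of each node divided by (n - 1), for an undirected graph."""
    degree: dict[str, int] = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    n = len(degree)
    if n < 2:
        return {node: 0.0 for node in degree}
    return {node: d / (n - 1) for node, d in degree.items()}


def mean(values) -> float:
    values = list(values)
    return sum(values) / len(values)


# Hypothetical counts: 49 correct extractions, 1 spurious, 1 missed.
print(f"F1 = {f1_score(tp=49, fp=1, fn=1):.2f}")

# Toy graph: two modules, first unconnected, then joined by one topic link.
before = [("A1", "A2"), ("B1", "B2")]
after = before + [("A2", "B1")]
print(f"mean degree centrality before: {mean(degree_centrality(before).values()):.2f}")
print(f"mean degree centrality after:  {mean(degree_centrality(after).values()):.2f}")
```

With these made-up counts, precision and recall are both 0.98, so F1 is 0.98 as well; in the toy graph a single cross-module edge raises the mean degree centrality from about 0.33 to 0.50.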
Conclusion
The paper illustrates the potential of LLMs to transform how educational content is organized and personalized. By combining AI-driven content extraction with human validation, it balances efficiency with accuracy and can serve diverse educational needs. The methodology not only refines how the curriculum is presented but also increases pedagogical flexibility by enabling instructors to deliver highly personalized learning experiences. Looking forward, extending the framework to larger datasets and a wider range of academic disciplines could broaden its impact on personalized education.