
Building Contextual Knowledge Graphs for Personalized Learning Recommendations using Text Mining and Semantic Graph Completion (2401.13609v1)

Published 24 Jan 2024 in cs.IR

Abstract: Modelling learning objects (LO) within their context enables the learner to advance from a basic, remembering-level, learning objective to a higher-order one, i.e., a level with an application- and analysis objective. While hierarchical data models are commonly used in digital learning platforms, using graph-based models enables representing the context of LOs in those platforms. This leads to a foundation for personalized recommendations of learning paths. In this paper, the transformation of hierarchical data models into knowledge graph (KG) models of LOs using text mining is introduced and evaluated. We utilize custom text mining pipelines to mine semantic relations between elements of an expert-curated hierarchical model. We evaluate the KG structure and relation extraction using graph quality-control metrics and the comparison of algorithmic semantic-similarities to expert-defined ones. The results show that the relations in the KG are semantically comparable to those defined by domain experts, and that the proposed KG improves representing and linking the contexts of LOs through increasing graph communities and betweenness centrality.

Authors (6)
  1. Hasan Abu-Rasheed (8 papers)
  2. Mareike Dornhöfer (2 papers)
  3. Christian Weber (15 papers)
  4. Gábor Kismihók (13 papers)
  5. Ulrike Buchmann (1 paper)
  6. Madjid Fathi (12 papers)
Citations (8)

Summary

The paper "Building Contextual Knowledge Graphs for Personalized Learning Recommendations using Text Mining and Semantic Graph Completion" addresses the limitations of hierarchical data models commonly used in digital learning platforms for enabling personalized learning. These hierarchical structures (e.g., curriculum -> course -> topic -> material) effectively represent the detail of a learning objective but struggle to capture the broader context of Learning Objects (LOs) and their relations across different domains or curricula, which is crucial for personalized recommendations beyond simple recall-level learning.

The core proposal is to transform these hierarchical data models into Knowledge Graphs (KGs) to better represent the context of LOs and facilitate personalized learning recommendations. The KG represents LOs as nodes and relations between them as edges. The transformation involves not only preserving the original hierarchical relations but also extracting new semantic relations between LOs based on their textual descriptions.

The practical implementation involves a customized Text Mining Pipeline (TMP) designed to analyze the textual metadata (titles and descriptions) of LOs, which can be in multiple languages (specifically English and German in this work). The TMP includes steps for:

  1. Language Detection: Identifying the language of the text using a pre-trained model.
  2. Text Cleaning: Removing special characters while preserving sentence structure.
  3. Topic Extraction: For longer description texts, KeyBERT is used to extract main topics. This step is not performed for shorter titles.
  4. Text Embedding: Generating vector representations (embeddings) for titles and for the topics extracted from descriptions, using Sentence-BERT and the spaCy library. For multilingual handling, German titles are translated to English with the DeepL API before the embeddings are computed, so that scores are comparable across languages.
  5. Semantic Similarity Calculation: Computing the cosine similarity between text embeddings (titles against titles, and description topic sets against description topic sets). For descriptions, an intersection matrix of topics determines which topic pairs are compared, and the final similarity is the average across the relevant topic pairs.
  6. Relation Creation: Based on these similarity scores and experimentally defined, expert-tuned thresholds, a new type of relation, "has_semantic_relation_to", is created between LO nodes in the KG.
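
Steps 5 and 6 can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are hand-made toy vectors standing in for Sentence-BERT output, the LO identifiers are hypothetical, and the threshold of 0.8 is an arbitrary placeholder for the expert-tuned values, which the summary does not report.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mine_semantic_relations(embeddings, threshold):
    """Return (lo_a, relation, lo_b, score) for every LO pair scoring at or above threshold."""
    relations = []
    # Sort by LO identifier so the pair order is deterministic.
    for (lo_a, vec_a), (lo_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            relations.append((lo_a, "has_semantic_relation_to", lo_b, score))
    return relations


# Toy title embeddings (assumption: real vectors would come from an embedding model).
toy_embeddings = {
    "lo_python_basics": [1.0, 0.0, 0.0],
    "lo_intro_programming": [0.9, 0.1, 0.0],
    "lo_welding_safety": [0.0, 0.0, 1.0],
}

# Hypothetical threshold; the paper tunes thresholds experimentally with experts.
relations = mine_semantic_relations(toy_embeddings, threshold=0.8)
for lo_a, relation, lo_b, score in relations:
    print(f"{lo_a} --{relation}--> {lo_b} ({score:.3f})")
```

Only the two programming-related LOs are linked here; the unrelated one stays without a semantic edge, which is the filtering effect the threshold is meant to have.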

The KG is then constructed by integrating these newly mined semantic relations alongside the existing hierarchical relations (Journey-Course, Course-Topic, etc.). This creates a graph where LOs are nodes, and edges represent either hierarchical connections or semantic connections found through text analysis. This allows the KG to model cross-domain or cross-curriculum connections between LOs that share a similar context, even if they are not directly linked in the original hierarchy.
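
As a rough sketch of this integration step, the graph can be held as an adjacency list with typed edges. The node identifiers and the hierarchical relation names (`has_course`, `has_topic`) are hypothetical; only `has_semantic_relation_to` is named in the summary, and treating it as symmetric is an assumption made here for illustration.

```python
from collections import defaultdict


def build_kg(hierarchical_edges, semantic_edges):
    """Merge typed hierarchical edges with mined semantic edges into one adjacency list."""
    graph = defaultdict(list)
    # Hierarchical relations are kept as directed parent -> child edges.
    for parent, relation, child in hierarchical_edges:
        graph[parent].append((relation, child))
    # Assumption: semantic relations are symmetric, so they are stored in both directions.
    for source, target in semantic_edges:
        graph[source].append(("has_semantic_relation_to", target))
        graph[target].append(("has_semantic_relation_to", source))
    return dict(graph)


# Two separate Journeys from a toy hierarchy (all names are illustrative).
hierarchical_edges = [
    ("journey_it", "has_course", "course_programming"),
    ("course_programming", "has_topic", "topic_data_analysis"),
    ("journey_business", "has_course", "course_controlling"),
    ("course_controlling", "has_topic", "topic_reporting"),
]
# One mined relation that crosses the Journey boundary.
semantic_edges = [("topic_data_analysis", "topic_reporting")]

kg = build_kg(hierarchical_edges, semantic_edges)
print(kg["topic_data_analysis"])
```

The semantic edge is the only link between the two Journeys in this toy graph, which mirrors the cross-curriculum connections described above.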

The evaluation assesses both the quality of the extracted semantic relations and the resulting KG structure.

  • Semantic Relation Evaluation: This is done quantitatively by comparing the semantic similarity scores of the newly extracted relations between LOs from different Journeys with the average semantic similarity within expert-curated Journeys. The assumption is that if relations across Journeys score about as high as relations within Journeys, they are likely meaningful. The results showed that 79% of the mined semantic relations had similarity scores comparable to or higher than the average within-Journey similarity, indicating their relevance.
  • KG Structure Evaluation: Standard graph quality metrics are used to compare the constructed KG with the original hierarchical model. The selected metrics, interpreted in the context of technology-enhanced learning (TEL) and vocational education and training (VET), include:
    • Average Degree Centrality (ADC): Increased from 1.079 (hierarchical) to 2.262 (KG), indicating better connectedness of LOs to other potential learning contexts.
    • Clustering Coefficient (CC): Reported through community structure. The number of communities increased from 253 to 541, suggesting that more distinct learning contexts are represented. The average modularity decreased from 0.779 to 0.636, meaning that more edges now cross community boundaries, so the communities are better interconnected.
    • Weakly Connected Components (WCC): Reduced from 63 to 35, showing that fewer groups of LOs are isolated, improving overall graph connectivity.
    • Betweenness Centrality (BC): Increased significantly from 1.57 to 15.1, highlighting the emergence of nodes (LOs) that act as crucial bridges between different parts of the graph, which can represent transferable skills or concepts connecting disparate learning goals.
  • Qualitative Expert Evaluation: Focus groups with VET and researcher-training experts validated the approach. Experts confirmed the importance of connecting learning goals and recognized KGs as valuable contextual models. They agreed that textual descriptions are rich sources for finding contextual relations. Feedback also highlighted the need for content creators to understand the underlying process to improve descriptions and the necessity for continuous KG updates due to the dynamic nature of learning domains and learner contexts.
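
Two of the quantitative checks above can be sketched on toy data. This is an illustration under stated assumptions, not the paper's evaluation code: "comparable" is modelled as an optional `tolerance` below the within-Journey mean (a hypothetical parameter), plain average degree stands in for ADC (the paper's exact normalization is not given in the summary), and all scores and node names are invented.

```python
from collections import defaultdict
from statistics import mean


def share_comparable(cross_scores, within_scores, tolerance=0.0):
    """Fraction of cross-Journey scores at or above the within-Journey mean minus tolerance."""
    baseline = mean(within_scores) - tolerance
    return sum(score >= baseline for score in cross_scores) / len(cross_scores)


def average_degree(nodes, edges):
    """Mean number of edge endpoints per node, ignoring edge direction."""
    degree = dict.fromkeys(nodes, 0)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sum(degree.values()) / len(degree)


def count_weak_components(nodes, edges):
    """Count weakly connected components with an iterative depth-first search."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(neighbours[node] - seen)
    return components


# Toy hierarchy: two Journeys, each with one course and one topic.
nodes = ["j1", "c1", "t1", "j2", "c2", "t2"]
hierarchy = [("j1", "c1"), ("c1", "t1"), ("j2", "c2"), ("c2", "t2")]
semantic = [("t1", "t2")]  # one mined cross-Journey relation

print(count_weak_components(nodes, hierarchy), count_weak_components(nodes, hierarchy + semantic))
print(average_degree(nodes, hierarchy), average_degree(nodes, hierarchy + semantic))

# Invented similarity scores, used only to exercise the comparison.
share = share_comparable([0.75, 0.5, 0.875, 0.25], within_scores=[0.5, 0.75])
print(share)
```

In the toy graph, adding a single semantic edge merges the two isolated components and raises the average degree, the same direction of change the summary reports for WCC and ADC.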

In summary, the paper presents a practical approach to building contextual KGs for personalized learning by leveraging text mining on LO metadata. The method, including a multilingual TMP, successfully transforms rigid hierarchical structures into a more flexible and interconnected graph representation that better captures the contextual relationships between LOs. The evaluation demonstrates that the resulting KG exhibits structural properties conducive to personalized recommendations by enhancing connectivity and identifying bridging concepts.

Implementation considerations include the dependency on the quality and volume of existing textual data for LOs, the need for efficient processing of multilingual text, and managing potential redundancy when recommending multilingual content. Future work involves improving TMP robustness for sparse data and incorporating additional domain-specific features for richer contextualization.
