- The paper introduces a novel framework that integrates LLM-derived item embeddings and heterogeneous knowledge graphs to learn stable user personas for improved session recommendations.
- The paper shows that KG-grounded persona embeddings significantly boost candidate recall by leveraging structured multi-relational data in sparse and cold-start scenarios.
- The paper validates the approach on Amazon datasets, demonstrating enhanced hit rates and ranking quality when combining long-term persona signals with session-specific models.
Persona-Driven Session-Based Recommendation via LLMs and Heterogeneous Knowledge Graphs
Motivation and Contributions
Session-based recommender systems (SBRS) are optimized for capturing transient user intent from short sequences of item interactions. However, these systems frequently assume session anonymity, which severely constrains their ability to deliver robust personalization—particularly in sparse and cold-start regimes where long-term behavioral data is unavailable or unreliable. Existing SBRS approaches largely rely on sequential modeling or input-side augmentation, inadequately separating enduring user characteristics from ephemeral preferences.
This paper presents a framework that addresses this gap by integrating heterogeneous knowledge graphs (KGs) and LLMs for user modeling. The central claim is that stable user personas—representing domain-specific affinities, stylistic or topical interests—can be inferred from the multi-relational structure of a KG which aggregates item interactions, item-attribute associations, and external metadata (e.g., DBpedia). The KG is initialized with LLM-derived item embeddings to encode fine-grained semantic signals, and user personas are learned via Heterogeneous Deep Graph Infomax (HDGI), enabling unsupervised capture of high-order relational semantics.
A two-stage architecture is proposed: (i) Personalized Information Extraction constructs the KG and learns persona embeddings; (ii) Personalized Information Utilization incorporates these persona representations, together with semantic item embeddings, into a retrieval architecture, followed by reranking with a base sequential model to emphasize short-term session-specific intent.
Figure 1: Overview of the persona-driven session-based recommendation framework leveraging the heterogeneous knowledge graph as user persona backbone.
Framework Overview
The pipeline begins with KG construction, integrating user-item interactions, item-to-item structural relations, item-feature links, and semantic metadata. Item and attribute nodes are encoded using LLMs (Qwen-family), while user embeddings are initialized randomly and refined through HDGI contrastive learning. The heterogeneous KG thus serves as the substrate for unsupervised persona inference, grounded in multi-hop relational context and textual semantics.
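The contrastive idea behind this step can be illustrated with a minimal NumPy sketch of a Deep Graph Infomax-style objective. It is a simplified illustration, not the paper's implementation: it averages a single mean-aggregation layer over per-relation adjacency matrices, whereas HDGI itself works with meta-path-based adjacencies and attention. The names `encode` and `infomax_loss`, the toy graph, and all dimensions are assumptions, and the random item features stand in for LLM-derived embeddings. Only the forward pass is shown; training would use an autodiff framework.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def encode(features, adjacencies, weight):
    """One mean-aggregation layer per relation type, averaged across relations."""
    per_relation = []
    for adj in adjacencies:
        a = adj + np.eye(adj.shape[0])        # self-loops keep isolated nodes defined
        a = a / a.sum(axis=1, keepdims=True)  # row-normalize: mean over neighbours
        per_relation.append(np.maximum(a @ features @ weight, 0.0))  # ReLU
    return np.mean(per_relation, axis=0)

def infomax_loss(features, adjacencies, weight, bilinear, rng):
    """Binary cross-entropy between real and corrupted (node, summary) pairs."""
    h_real = encode(features, adjacencies, weight)
    # Corruption: shuffle feature rows so nodes no longer match their neighbourhoods.
    shuffled = features[rng.permutation(features.shape[0])]
    h_fake = encode(shuffled, adjacencies, weight)
    summary = sigmoid(h_real.mean(axis=0))         # graph-level readout
    p_real = sigmoid(h_real @ bilinear @ summary)  # discriminator on real nodes
    p_fake = sigmoid(h_fake @ bilinear @ summary)  # discriminator on corrupted nodes
    eps = 1e-9
    loss = -0.5 * (np.log(p_real + eps).mean() + np.log(1.0 - p_fake + eps).mean())
    return loss, h_real

# Toy graph (hypothetical): nodes 0-1 are users, nodes 2-4 are items.
rng = np.random.default_rng(0)
n_nodes, in_dim, out_dim = 5, 4, 3
features = rng.normal(size=(n_nodes, in_dim))  # placeholder for LLM item embeddings
user_item = np.zeros((n_nodes, n_nodes))
for u, i in [(0, 2), (0, 3), (1, 3), (1, 4)]:
    user_item[u, i] = user_item[i, u] = 1.0
item_item = np.zeros((n_nodes, n_nodes))
item_item[2, 3] = item_item[3, 2] = 1.0
weight = rng.normal(scale=0.5, size=(in_dim, out_dim))
bilinear = rng.normal(scale=0.5, size=(out_dim, out_dim))

loss, node_emb = infomax_loss(features, [user_item, item_item], weight, bilinear, rng)
persona_emb = node_emb[:2]  # rows for the two user nodes
```

Minimizing this loss pushes each node embedding to agree with the graph summary only when its features match its real neighbourhood, which is how user rows can absorb multi-hop relational context without labels.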
Personalized Information Extraction comprises two steps: constructing the heterogeneous KG, then learning user persona embeddings on it with HDGI.
Personalized Information Utilization integrates user persona embeddings with session context into a data-driven retriever, followed by reranking with SASRec to model session-specific intent. This hybrid architecture ensures that candidate retrieval is guided by stable, KG-grounded preferences while final item ranking emphasizes recent behavioral relevance.
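The retrieve-then-rerank flow can be sketched as follows. This is an illustration under stated assumptions rather than the paper's retriever: the additive fusion of persona and mean session embedding, the shared embedding dimension, and the function names `retrieve` and `rerank` are all hypothetical, and `sequential_scorer` is a placeholder for a trained SASRec model.

```python
import numpy as np

def retrieve(persona, session_items, item_emb, k):
    """Stage 1: top-k item ids by dot product with a fused query.

    Assumption: the query is the persona vector plus the mean embedding of
    the session items; the source does not specify the fusion.
    """
    query = persona + item_emb[session_items].mean(axis=0)
    scores = item_emb @ query
    return np.argsort(-scores, kind="stable")[:k]

def rerank(candidates, session_items, sequential_scorer):
    """Stage 2: reorder the candidates with a session-aware scorer.

    `sequential_scorer(session_items, candidates)` returns one score per
    candidate; in the described framework this role is played by SASRec.
    """
    scores = sequential_scorer(session_items, candidates)
    return candidates[np.argsort(-scores, kind="stable")]
```

Keeping the two stages separate means the persona only has to put the right item somewhere in the candidate set, while the sequential model decides its final position.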
Experimental Results
Experiments were conducted on the Amazon Books and Amazon Movies datasets. When KG-derived persona embeddings were injected into the retrieval module, the system consistently outperformed sequential baselines, even in highly sparse settings. The observed improvement in Hit Rate (HR@100) indicates that the KG architecture enables retrieval of more relevant items, especially long-tail candidates connected via structured relations.
Notably, KG-based user preference embeddings drove substantial gains in candidate recall, although the impact on top-rank precision (HR@10) was attenuated in the high-sparsity Books dataset. This differentiation highlights how KG-grounded representations improve overall recall capability by leveraging structured knowledge, even if short-term session signals are weak.
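The gap between HR@100 and HR@10 is easier to see with the metric written out. The sketch below assumes the common next-item protocol with one held-out target per session; the summary does not state the paper's exact evaluation setup, and the function name and toy lists are illustrative.

```python
def hit_rate_at_k(ranked_lists, targets, k):
    """Fraction of sessions whose held-out target appears in the top-k list."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets) if target in ranked[:k])
    return hits / len(targets)

# Toy example: targets sit at ranks 3 and 4, so they count only for larger k.
ranked_lists = [[5, 3, 9, 1], [2, 7, 4, 8]]
targets = [9, 8]
hr_at_2 = hit_rate_at_k(ranked_lists, targets, k=2)
hr_at_3 = hit_rate_at_k(ranked_lists, targets, k=3)
hr_at_4 = hit_rate_at_k(ranked_lists, targets, k=4)
```

A retriever can raise the hit rate at a large cutoff by surfacing the target anywhere in the list, while leaving the hit rate at a small cutoff unchanged if the target is not also pushed toward the top.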
After candidate generation, reranking with SASRec improved ranking quality (MRR, NDCG) by modeling session-level dependencies. This supports the thesis that combining stable persona signals with sequential modeling improves performance across multiple recommendation metrics.
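For reference, the two ranking metrics can be computed as in the sketch below. It assumes a single held-out target per session, in which case the ideal DCG is 1 and NDCG reduces to 1 / log2(rank + 1). The function names are illustrative and the paper's exact cutoffs are not given in this summary.

```python
import math

def _rank_of(ranked, target, k):
    """1-based rank of target within the top-k list, or None if absent."""
    top_k = ranked[:k]
    return top_k.index(target) + 1 if target in top_k else None

def mrr_at_k(ranked_lists, targets, k):
    """Mean reciprocal rank; sessions that miss the target contribute 0."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        rank = _rank_of(ranked, target, k)
        if rank is not None:
            total += 1.0 / rank
    return total / len(targets)

def ndcg_at_k(ranked_lists, targets, k):
    """NDCG with one relevant item per session: 1 / log2(rank + 1) on a hit."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        rank = _rank_of(ranked, target, k)
        if rank is not None:
            total += 1.0 / math.log2(rank + 1)
    return total / len(targets)
```

Unlike hit rate, both metrics reward moving an already-retrieved target closer to the top, which is the effect attributed to the reranking stage.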
Theoretical and Practical Implications
The proposed KG-enhanced persona modeling addresses several limitations of prior SBRS systems:
- Interpretability: KG-grounded personas are structurally interpretable and transferable, contrasting with opaque embeddings from purely sequential models or LLM-only representations.
- Scalability: The method sidesteps the computational bottleneck of generating user-side representations exclusively with LLMs, leveraging unsupervised HDGI for efficient persona induction.
- Relational Grounding: Structured multi-relational links allow the model to generalize across cold-start scenarios and sparsity regimes, supporting robust retrieval of long-tail items.
The approach signals a shift from sequence-only SBRS toward unified architectures that combine structured knowledge with semantic representation learning, with the aim of making personalization more robust in sparse and cold-start settings.
Future Directions
Potential extensions include:
- Dynamic Persona Evolution: Modeling temporal dynamics in KG-derived personas to reflect preference drift.
- Contextual KG Integration: Adapting KG construction to incorporate context-dependent attributes or external event signals.
- Multi-domain Generalization: Leveraging shared entity spaces and relations for cross-domain recommendation transfer.
- KG-Augmented CoT Reasoning: Using KG-augmented prompts to support chain-of-thought LLM reasoning for enhanced transparency and natural language explanations.
Conclusion
The paper introduces a hybrid approach for session-based recommendation, integrating heterogeneous knowledge graphs and LLM-derived item semantics to infer and utilize stable user personas. KG-grounded personas consistently improve candidate retrieval, and recommendation quality improves further after sequential reranking, particularly in sparse or anonymous session contexts. By bridging structured relational knowledge with sequence modeling, the framework advances the state of the art in interpretable, scalable, and effective personalization (2604.06928).