- The paper presents a novel framework combining translational models with hyperplane projections to jointly model recommendation and KG completion.
- It introduces TUP, which induces latent user preferences as translation vectors between users and items, and KTUP, which enhances these embeddings with KG entities and relations, improving the explainability of recommendations.
- Experimental results on benchmark datasets demonstrate KTUP's effectiveness in handling N-to-N relations, particularly in sparse recommendation scenarios.
This paper, "Unifying Knowledge Graph Learning and Recommendation: Towards a Better Understanding of User Preferences" (arXiv:1902.06236), addresses the challenge of effectively leveraging Knowledge Graphs (KGs) to improve recommender systems, particularly in the presence of incomplete KGs. The authors propose a novel approach that jointly models item recommendation and KG completion, arguing that these two tasks can mutually enhance each other.
The core idea is to transfer information between the two tasks: structural facts from the KG help enrich user-item interaction modeling, while insights into user preferences derived from interactions can aid in predicting missing facts in the KG. Existing KG-based recommendation methods often treat KGs as static side information or transfer knowledge at a shallow level (e.g., simple embedding sharing), ignoring the dynamic and incomplete nature of KGs and the crucial role of relations in understanding user preferences.
To address this, the paper introduces two models:
- Translation-based User Preference model (TUP): This model focuses on item recommendation and is inspired by translational models in KG representation learning (such as TransE or TransH). Instead of modeling the user-item relationship directly, TUP models user preferences as translational relations between users and items in a latent vector space. The intuition is that a user's preference for an item can be represented as a translation vector p such that u + p ≈ i.
- Preference Induction: TUP introduces a set of latent preference factors P. For a given user-item pair (u,i), the model induces which preferences from P are relevant. Two strategies are proposed:
- Hard Strategy: Selects a single, most prominent preference using a mechanism such as the Straight-Through Gumbel-Softmax.
- Soft Strategy: Combines multiple preferences using an attention mechanism, weighting each preference based on its relevance to the (u,i) pair.
- Hyperplane-based Translation: To handle the "N-to-N" issue (e.g., many users liking the same item for different reasons, or one user liking many items for the same reason), TUP, inspired by TransH, projects user and item embeddings onto a preference-specific hyperplane. The translation is then scored in the projected space: g(u, i; p) = ∥u⊥ + p − i⊥∥, where u⊥ and i⊥ are the projected vectors and the hyperplane's normal vector w_p is associated with the induced preference p.
- TUP is trained with a BPR loss that encourages preferred items to receive lower (distance) scores than non-preferred items for a given user and induced preference; a minimal sketch of the scoring path and loss follows this list.
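As a concrete illustration, here is a minimal PyTorch sketch of the TUP scoring path under the soft strategy. This is not the authors' released implementation: the dot-product relevance score used for preference induction and all module and variable names are simplifying assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TUP(nn.Module):
    """Illustrative translation-based user preference scoring (soft strategy)."""

    def __init__(self, n_users, n_items, n_prefs, dim):
        super().__init__()
        self.user = nn.Embedding(n_users, dim)
        self.item = nn.Embedding(n_items, dim)
        self.pref = nn.Embedding(n_prefs, dim)   # preference translation vectors p
        self.norm = nn.Embedding(n_prefs, dim)   # preference hyperplane normals w_p

    def forward(self, uid, iid):
        u, i = self.user(uid), self.item(iid)                 # (B, d)
        # Soft strategy: attention over latent preferences; scoring each
        # preference against u + i is an assumed, simplified relevance measure.
        logits = (u + i) @ self.pref.weight.t()               # (B, P)
        alpha = F.softmax(logits, dim=-1)
        # Hard strategy would instead pick a single preference, e.g.:
        # alpha = F.gumbel_softmax(logits, tau=1.0, hard=True)
        pref_vec = alpha @ self.pref.weight                   # induced translation p
        w = F.normalize(alpha @ self.norm.weight, dim=-1)     # unit hyperplane normal
        # Project u and i onto the preference-specific hyperplane (TransH-style).
        u_perp = u - (u * w).sum(-1, keepdim=True) * w
        i_perp = i - (i * w).sum(-1, keepdim=True) * w
        return (u_perp + pref_vec - i_perp).norm(p=2, dim=-1)  # smaller = better

def bpr_loss(model, uid, pos_iid, neg_iid):
    # BPR with distance scores: the interacted item should score lower,
    # i.e. minimize -log sigmoid(g(neg) - g(pos)) = softplus(g(pos) - g(neg)).
    return F.softplus(model(uid, pos_iid) - model(uid, neg_iid)).mean()
```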
- Knowledge-enhanced TUP (KTUP): This model extends TUP by incorporating KG knowledge. KTUP jointly trains the TUP recommendation task and a KG completion task (specifically, using TransH as the KG completion component in this work).
- Knowledge Enhancement: KTUP enhances the user, item, preference, and projection embeddings by combining them with the corresponding KG entity and relation embeddings. For an item i aligned to a KG entity e, the item embedding used in the translation is enhanced as î = i + e. Similarly, based on a predefined mapping between user preferences and KG relations, the preference translation and projection vectors are enhanced as p̂ = p + r and ŵ_p = w_p + w_r. The scoring function g(u, i; p) in KTUP uses these enhanced embeddings.
- Joint Training: KTUP optimizes a combined objective L = λ·L_p + (1 − λ)·L_k, where L_p is the recommendation loss (BPR, based on the enhanced TUP scores) and L_k is the KG completion loss (a margin-based ranking loss for TransH). The hyperparameter λ balances the influence of the two tasks; a sketch follows.
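Building on the TUP sketch above, the enhancement and joint objective might look as follows. For brevity the induced preference id is taken as given (as the hard strategy would produce), and `item2ent` / `pref2rel` are assumed integer lookup tables (LongTensors); none of these names come from the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KTUP(nn.Module):
    """Sketch: TUP recommendation plus TransH completion with shared enhancement."""

    def __init__(self, tup, n_ents, n_rels, dim, item2ent, pref2rel):
        super().__init__()
        self.tup = tup                               # the TUP module sketched above
        self.ent = nn.Embedding(n_ents, dim)
        self.rel = nn.Embedding(n_rels, dim)         # relation translations r
        self.rel_norm = nn.Embedding(n_rels, dim)    # relation hyperplane normals w_r
        self.register_buffer("item2ent", item2ent)   # item id -> aligned entity id
        self.register_buffer("pref2rel", pref2rel)   # preference id -> relation id

    def transh(self, h, rid, t):
        # TransH: project head/tail onto the relation hyperplane, then translate.
        w = F.normalize(self.rel_norm(rid), dim=-1)
        h_p = h - (h * w).sum(-1, keepdim=True) * w
        t_p = t - (t * w).sum(-1, keepdim=True) * w
        return (h_p + self.rel(rid) - t_p).norm(p=2, dim=-1)

    def rec_score(self, uid, iid, pid):
        # Enhanced embeddings: i_hat = i + e, p_hat = p + r, w_hat = w_p + w_r.
        rid = self.pref2rel[pid]
        u = self.tup.user(uid)
        i = self.tup.item(iid) + self.ent(self.item2ent[iid])
        p_hat = self.tup.pref(pid) + self.rel(rid)
        w = F.normalize(self.tup.norm(pid) + self.rel_norm(rid), dim=-1)
        u_perp = u - (u * w).sum(-1, keepdim=True) * w
        i_perp = i - (i * w).sum(-1, keepdim=True) * w
        return (u_perp + p_hat - i_perp).norm(p=2, dim=-1)

def joint_loss(model, rec_batch, kg_batch, lam=0.7, margin=1.0):
    # L = lambda * L_p + (1 - lambda) * L_k
    uid, pid, pos_i, neg_i = rec_batch
    l_p = F.softplus(model.rec_score(uid, pos_i, pid)
                     - model.rec_score(uid, neg_i, pid)).mean()
    h, rid, t, h_c, t_c = kg_batch                   # corrupted head/tail samples
    l_k = F.relu(margin
                 + model.transh(model.ent(h), rid, model.ent(t))
                 - model.transh(model.ent(h_c), rid, model.ent(t_c))).mean()
    return lam * l_p + (1.0 - lam) * l_k
```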
Practical Implementation and Application:
- Data Requirements: Implementing KTUP requires three inputs: user-item interaction data (implicit feedback), a knowledge graph, and item-to-entity alignments linking items in the interaction data to entities in the KG. Preprocessing typically filters low-frequency users, items, and entities, and may merge near-duplicate KG relations. A toy loader for the alignment file is sketched below.
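For concreteness, the alignment file might be read like this; the tab-separated layout is an assumption for illustration, not the released data format.

```python
def load_alignments(path):
    """Read 'item_id<TAB>entity_id' lines into a dict (assumed file layout)."""
    item2ent = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            item_id, ent_id = line.strip().split("\t")
            item2ent[int(item_id)] = int(ent_id)
    return item2ent
```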
- Model Architecture: The implementation involves separate embedding layers for users, items, preferences, entities, and KG relations, plus components for preference induction (softmax with the Gumbel trick, or attention), hyperplane projection logic, and scoring functions for both recommendation and KG completion. The embedding-enhancement step combines the item and preference embeddings with their KG counterparts (cf. the TUP and KTUP sketches above).
- Training Details: The model is trained end-to-end with stochastic gradient descent (e.g., Adam or Adagrad) on the combined loss. Negative sampling is crucial for both the BPR loss (sampling non-interacted items) and the KG completion loss (corrupting positive triplets). Hyperparameters such as the embedding size, learning rates, regularization coefficients, the KG-loss margin, the Gumbel-Softmax temperature (if used), and the balancing weight λ must be tuned on a validation set, and the mapping between preferences and relations must be established up front. A deliberately simplified training loop is sketched below.
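The loop below ties the earlier sketches together using `joint_loss` from above. It is intentionally minimal (batch size 1, tail-only corruption, no filtering of observed items); real training would batch examples, exclude items the user has already interacted with, and tune λ, the margin, and learning rates on validation data.

```python
import random
import torch

def train_epoch(ktup, interactions, triples, n_items, n_ents, opt, lam=0.7):
    """One illustrative epoch of joint updates with naive negative sampling."""
    random.shuffle(interactions)
    for (uid, pid, pos_i), (h, r, t) in zip(interactions, triples):
        neg_i = random.randrange(n_items)   # should exclude the user's seen items
        t_c = random.randrange(n_ents)      # corrupt the tail entity of the triple
        as_batch = lambda *xs: tuple(torch.tensor([x]) for x in xs)
        loss = joint_loss(ktup,
                          as_batch(uid, pid, pos_i, neg_i),
                          as_batch(h, r, t, h, t_c),
                          lam=lam)
        opt.zero_grad()
        loss.backward()
        opt.step()
```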
- Computational Considerations: Training involves learning embeddings for potentially large numbers of users, items, entities, and relations. The dimensionality of embeddings and the size of the datasets directly impact memory and computation time. Hyperplane projections add some computational overhead compared to simpler translational models.
- Explainability: A key practical benefit is explainability. By aligning preferences with KG relations (e.g., 'isDirectorOf', 'starring'), KTUP can give reasons for its recommendations: for instance, a movie can be recommended because the user has shown a strong preference for its director, via the corresponding relation and entity. A sketch of this lookup follows.
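Under the toy models above, producing such an explanation amounts to reading off the strongest induced preference and its mapped KG relation; `relation_names` is an assumed id-to-name lookup, not part of the paper.

```python
import torch

def explain(ktup, uid, iid, relation_names):
    """Map a user-item pair's strongest induced preference to a relation name."""
    with torch.no_grad():
        u = ktup.tup.user(torch.tensor([uid]))
        i = ktup.tup.item(torch.tensor([iid]))
        logits = (u + i) @ ktup.tup.pref.weight.t()  # same relevance score as TUP
        pid = logits.argmax(dim=-1)                  # hard choice of preference
        rid = ktup.pref2rel[pid].item()
    return f"Recommended via your preference for '{relation_names[rid]}'."
```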
- Performance and Trade-offs: Experiments on the MovieLens-1m and DBbook2014 datasets demonstrate KTUP's effectiveness in both recommendation and KG completion against several baselines, including other KG-based methods such as CFKG, CKE, and CoFM. KTUP is particularly strong on complex N-to-N relations in the KG and benefits from joint training, with the training progress of the two tasks strongly correlated. The gains from KG incorporation are more pronounced on sparser recommendation datasets, though improvements persist on denser ones. TUP alone (without the KG) performs well given sufficient data, suggesting that translational preference modeling is effective even without external knowledge.
The paper makes its project code available at https://github.com/TaoMiner/joint-kg-recommender, which is a significant practical contribution for researchers and practitioners looking to implement and experiment with this approach.
Future directions suggested by the authors include exploring more complex user preferences involving multi-hop relations in the KG and leveraging KG reasoning techniques (like rule mining) to address the cold-start problem for users or items not well-represented in the interaction data or KG alignments.