
Breaking Rank Bottlenecks in Knowledge Graph Embeddings

Published 27 Jun 2025 in cs.AI and cs.LG | (2506.22271v2)

Abstract: Many knowledge graph embedding (KGE) models for link prediction use powerful encoders. However, they often rely on a simple hidden vector-matrix multiplication to score subject-relation queries against candidate object entities. When the number of entities is larger than the model's embedding dimension, which is often the case in practice by several orders of magnitude, we have a linear output layer with a rank bottleneck. Such bottlenecked layers limit model expressivity. We investigate both theoretically and empirically how rank bottlenecks affect KGEs. We find that, by limiting the set of feasible predictions, rank bottlenecks hurt the ranking accuracy and distribution fidelity of scores. Inspired by the language modelling literature, we propose KGE-MoS, a mixture-based output layer to break rank bottlenecks in many KGEs. Our experiments show that KGE-MoS improves ranking performance of KGE models on large-scale datasets at a low parameter cost.

Summary

  • The paper proposes a novel mixture-based output layer that mitigates rank bottlenecks in knowledge graph embedding models for improved link prediction.
  • It employs a Mixture of Softmaxes formulation to enhance model expressivity with minimal increase in parameter costs.
  • Empirical evaluations on datasets like ogbl-biokg show significant gains in ranking accuracy and distribution fidelity over traditional methods.

Overview

The paper introduces methods to overcome rank bottlenecks in knowledge graph embedding models for link prediction tasks. These bottlenecks arise when the number of entities exceeds the embedding dimension, limiting model expressivity. The authors propose a mixture-based output layer inspired by language modeling to alleviate these constraints, showing enhanced performance on large-scale datasets with minimal increase in parameter costs.

Knowledge Graph Completion Tasks

Knowledge Graph Completion (KGC) involves predicting missing triples in a knowledge graph (KG). For this, Knowledge Graph Embeddings (KGE) score entities given a subject-relation pair through vector-matrix multiplications, leading to potential rank bottlenecks. These issues impact three tasks:

  • Ranking Reconstruction (RR): Requires true triples to score higher than negative (corrupted) ones, typically trained with a margin-based loss.
  • Sign Reconstruction (SR): Treats triples as a binary classification problem, requiring positive scores for true triples and negative scores for false ones.
  • Distributional Reconstruction (DR): Requires models to assign specified scores to true and false triples, i.e., to precisely reproduce a target probability distribution over triples.
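The shared scoring pattern behind all three tasks can be sketched as a hidden vector-matrix multiplication. Below is a minimal DistMult-style example (all sizes and names are illustrative, not taken from the paper): a subject-relation query is mapped to a hidden vector, which is then multiplied against the entity embedding matrix to score every candidate object at once.

```python
import torch

# Illustrative sizes: in practice the number of entities exceeds
# the embedding dimension by orders of magnitude.
num_entities, num_relations, dim = 10_000, 50, 128

E = torch.randn(num_entities, dim)   # entity embedding matrix
R = torch.randn(num_relations, dim)  # relation embedding matrix

def score_all_objects(s: int, r: int) -> torch.Tensor:
    """Score every candidate object for a (subject, relation) query."""
    h = E[s] * R[r]   # DistMult-style query representation h_{s,r}, shape (dim,)
    return h @ E.T    # linear output layer: scores for all entities, shape (num_entities,)

scores = score_all_objects(0, 0)
```

Because every score vector is a linear function of the `dim`-dimensional hidden vector `h`, the reachable set of score vectors is confined to a rank-`dim` subspace of the `num_entities`-dimensional output space.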

Rank Bottlenecks in KGEs

Rank bottlenecks arise when the embedding dimension is much smaller than the number of entities: feasible predictions are then confined to a low-rank linear subspace of the output space, which hurts both ranking accuracy and distribution fidelity. This leads to the following limitations:

  • Distributional Fidelity: The adjacency matrices of real KGs typically exhibit higher ranks than the score matrices achievable by bottlenecked KGEs.
  • Tasks RR & SR: Bottlenecks limit expressivity, making accurate ranking and binary classification of entities infeasible for certain configurations.

    Figure 1: KGE models are used in various prediction tasks, each with their own expressivity needs.
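The rank limitation is easy to observe numerically. In the sketch below (sizes are illustrative), the full score matrix over a batch of queries is a product of two thin matrices, so its rank can never exceed the embedding dimension, no matter how many queries or entities there are:

```python
import numpy as np

rng = np.random.default_rng(0)
num_queries, num_entities, dim = 500, 1000, 32

H = rng.normal(size=(num_queries, dim))   # query representations, one row per (s, r)
E = rng.normal(size=(num_entities, dim))  # entity embedding matrix

S = H @ E.T  # full score matrix, shape (num_queries, num_entities)

# rank(S) <= min(num_queries, num_entities, dim) = dim, regardless of
# how large the adjacency matrix being approximated is.
print(np.linalg.matrix_rank(S))  # bounded by dim = 32
```

Real KG adjacency matrices typically have much higher rank than `dim`, so a bottlenecked score matrix cannot reproduce them exactly.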

Overcoming Bottlenecks: Mixture of Softmaxes (MoS)

The proposed solution, Mixture of Softmaxes (MoS), breaks rank bottlenecks by utilizing multiple softmax components. This non-linear mixture can model a more complex set of distributions than a single softmax layer:

  • Formula: $P(O \mid s, r) = \sum_{k=1}^{K} \pi_k(\mathbf{h}_{s,r})\, \mathrm{softmax}\!\left(f_k(\mathbf{h}_{s,r})\, \mathbf{E}^\top\right)$, where $\mathbf{h}_{s,r}$ is the query representation, $f_k$ are component-specific transformations, $\pi_k$ are mixture weights, and $\mathbf{E}$ is the entity embedding matrix.
  • Component-specific Parameters: Extending output capabilities with additional components incurs low parameter costs, facilitating scalability without drastic dimensional increases.
  • Empirical Benefit: Across extensive tests, models equipped with MoS outperform traditional bottlenecked KGEs on large datasets at a modest computational cost.
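The formula above can be sketched as a small PyTorch module. This is an illustrative implementation of a generic Mixture-of-Softmaxes output layer under assumed design choices (linear component transformations, a linear gating network), not the paper's exact KGE-MoS code:

```python
import torch
import torch.nn as nn

class MoSOutput(nn.Module):
    """Mixture-of-Softmaxes output layer (illustrative sketch)."""

    def __init__(self, dim: int, num_entities: int, num_components: int = 4):
        super().__init__()
        self.E = nn.Parameter(torch.randn(num_entities, dim))  # shared entity embeddings
        # Component-specific transformations f_k (assumed linear here).
        self.proj = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_components))
        self.gate = nn.Linear(dim, num_components)  # mixture weights pi_k(h)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, dim) query representations h_{s,r}
        pi = torch.softmax(self.gate(h), dim=-1)          # (batch, K)
        comps = [torch.softmax(f(h) @ self.E.T, dim=-1)   # each (batch, num_entities)
                 for f in self.proj]
        probs = torch.stack(comps, dim=-1)                # (batch, num_entities, K)
        # Convex combination of K rank-dim softmaxes: a non-linear mixture
        # whose log-probability matrix can exceed rank dim.
        return (probs * pi.unsqueeze(1)).sum(dim=-1)      # (batch, num_entities)

layer = MoSOutput(dim=16, num_entities=100, num_components=3)
out = layer(torch.randn(8, 16))
```

The extra parameters are only the `K` projections and the gate, so the cost grows with `K * dim^2` rather than with the number of entities.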

Implications and Future Directions

This approach especially shines when KGEs are challenged with large, dense knowledge graphs, as seen in the empirical results on datasets like ogbl-biokg. Future work could aim to refine the bounds on embedding dimensions and explore the scalability of MoS to even larger KGs. Investigating other forms of output layers across architectures beyond bilinear models and neural networks would also be valuable.

Conclusion

The MoS layer provides a versatile tool for mitigating expressivity limitations due to rank bottlenecks in KGE models. By doing so, it offers practical improvements in modeling large-scale knowledge graphs with higher accuracy and in a computationally feasible manner.
