RotatE: Complex-Space Knowledge Graph Embeddings
- RotatE is a knowledge graph embedding model that represents entities and relations via complex rotations, effectively modeling symmetry, inversion, and composition.
- It employs a self-adversarial negative sampling mechanism to prioritize challenging examples, boosting training efficiency and prediction accuracy.
- Empirical evaluations on benchmarks like FB15k and WN18 demonstrate that RotatE achieves state-of-the-art performance while maintaining scalability.
RotatE is a knowledge graph embedding model that represents entities and relations in the complex vector space $\mathbb{C}^k$, where relations are modeled as rotations and entity–relation interactions are computed via element-wise (Hadamard) multiplication. RotatE is designed to capture key relational patterns in knowledge graphs (symmetry, antisymmetry, inversion, and composition) within a unified algebraic and geometric framework. The model incorporates a novel self-adversarial negative sampling mechanism that improves training efficiency and representation quality, and achieves state-of-the-art performance on standard link prediction benchmarks (Sun et al., 2019).
1. Mathematical Foundation
Let $\mathbf{h}, \mathbf{t} \in \mathbb{C}^k$ denote embeddings for the head and tail entities, and $\mathbf{r} \in \mathbb{C}^k$ the embedding for a relation, with each coordinate constrained as $|r_i| = 1$. Each $r_i$ is thereby parameterized as $r_i = e^{i\theta_{r,i}}$, establishing $\mathbf{r}$ as a vector of unit complex numbers (rotations). A true triple $(h, r, t)$ is modeled by enforcing $\mathbf{t} = \mathbf{h} \circ \mathbf{r}$, where $\circ$ denotes the Hadamard product.
The scoring function for a candidate triple $(h, r, t)$ is defined as the negative distance over complex coordinates:

$$f_r(\mathbf{h}, \mathbf{t}) = -\,d_r(\mathbf{h}, \mathbf{t}) = -\,\|\mathbf{h} \circ \mathbf{r} - \mathbf{t}\|.$$

This scoring function captures how well the rotated head entity matches the tail, providing a basis for link prediction via ranking.
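To make this concrete, below is a minimal NumPy sketch of the scoring function, assuming the $L_1$ norm over complex coordinates; the function name and array shapes are illustrative choices, not the authors' reference implementation:

```python
import numpy as np

def rotate_score(head, rel_phase, tail):
    """Negative L1 distance between the rotated head and the tail.

    head, tail: complex arrays of shape (batch, k) -- entity embeddings.
    rel_phase:  real array of shape (batch, k)     -- relation phases theta.
    """
    rotation = np.exp(1j * rel_phase)      # unit-modulus relation vector e^{i*theta}
    diff = head * rotation - tail          # Hadamard rotation, then residual
    # L1 norm over complex coordinates: sum of complex moduli per triple
    return -np.abs(diff).sum(axis=-1)

# Toy example: a perfectly matching triple attains the maximum score of 0.
k = 4
theta = np.random.uniform(0, 2 * np.pi, size=(1, k))
h = np.random.randn(1, k) + 1j * np.random.randn(1, k)
t = h * np.exp(1j * theta)                 # construct tail = h ∘ r exactly
print(rotate_score(h, theta, t))           # ~0.0
```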
2. Modeling Relation Patterns
RotatE's geometric and algebraic structure enables explicit modeling of multiple relation patterns:
- Symmetry/Antisymmetry: For a relation $r$, symmetry requires $\mathbf{r} \circ \mathbf{r} = \mathbf{1}$, or equivalently $r_i = \pm 1$ for all $i$; thus, symmetric relations correspond to phases $\theta_{r,i} \in \{0, \pi\}$. If $\mathbf{r} \circ \mathbf{r} \neq \mathbf{1}$, antisymmetry is modeled. Lemma 1 formalizes this: $r$ is symmetric iff $r_i = \pm 1$ for all $i$, and antisymmetric otherwise.
- Inversion: For inverse relations $r_1$ and $r_2$, the complex conjugate ensures the inverse relationship: $\mathbf{r}_2 = \bar{\mathbf{r}}_1$, i.e., $\theta_{r_2,i} = -\theta_{r_1,i}$. Lemma 2 shows RotatE represents inversion by conjugate embeddings.
- Composition: The element-wise product models composition: for $r_3$ as the composition of $r_1$ then $r_2$, $\mathbf{r}_3 = \mathbf{r}_1 \circ \mathbf{r}_2$, implying $\theta_{r_3,i} = \theta_{r_1,i} + \theta_{r_2,i}$. Lemma 3 formalizes composition via the Hadamard product (addition of phases).
These relational pattern properties are direct consequences of RotatE's embedding construction and underlying complex arithmetic.
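These properties can be checked numerically. The following short NumPy sketch (illustrative only; the variable names are not from the paper) verifies each pattern on random phase vectors:

```python
import numpy as np

k = 8
theta1 = np.random.uniform(0, 2 * np.pi, k)
theta2 = np.random.uniform(0, 2 * np.pi, k)
r1, r2 = np.exp(1j * theta1), np.exp(1j * theta2)

# Symmetry: phases in {0, pi} give r ∘ r = 1, so h ∘ r ∘ r = h.
r_sym = np.exp(1j * np.random.choice([0.0, np.pi], k))
assert np.allclose(r_sym * r_sym, 1.0)

# Inversion: the conjugate rotation undoes the original one.
assert np.allclose(r1 * np.conj(r1), 1.0)

# Composition: rotating by r1 then r2 equals one rotation by r1 ∘ r2.
h = np.random.randn(k) + 1j * np.random.randn(k)
assert np.allclose((h * r1) * r2, h * (r1 * r2))
```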
3. Training Procedure and Self-Adversarial Negative Sampling
RotatE is trained to minimize a margin-based negative sampling loss, enhanced with a self-adversarial negative sampler. For a positive triple $(h, r, t)$ and a set of $n$ negative samples $\{(h_i', r, t_i')\}$, with distance $d_r$ as defined above, the loss is:

$$L = -\log \sigma\bigl(\gamma - d_r(\mathbf{h}, \mathbf{t})\bigr) - \sum_{i=1}^{n} p(h_i', r, t_i')\, \log \sigma\bigl(d_r(\mathbf{h}_i', \mathbf{t}_i') - \gamma\bigr),$$

where $\gamma$ is a fixed margin, $\sigma$ is the sigmoid, and $p$ weights the negatives adversarially:

$$p(h_j', r, t_j') = \frac{\exp\bigl(\alpha\, f_r(\mathbf{h}_j', \mathbf{t}_j')\bigr)}{\sum_i \exp\bigl(\alpha\, f_r(\mathbf{h}_i', \mathbf{t}_i')\bigr)},$$

with $\alpha$ as a temperature hyperparameter. Harder negatives (those with higher scores) receive higher weights, focusing learning on challenging examples. Parameters are optimized with Adam, and no additional regularization is required beyond the unit-modulus constraint on relations, which the phase parameterization enforces by construction.
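A compact NumPy sketch of this loss for a single positive triple is shown below; the function name, its default $\gamma$ and $\alpha$ values, and the array shapes are assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_dist, neg_dists, gamma=9.0, alpha=1.0):
    """RotatE-style loss for one positive triple and its negatives.

    pos_dist:  scalar d_r(h, t) for the positive triple.
    neg_dists: array of shape (n,) with d_r for the n corrupted triples.
    gamma:     margin; alpha: self-adversarial temperature.
    """
    # Softmax over negative scores f_r = -d_r; in the paper these weights
    # are treated as constants, so no gradient flows through them.
    logits = alpha * (-neg_dists)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    pos_term = -np.log(sigmoid(gamma - pos_dist))
    neg_term = -(weights * np.log(sigmoid(neg_dists - gamma))).sum()
    return pos_term + neg_term

# Example: the hardest negative (smallest distance) dominates the loss.
print(self_adversarial_loss(1.0, np.array([2.0, 8.0, 15.0])))
```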
4. Empirical Performance and Evaluation
RotatE's performance is evaluated on link prediction benchmarks (FB15k, WN18, FB15k-237, WN18RR) and composition-pattern tasks (Countries S1–S3), with metrics including Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hits@K under the filtered protocol. Results are summarized below:
| Dataset | MR | MRR | Hits@1 | Hits@10 |
|---|---|---|---|---|
| FB15k | 40 | 0.797 | 0.746 | 0.884 |
| WN18 | 309 | 0.949 | 0.944 | 0.959 |
| FB15k-237 | 177 | 0.338 | – | 0.533 |
| WN18RR | 3340 | 0.476 | – | 0.571 |
On FB15k-237 and WN18RR, RotatE surpasses ConvE and ComplEx in MRR and Hits@10. In composition-based probing (Countries S1–S3), RotatE achieves AUC-PR: 1.00 (S1), 1.00 (S2), 0.95 (S3), outperforming DistMult and ComplEx on longer composition chains.
Ablation studies reveal that removing self-adversarial negative sampling reduces MRR by approximately 3–4 points. The "pRotatE" variant, which constrains all entity embeddings to a fixed modulus so that only phases carry information, underperforms full RotatE, indicating that variable entity moduli matter in practice for capturing compositional structure.
5. Computational Complexity and Scalability
Per triple, RotatE requires $O(k)$ operations (element-wise multiplication and subtraction over $k$ complex coordinates). For a batch of size $b$ with $n$ negatives per positive, a training step incurs $O(bnk)$ computation. The memory footprint is $O\bigl((n_e + n_r)\,k\bigr)$, where $n_e$ and $n_r$ are the counts of entities and relations, respectively.
In practice, embedding dimensions up to $k = 1000$ and negative-sample counts up to $n = 1024$ enable convergence in tens of epochs on standard hardware using Adam.
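For a rough sense of scale, the back-of-the-envelope estimate below uses FB15k's approximate sizes (about 15k entities and 1.3k relations) with $k = 1000$; storing each entity coordinate as one complex number and each relation coordinate as one phase is an assumption of this sketch:

```python
# Illustrative memory estimate (assumed layout, not from the paper):
# FB15k-scale graph with ~15k entities, ~1.3k relations, k = 1000.
n_entities, n_relations, k = 14_951, 1_345, 1000
# Each entity coordinate is a complex number (two float32 values);
# each relation coordinate is a single phase (one float32 value).
entity_bytes = n_entities * k * 2 * 4
relation_bytes = n_relations * k * 1 * 4
print(f"~{(entity_bytes + relation_bytes) / 1e6:.0f} MB of parameters")
```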
6. Distinguishing Advantages
RotatE's complex-space rotational approach yields several advantages:
- It captures symmetric relations without entity collapse: TransE must set $\mathbf{r} = \mathbf{0}$ for a symmetric relation, forcing head and tail embeddings together, whereas RotatE uses phases in $\{0, \pi\}$.
- Composition is handled via phase addition, a pattern that neither DistMult nor ComplEx can model.
- The combination of unit-modulus relation phases and unconstrained entity embeddings enables simultaneous modeling of symmetry, antisymmetry, inversion, and composition.
- The self-adversarial negative sampler accelerates learning by prioritizing difficult negative samples.
Through these mechanisms, RotatE provides a unified and algebraically expressive model that subsumes multiple relational phenomena within a single geometric operation, rotation in $\mathbb{C}^k$, delivering consistent performance gains across major knowledge graph benchmarks (Sun et al., 2019).