RotatE: Complex-Space Knowledge Graph Embeddings

Updated 24 December 2025
  • RotatE is a knowledge graph embedding model that represents entities as complex vectors and relations as rotations in complex space, effectively modeling symmetry, antisymmetry, inversion, and composition.
  • It employs a self-adversarial negative sampling mechanism to prioritize challenging examples, boosting training efficiency and prediction accuracy.
  • Empirical evaluations on benchmarks like FB15k and WN18 demonstrate that RotatE achieves state-of-the-art performance while maintaining scalability.

RotatE is a knowledge graph embedding model that represents entities and relations in the complex vector space $\mathbb{C}^k$, where relations are modeled as rotations and entity–relation interactions are computed via element-wise (Hadamard) multiplication. RotatE is designed to capture key relational patterns in knowledge graphs, such as symmetry, antisymmetry, inversion, and composition, within a unified algebraic and geometric framework. The model incorporates a novel self-adversarial negative sampling mechanism to improve training efficiency and representation quality, and demonstrates state-of-the-art performance on standard link prediction benchmarks (Sun et al., 2019).

1. Mathematical Foundation

Let $\mathbf{e}_h, \mathbf{e}_t \in \mathbb{C}^k$ denote the embeddings of the head and tail entities, and $\mathbf{r} \in \mathbb{C}^k$ the embedding of a relation, with each coordinate constrained to unit modulus, $|r_i| = 1$. Each $r_i$ is thereby parameterized as $r_i = e^{i\theta_{r,i}}$, making $\mathbf{r}$ a vector of unit complex numbers (rotations). A true triple $(h, r, t)$ is modeled by enforcing $\mathbf{e}_h \circ \mathbf{r} \approx \mathbf{e}_t$, where $\circ$ denotes the Hadamard (element-wise) product.

The scoring function for a candidate triple is defined as the negative $L_1$ distance over complex coordinates:

$$f_r(h,t) = -\left\| \mathbf{e}_h \circ \mathbf{r} - \mathbf{e}_t \right\|_1$$

This score measures how well the rotated head embedding matches the tail embedding, providing the basis for link prediction via ranking.
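
As a concrete illustration, the following NumPy sketch implements this score. The function and variable names (`rotate_score`, `theta_r`) are ours, not from the authors' code; the example constructs a true tail by rotating the head exactly, which should receive the maximal score of 0:

```python
# Minimal sketch of the RotatE score f_r(h, t) = -||e_h ∘ r - e_t||_1.
import numpy as np

def rotate_score(e_h, theta_r, e_t):
    """Score a triple from complex entity embeddings and relation phases."""
    r = np.exp(1j * theta_r)           # unit-modulus relation: r_i = e^{i*theta_{r,i}}
    residual = e_h * r - e_t           # Hadamard rotation of the head, minus the tail
    return -np.sum(np.abs(residual))   # negative L1 norm over complex coordinates

k = 4                                  # toy embedding dimension
rng = np.random.default_rng(0)
e_h = rng.normal(size=k) + 1j * rng.normal(size=k)
theta_r = rng.uniform(0, 2 * np.pi, size=k)
e_t = e_h * np.exp(1j * theta_r)       # a "true" tail: exact rotation of the head
print(rotate_score(e_h, theta_r, e_t)) # ~0.0, the best possible score
```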

2. Modeling Relation Patterns

RotatE's geometric and algebraic structure enables explicit modeling of multiple relation patterns:

  • Symmetry/Antisymmetry: A relation $r$ is symmetric when $\mathbf{r} \circ \mathbf{r} = \mathbf{1}$, i.e., $r_i = \pm 1$ for all $i$; symmetric relations thus correspond to phases $\theta_{r,i} \in \{0, \pi\}$. Conversely, any $\mathbf{r}$ with $\mathbf{r} \circ \mathbf{r} \ne \mathbf{1}$ models an antisymmetric relation. Lemma 1 formalizes that RotatE can represent both patterns.
  • Inversion: If $r_2$ is the inverse of $r_1$, setting $\mathbf{r}_2 = \overline{\mathbf{r}_1} = \mathbf{r}_1^{-1}$ (the complex conjugate) captures the inverse relation. Lemma 2 shows that RotatE represents inversion via conjugate embeddings.
  • Composition: The element-wise product models composition: if $r_3$ is the composition of $r_1$ followed by $r_2$, then $\mathbf{r}_3 = \mathbf{r}_1 \circ \mathbf{r}_2$, implying $\theta_{3,i} = \theta_{1,i} + \theta_{2,i} \pmod{2\pi}$. Lemma 3 formalizes composition via the Hadamard product.

These relational pattern properties are direct consequences of RotatE's embedding construction and underlying complex arithmetic.
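
The three lemmas can be checked numerically. The sketch below uses toy phase vectors (values chosen purely for illustration) to confirm that phases in $\{0, \pi\}$ square to the identity, that conjugation inverts a rotation, and that composing rotations adds their phases:

```python
# Numerical checks of the symmetry, inversion, and composition patterns.
import numpy as np

k = 3
# Symmetry: phases in {0, pi} give r_i = ±1, so applying r twice is the identity.
r_sym = np.exp(1j * np.array([0.0, np.pi, np.pi]))
assert np.allclose(r_sym * r_sym, np.ones(k))

# Inversion: the conjugate relation undoes the rotation (r2 = conj(r1) = r1^{-1}).
theta_1 = np.array([0.3, 1.2, 2.5])
r1 = np.exp(1j * theta_1)
assert np.allclose(r1 * np.conj(r1), np.ones(k))

# Composition: r3 = r1 ∘ r2 is the rotation whose phases are theta_1 + theta_2.
theta_2 = np.array([1.0, 2.0, 3.0])
r3 = r1 * np.exp(1j * theta_2)
assert np.allclose(r3, np.exp(1j * (theta_1 + theta_2)))
print("symmetry, inversion, and composition checks passed")
```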

3. Training Procedure and Self-Adversarial Negative Sampling

RotatE is trained to minimize a margin-based negative sampling loss, enhanced with a self-adversarial negative sampler.

The margin-based loss for a positive triple $(h, r, t)$ and a set of $n$ negative samples $\{(h_i', r, t_i')\}$, with distance $d_r(h, t) = \|\mathbf{e}_h \circ \mathbf{r} - \mathbf{e}_t\|_1$, is:

$$L = -\log \sigma\big(\gamma - d_r(h,t)\big) - \sum_{i=1}^{n} p(h_i', r, t_i') \log \sigma\big(d_r(h_i', t_i') - \gamma\big)$$

where $\sigma$ is the sigmoid function, $\gamma$ is a fixed margin, and $p(h', r, t')$ weights the negatives adversarially:

$$p(h_i', r, t_i') = \frac{\exp\big(\alpha f_r(h_i', t_i')\big)}{\sum_{j=1}^{n} \exp\big(\alpha f_r(h_j', t_j')\big)}$$

with $\alpha$ a temperature hyperparameter. Harder negatives (those with higher scores) receive higher weights, focusing learning on challenging examples. Parameters are optimized with Adam, and no additional regularization is required aside from the unit-modulus constraint.
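
Below is a minimal NumPy sketch of this loss, assuming the distances $d_r$ have already been computed for one positive triple and its $n$ negatives. The defaults `margin=6.0` and `alpha=1.0` are illustrative choices rather than prescribed values (the paper tunes both per dataset):

```python
# Sketch of the self-adversarial negative sampling loss over precomputed distances.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_dist, neg_dists, margin=6.0, alpha=1.0):
    """pos_dist: d_r(h, t) for the positive triple (scalar).
    neg_dists: d_r(h'_i, t'_i) for the n negative samples (array)."""
    # Adversarial weights: softmax of alpha * f_r(h', t'), with f_r = -d_r.
    logits = -alpha * neg_dists
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Margin-based log-sigmoid terms for the positive and the weighted negatives.
    pos_term = -np.log(sigmoid(margin - pos_dist))
    neg_term = -np.sum(weights * np.log(sigmoid(neg_dists - margin)))
    return pos_term + neg_term

# Toy usage: one positive at distance 1, three negatives; the hardest
# negative (distance 4) dominates the weighted sum.
print(self_adversarial_loss(1.0, np.array([4.0, 7.0, 9.0])))
```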

4. Empirical Performance and Evaluation

RotatE's performance is evaluated on link prediction benchmarks (FB15k, WN18, FB15k-237, WN18RR) and composition-pattern tasks (Countries S1–S3), with metrics including filtered Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hits@K. Results under the filtered protocol are summarized below:

| Dataset   | MR   | MRR   | Hits@1 | Hits@10 |
|-----------|------|-------|--------|---------|
| FB15k     | 40   | 0.797 | 0.746  | 0.884   |
| WN18      | 309  | 0.949 | 0.944  | 0.959   |
| FB15k-237 | 177  | 0.338 | 0.241  | 0.533   |
| WN18RR    | 3340 | 0.476 | 0.428  | 0.571   |

On FB15k-237 and WN18RR, RotatE surpasses ConvE and ComplEx in MRR and Hits@10. In composition-based probing (Countries S1–S3), RotatE achieves AUC-PR: 1.00 (S1), 1.00 (S2), 0.95 (S3), outperforming DistMult and ComplEx on longer composition chains.

Ablation studies reveal that removing self-adversarial negative sampling reduces MRR by approximately 3–4 points. The "pRotatE" baseline, which fixes the modulus of entity embeddings, establishes that variable moduli are important for capturing compositionality.

5. Computational Complexity and Scalability

Per triple, RotatE requires $O(k)$ operations (an element-wise multiplication and subtraction). For a batch of size $B$ with $n$ negatives, a training step incurs $O(Bnk)$ computation. The memory footprint is $O((|\mathcal{E}| + |\mathcal{R}|)k)$, where $|\mathcal{E}|$ and $|\mathcal{R}|$ are the numbers of entities and relations, respectively.

In practice, embedding dimensions of $k = 500$–$1000$ and negative sample counts of $n = 64$–$1024$ enable convergence in tens of epochs on standard hardware using Adam.
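
As a back-of-the-envelope instance of the $O((|\mathcal{E}| + |\mathcal{R}|)k)$ footprint, the snippet below counts parameters for an FB15k-scale graph (14,951 entities and 1,345 relations, the standard benchmark statistics) at an assumed $k = 1000$. Entities store $k$ complex numbers ($2k$ reals), while relations, under the unit-modulus constraint, need only $k$ phases:

```python
# Parameter count for RotatE at FB15k scale (dimension k is an assumed example).
num_entities, num_relations = 14_951, 1_345   # FB15k statistics
k = 1000                                      # complex embedding dimension

entity_params = num_entities * 2 * k          # 2k reals per complex entity vector
relation_params = num_relations * k           # k phase angles per relation
total = entity_params + relation_params
print(f"{total / 1e6:.1f}M parameters")       # ~31.2M real parameters
```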

6. Distinguishing Advantages

RotatE's complex-space rotational approach yields several advantages:

  • It captures symmetry without entity collapse: TransE can model a symmetric relation only with $\mathbf{r} = \mathbf{0}$, which forces related entities toward identical embeddings.
  • Composition is handled via angle addition, a pattern that DistMult and ComplEx cannot express.
  • The combination of phase and modulus enables simultaneous modeling of symmetry, inversion, and composition.
  • The self-adversarial negative sampler accelerates learning by prioritizing difficult negative samples.

Through these mechanisms, RotatE provides a unified and algebraically expressive model that subsumes multiple relational phenomena within a single geometric operation, rotation in $\mathbb{C}^k$, delivering consistent performance gains across major knowledge graph benchmarks (Sun et al., 2019).

References

Sun, Z., Deng, Z.-H., Nie, J.-Y., & Tang, J. (2019). RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. In Proceedings of the International Conference on Learning Representations (ICLR).
