RotatE: Complex-Space Knowledge Graph Embeddings
- RotatE is a knowledge graph embedding model that represents entities and relations via complex rotations, effectively modeling symmetry, inversion, and composition.
- It employs a self-adversarial negative sampling mechanism to prioritize challenging examples, boosting training efficiency and prediction accuracy.
- Empirical evaluations on benchmarks like FB15k and WN18 demonstrate that RotatE achieves state-of-the-art performance while maintaining scalability.
RotatE is a knowledge graph embedding model that represents entities and relations in the complex vector space $\mathbb{C}^k$, where relations are modeled as rotations and entity–relation interactions are computed via element-wise (Hadamard) multiplication. RotatE is designed to capture key relational patterns in knowledge graphs (symmetry, antisymmetry, inversion, and composition) within a unified algebraic and geometric framework. The model incorporates a novel self-adversarial negative sampling mechanism that improves training efficiency and representation quality, and achieves state-of-the-art performance on standard link prediction benchmarks (Sun et al., 2019).
1. Mathematical Foundation
Let $\mathbf{h}, \mathbf{t} \in \mathbb{C}^k$ denote embeddings for the head and tail entities, and $\mathbf{r} \in \mathbb{C}^k$ the embedding for a relation, with each coordinate constrained as $|r_i| = 1$. Each $r_i$ is thereby parameterized as $r_i = e^{i\theta_{r,i}}$, establishing $\mathbf{r}$ as a vector of unit complex numbers (rotations). A true triple $(h, r, t)$ is modeled by enforcing $\mathbf{t} = \mathbf{h} \circ \mathbf{r}$, where $\circ$ denotes the Hadamard product.
The scoring function for a candidate triple $(h, r, t)$ is defined as the negative distance over complex coordinates:

$$f_r(\mathbf{h}, \mathbf{t}) = -\,d_r(\mathbf{h}, \mathbf{t}) = -\,\|\mathbf{h} \circ \mathbf{r} - \mathbf{t}\|.$$

This scoring function captures how well the rotated head entity matches the tail, providing a basis for link prediction via ranking.
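To make this concrete, below is a minimal NumPy sketch of the scoring function, assuming the $L_1$ norm over complex coordinates; the function name and array shapes are illustrative choices, not the authors' reference implementation:

```python
import numpy as np

def rotate_score(head, rel_phase, tail):
    """Negative L1 distance between the rotated head and the tail.

    head, tail: complex arrays of shape (batch, k) -- entity embeddings.
    rel_phase:  real array of shape (batch, k)     -- relation phases theta.
    """
    rotation = np.exp(1j * rel_phase)      # unit-modulus relation vector e^{i*theta}
    diff = head * rotation - tail          # Hadamard rotation, then residual
    # L1 norm over complex coordinates: sum of complex moduli per triple
    return -np.abs(diff).sum(axis=-1)

# Toy example: a perfectly matching triple attains the maximum score of 0.
k = 4
theta = np.random.uniform(0, 2 * np.pi, size=(1, k))
h = np.random.randn(1, k) + 1j * np.random.randn(1, k)
t = h * np.exp(1j * theta)                 # construct tail = h ∘ r exactly
print(rotate_score(h, theta, t))           # ~0.0
```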
2. Modeling Relation Patterns
RotatE's geometric and algebraic structure enables explicit modeling of multiple relation patterns:
- Symmetry/Antisymmetry: For a relation $r$, symmetry requires $\mathbf{r} \circ \mathbf{r} = \mathbf{1}$, or equivalently $r_i = \pm 1$ for all $i$; thus, symmetric relations correspond to phases $\theta_{r,i} \in \{0, \pi\}$. If $\mathbf{r} \circ \mathbf{r} \neq \mathbf{1}$, antisymmetry is modeled. Lemma 1 formalizes this: $r$ is symmetric iff $r_i = \pm 1$ for all $i$, and antisymmetric otherwise.
- Inversion: For inverse relations $r_1$ and $r_2$, the complex conjugate ensures the inverse relationship: $\mathbf{r}_2 = \bar{\mathbf{r}}_1$, i.e., $\theta_{r_2,i} = -\theta_{r_1,i}$. Lemma 2 shows RotatE represents inversion by conjugate embeddings.
- Composition: The element-wise product models composition: for $r_3$ as the composition of $r_1$ then $r_2$, $\mathbf{r}_3 = \mathbf{r}_1 \circ \mathbf{r}_2$, implying $\theta_{r_3,i} = \theta_{r_1,i} + \theta_{r_2,i}$. Lemma 3 formalizes composition via the Hadamard product (addition of phases).
These relational pattern properties are direct consequences of RotatE's embedding construction and underlying complex arithmetic.
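These properties can be checked numerically. The following short NumPy sketch (illustrative only; the variable names are not from the paper) verifies each pattern on random phase vectors:

```python
import numpy as np

k = 8
theta1 = np.random.uniform(0, 2 * np.pi, k)
theta2 = np.random.uniform(0, 2 * np.pi, k)
r1, r2 = np.exp(1j * theta1), np.exp(1j * theta2)

# Symmetry: phases in {0, pi} give r ∘ r = 1, so h ∘ r ∘ r = h.
r_sym = np.exp(1j * np.random.choice([0.0, np.pi], k))
assert np.allclose(r_sym * r_sym, 1.0)

# Inversion: the conjugate rotation undoes the original one.
assert np.allclose(r1 * np.conj(r1), 1.0)

# Composition: rotating by r1 then r2 equals one rotation by r1 ∘ r2.
h = np.random.randn(k) + 1j * np.random.randn(k)
assert np.allclose((h * r1) * r2, h * (r1 * r2))
```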
3. Training Procedure and Self-Adversarial Negative Sampling
RotatE is trained to minimize a margin-based negative sampling loss, enhanced with a self-adversarial negative sampler. For a positive triple $(h, r, t)$ and a set of $n$ negative samples $\{(h_i', r, t_i')\}$, with distance $d_r$ as defined above, the loss is:

$$L = -\log \sigma\bigl(\gamma - d_r(\mathbf{h}, \mathbf{t})\bigr) - \sum_{i=1}^{n} p(h_i', r, t_i')\, \log \sigma\bigl(d_r(\mathbf{h}_i', \mathbf{t}_i') - \gamma\bigr),$$

where $\gamma$ is a fixed margin, $\sigma$ is the sigmoid, and $p$ weights the negatives adversarially:

$$p(h_j', r, t_j') = \frac{\exp\bigl(\alpha\, f_r(\mathbf{h}_j', \mathbf{t}_j')\bigr)}{\sum_i \exp\bigl(\alpha\, f_r(\mathbf{h}_i', \mathbf{t}_i')\bigr)},$$

with $\alpha$ as a temperature hyperparameter. Harder negatives (those with higher scores) receive higher weights, focusing learning on challenging examples. Parameters are optimized with Adam, and no additional regularization is required beyond the unit-modulus constraint on relations, which the phase parameterization enforces by construction.
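A compact NumPy sketch of this loss for a single positive triple is shown below; the function name, its default $\gamma$ and $\alpha$ values, and the array shapes are assumptions for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def self_adversarial_loss(pos_dist, neg_dists, gamma=9.0, alpha=1.0):
    """RotatE-style loss for one positive triple and its negatives.

    pos_dist:  scalar d_r(h, t) for the positive triple.
    neg_dists: array of shape (n,) with d_r for the n corrupted triples.
    gamma:     margin; alpha: self-adversarial temperature.
    """
    # Softmax over negative scores f_r = -d_r; in the paper these weights
    # are treated as constants, so no gradient flows through them.
    logits = alpha * (-neg_dists)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()

    pos_term = -np.log(sigmoid(gamma - pos_dist))
    neg_term = -(weights * np.log(sigmoid(neg_dists - gamma))).sum()
    return pos_term + neg_term

# Example: the hardest negative (smallest distance) dominates the loss.
print(self_adversarial_loss(1.0, np.array([2.0, 8.0, 15.0])))
```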
4. Empirical Performance and Evaluation
RotatE's performance is evaluated on link prediction benchmarks (FB15k, WN18, FB15k-237, WN18RR) and composition-pattern tasks (Countries S1–S3), with metrics including Mean Rank (MR), Mean Reciprocal Rank (MRR), and Hits@K under the filtered protocol. Results are summarized below:
| Dataset | MR | MRR | Hits@1 | Hits@10 |
|---|---|---|---|---|
| FB15k | 40 | 0.797 | 0.746 | 0.884 |
| WN18 | 309 | 0.949 | 0.944 | 0.959 |
| FB15k-237 | 177 | 0.338 | – | 0.533 |
| WN18RR | 3340 | 0.476 | – | 0.571 |
On FB15k-237 and WN18RR, RotatE surpasses ConvE and ComplEx in MRR and Hits@10. In composition-based probing (Countries S1–S3), RotatE achieves AUC-PR: 1.00 (S1), 1.00 (S2), 0.95 (S3), outperforming DistMult and ComplEx on longer composition chains.
Ablation studies reveal that removing self-adversarial negative sampling reduces MRR by approximately 3–4 points. The "pRotatE" variant, which constrains all entity embeddings to a fixed modulus so that only phases carry information, underperforms full RotatE, indicating that variable entity moduli matter in practice for capturing compositional structure.
5. Computational Complexity and Scalability
Per triple, RotatE requires $O(k)$ operations (element-wise multiplication and subtraction over $k$ complex coordinates). For a batch of size $b$ with $n$ negatives per positive, a training step incurs $O(bnk)$ computation. The memory footprint is $O\bigl((n_e + n_r)\,k\bigr)$, where $n_e$ and $n_r$ are the counts of entities and relations, respectively.
In practice, embedding dimensions up to $k = 1000$ and negative-sample counts up to $n = 1024$ enable convergence in tens of epochs on standard hardware using Adam.
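For a rough sense of scale, the back-of-the-envelope estimate below uses FB15k's approximate sizes (about 15k entities and 1.3k relations) with $k = 1000$; storing each entity coordinate as one complex number and each relation coordinate as one phase is an assumption of this sketch:

```python
# Illustrative memory estimate (assumed layout, not from the paper):
# FB15k-scale graph with ~15k entities, ~1.3k relations, k = 1000.
n_entities, n_relations, k = 14_951, 1_345, 1000
# Each entity coordinate is a complex number (two float32 values);
# each relation coordinate is a single phase (one float32 value).
entity_bytes = n_entities * k * 2 * 4
relation_bytes = n_relations * k * 1 * 4
print(f"~{(entity_bytes + relation_bytes) / 1e6:.0f} MB of parameters")
```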
6. Distinguishing Advantages
RotatE's complex-space rotational approach yields several advantages:
- It captures symmetric relations without entity collapse: TransE must set $\mathbf{r} = \mathbf{0}$ for a symmetric relation, forcing head and tail embeddings together, whereas RotatE uses phases in $\{0, \pi\}$.
- Composition is handled via phase addition, a pattern that neither DistMult nor ComplEx can model.
- The combination of unit-modulus relation phases and unconstrained entity embeddings enables simultaneous modeling of symmetry, antisymmetry, inversion, and composition.
- The self-adversarial negative sampler accelerates learning by prioritizing difficult negative samples.
Through these mechanisms, RotatE provides a unified and algebraically expressive model that subsumes multiple relational phenomena within a single geometric operation, rotation in $\mathbb{C}^k$, delivering consistent performance gains across major knowledge graph benchmarks (Sun et al., 2019).