
KBGAN: Adversarial Sampling in KGE

Updated 21 April 2026
  • KBGAN is an adversarial training framework for knowledge graph embedding that uses a generator and a discriminator to create challenging negative samples.
  • The framework employs a softmax-based generator and a margin-based discriminator to progressively refine negative examples and enhance relational learning.
  • Empirical evaluations on benchmarks like FB15k-237 and WN18 demonstrate significant improvements in Mean Reciprocal Rank and Hits@10.

KBGAN is an adversarial training framework designed to enhance knowledge graph embedding (KGE) models by addressing the limitations of negative sampling in knowledge graph completion tasks. KBGAN pairs two KGE models—a generator and a discriminator—in an adversarial setup inspired by generative adversarial networks (GANs), where the generator proposes hard negative samples for the discriminator, resulting in improved embedding quality and link prediction performance across standard benchmarks (Cai et al., 2017).

1. Motivation and Limitations of Uniform Negative Sampling

Standard knowledge graphs store only positive triples $(h, r, t)$ representing observed facts. KGE methods require negative examples to learn the underlying structure; the conventional approach creates these by corrupting head or tail entities with randomly selected alternatives. These "uniform negatives" are often trivially false triples, such as LocatedIn(NewOrleans, BarackObama), that violate basic type constraints. Training on such "type-mismatched" negatives encourages models to learn type consistency alone rather than more nuanced relational distinctions (e.g., separating the false but type-correct LocatedIn(NewOrleans, Florida) from a correct triple). As a result, margin-based embedding methods, including TransE and TransD, are limited in their capacity to capture rich relational patterns beyond entity type distinctions (Cai et al., 2017).
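As a rough illustration of the conventional baseline, uniform corruption can be sketched as below. The function name `corrupt_uniform`, the toy entity list, and the choice to skip observed facts are assumptions made for this example, not details taken from the paper.

```python
import random

def corrupt_uniform(triple, entities, known_triples, rng=random):
    """Replace the head or the tail with a uniformly drawn entity."""
    h, r, t = triple
    while True:
        e = rng.choice(entities)
        # Corrupt the head or the tail with equal probability.
        candidate = (e, r, t) if rng.random() < 0.5 else (h, r, e)
        # Assumption: reject the positive itself and any other observed fact.
        if candidate not in known_triples:
            return candidate

# Toy data (hypothetical).
entities = ["NewOrleans", "Louisiana", "Florida", "BarackObama"]
known = {("NewOrleans", "LocatedIn", "Louisiana")}
negative = corrupt_uniform(
    ("NewOrleans", "LocatedIn", "Louisiana"), entities, known, random.Random(0)
)
```

Because every entity is equally likely, most draws from a realistic entity set land on type-mismatched replacements, which is the weakness the rest of the section addresses.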

2. Adversarial Sampling Philosophy and Core Components

KBGAN adapts the adversarial philosophy of GANs to discrete structured data in KGE. The framework introduces a generator KGE model, selected for its probabilistic, softmax-based scoring, and a discriminator KGE model, chosen for its margin-based distance scoring.

  • Generator ($G$): Defines a softmax distribution over a candidate set of negatives $Neg(h, r, t)$ for a given positive triple $(h, r, t)$:

$$p_G(h', r, t' \mid h, r, t) = \frac{\exp(f_G(h', r, t'))}{\sum_{(h^*, r, t^*) \in Neg(h, r, t)} \exp(f_G(h^*, r, t^*))}$$

  • Discriminator ($D$): Margin-based model with score $f_D(h, r, t)$ interpreted as a distance, seeking to minimize distance for positives and maximize it for adversarially-generated negatives.

The generator iteratively learns to propose negatives that the discriminator currently fails to distinguish from true triples, leading to a curriculum of increasingly hard negatives. This process sharpens the discriminator's embedding function, yielding improved performance and finer decision boundaries in the learned embedding space.
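The generator's sampling step can be sketched as a softmax over the scores of a small candidate set followed by a single draw. The candidate triples and score values below are made up for illustration; in practice the scores would come from the generator's scoring function.

```python
import numpy as np

def generator_distribution(scores):
    """Softmax over generator scores f_G of the candidate negatives."""
    z = scores - np.max(scores)  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

def sample_negative(candidates, scores, rng):
    """Draw one candidate according to the generator distribution p_G."""
    p = generator_distribution(np.asarray(scores, dtype=float))
    idx = rng.choice(len(candidates), p=p)
    return candidates[idx], p

rng = np.random.default_rng(0)
candidates = [
    ("NewOrleans", "LocatedIn", "Florida"),
    ("NewOrleans", "LocatedIn", "BarackObama"),
    ("NewOrleans", "LocatedIn", "Texas"),
]
scores = [2.0, -1.0, 1.0]  # hypothetical f_G values
negative, probs = sample_negative(candidates, scores, rng)
```

Candidates that the generator scores as more plausible receive more probability mass, so they are proposed to the discriminator more often.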

3. Objective Functions and Optimization

The learning process comprises two principal objectives:

  • Discriminator Margin Loss:

$$L_D = \sum_{(h, r, t) \in \mathcal{T}} \mathbb{E}_{(h', r, t') \sim p_G} \left[\, [f_D(h, r, t) - f_D(h', r, t') + \gamma]_+ \,\right]$$

where $[x]_+ = \max(0, x)$ and $\gamma > 0$ is the margin.

  • Generator Reward Objective:

$$R_G = \sum_{(h, r, t) \in \mathcal{T}} \mathbb{E}_{(h', r, t') \sim p_G} \left[ - f_D(h', r, t') \right]$$

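Because sampling a discrete negative is not differentiable, $R_G$ is optimized by REINFORCE policy gradients:

$$\nabla_G R_G = \sum_{(h, r, t) \in \mathcal{T}} \mathbb{E}_{(h', r, t') \sim p_G} \left[ - f_D(h', r, t') \, \nabla_G \log p_G(h', r, t' \mid h, r, t) \right]$$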

A running average baseline is subtracted from rewards to reduce variance.
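The two objectives can be made concrete with a small numeric sketch. It assumes the generator is a softmax over candidate scores, in which case the derivative of $\log p_k$ with respect to score $j$ is $1[j = k] - p_j$; the function names and the numbers are illustrative only, and the baseline is passed in rather than computed.

```python
import numpy as np

def margin_loss(d_pos, d_neg, gamma):
    """Hinge term [f_D(pos) - f_D(neg) + gamma]_+ with f_D read as a distance."""
    return max(0.0, d_pos - d_neg + gamma)

def reinforce_score_grad(probs, k, reward, baseline):
    """Single-sample REINFORCE estimate of d(expected reward)/d(generator scores).

    For a softmax policy, d log p_k / d score_j = 1[j == k] - p_j.
    """
    grad_log_p = -np.asarray(probs, dtype=float)  # negation makes a new array
    grad_log_p[k] += 1.0
    return (reward - baseline) * grad_log_p

probs = np.array([0.7, 0.1, 0.2])  # generator distribution (made up)
d_pos, d_neg = 1.0, 2.5            # discriminator distances (made up)
loss = margin_loss(d_pos, d_neg, gamma=3.0)
reward = -d_neg                    # reward for the sampled negative
grad = reinforce_score_grad(probs, k=0, reward=reward, baseline=-4.0)
```

Here the sampled negative earns a reward above the baseline, so the gradient raises its score and lowers the scores of the other candidates; the entries sum to zero because the softmax is invariant to a constant shift of all scores.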

4. Training Procedure and Hyperparameterization

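KBGAN training consists of (a) pre-training both generator and discriminator with standard negative sampling, and (b) joint adversarial updates using alternating gradient steps. Following the components and objectives above, each batch-level adversarial update proceeds roughly as follows:

  • For each positive triple $(h, r, t)$ in the batch, draw a candidate set $Neg(h, r, t)$ of corrupted triples.
  • Sample one negative $(h', r, t')$ from the generator distribution $p_G(\cdot \mid h, r, t)$ over that set.
  • Update the discriminator with a gradient step on the margin loss $[f_D(h, r, t) - f_D(h', r, t') + \gamma]_+$.
  • Compute the reward $-f_D(h', r, t')$, subtract the running-average baseline, and update the generator with the REINFORCE gradient.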
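A toy version of one adversarial update is sketched below with a TransE-style discriminator and a DistMult-style generator. Everything specific here is an assumption for illustration: the sizes, learning rate, tail-only corruption, plain SGD in place of Adam, the exponential form of the running baseline, and the omission of pre-training, norm constraints, and regularization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ent, n_rel, dim = 6, 2, 4        # toy sizes (hypothetical)
gamma, lr, n_cand = 1.0, 0.01, 3   # margin, step size, candidates per positive

# Discriminator (TransE-style) and generator (DistMult-style) embeddings.
D_ent, D_rel = rng.normal(size=(n_ent, dim)), rng.normal(size=(n_rel, dim))
G_ent, G_rel = rng.normal(size=(n_ent, dim)), rng.normal(size=(n_rel, dim))

def f_D(h, r, t):
    return np.abs(D_ent[h] + D_rel[r] - D_ent[t]).sum()  # L1 distance

def f_G(h, r, t):
    return np.sum(G_ent[h] * G_rel[r] * G_ent[t])        # trilinear product

def adversarial_step(h, r, t, baseline):
    # 1. Candidate set: uniformly drawn replacement tails (tail-only for brevity).
    others = [e for e in range(n_ent) if e != t]
    cands = rng.choice(others, size=n_cand, replace=False)
    # 2. Generator: softmax over candidate scores, then sample one negative.
    scores = np.array([f_G(h, r, c) for c in cands])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    k = rng.choice(n_cand, p=probs)
    t_neg = cands[k]
    # 3. Discriminator: subgradient descent on the margin loss.
    d_pos, d_neg = f_D(h, r, t), f_D(h, r, t_neg)
    loss = max(0.0, d_pos - d_neg + gamma)
    if loss > 0:
        g_pos = np.sign(D_ent[h] + D_rel[r] - D_ent[t])
        g_neg = np.sign(D_ent[h] + D_rel[r] - D_ent[t_neg])
        D_ent[h] -= lr * (g_pos - g_neg)
        D_rel[r] -= lr * (g_pos - g_neg)
        D_ent[t] += lr * g_pos
        D_ent[t_neg] -= lr * g_neg
    # 4. Generator: REINFORCE ascent with reward -f_D(negative) minus baseline.
    reward = -d_neg
    coef = -probs               # d log p_k / d score_j = 1[j == k] - p_j
    coef[k] += 1.0
    coef *= reward - baseline
    eh, er, ec = G_ent[h].copy(), G_rel[r].copy(), G_ent[cands].copy()
    G_ent[h] += lr * (coef[:, None] * er * ec).sum(axis=0)
    G_rel[r] += lr * (coef[:, None] * eh * ec).sum(axis=0)
    np.add.at(G_ent, cands, lr * coef[:, None] * eh * er)
    # Assumed baseline form: exponential running average of rewards.
    baseline = 0.9 * baseline + 0.1 * reward
    return loss, reward, baseline

triples = [(0, 0, 1), (2, 1, 3), (4, 0, 5)]  # toy positives (hypothetical)
baseline = 0.0
for epoch in range(10):
    for h, r, t in triples:
        loss, reward, baseline = adversarial_step(h, r, t, baseline)
```

The reward is computed from the discriminator's distance before its parameters are updated, so both models react to the same sampled negative within a step.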

Key hyperparameters: batch size (dataset-dependent), the number of negative candidates per positive, the Adam optimizer settings, the margin $\gamma$, the embedding dimension, the distance norm for TransE/TransD, and L2-regularization for DistMult/ComplEx (Cai et al., 2017).

5. Model Variants and Instantiations

KBGAN is model-agnostic, compatible with mature KGE architectures that support the respective softmax and margin scoring functions. The original experiments instantiate the following:

  • Discriminator (margin-based):
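    • TransE: $f_D(h, r, t) = \lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$, unit-norm constraints on embeddings.
    • TransD: $f_D(h, r, t) = \lVert (\mathbf{I} + \mathbf{r}_p \mathbf{h}_p^\top)\mathbf{h} + \mathbf{r} - (\mathbf{I} + \mathbf{r}_p \mathbf{t}_p^\top)\mathbf{t} \rVert$, unit-norm constraints.
  • Generator (softmax-based):
    • DistMult: $f_G(h, r, t) = \langle \mathbf{h}, \mathbf{r}, \mathbf{t} \rangle = \sum_i h_i r_i t_i$ with L2-regularization.
    • ComplEx: $f_G(h, r, t) = \mathrm{Re}\,\langle \mathbf{h}, \mathbf{r}, \bar{\mathbf{t}} \rangle$ with $\mathbf{h}, \mathbf{r}, \mathbf{t}$ in $\mathbb{C}^k$.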

There is no modification to the core adversarial or sampling logic; only the functional form and constraints of $f_G$ and $f_D$ change across variants.
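For reference, the four scoring functions named above can be written in their standard forms as below. This is a sketch: the `ord` parameter (choice of norm) and the assumption that entity and relation embeddings share one dimension in TransD are choices made here, and constraints and regularization are left out.

```python
import numpy as np

def transe(h, r, t, ord=1):
    """Distance ||h + r - t||; lower means more plausible."""
    return np.linalg.norm(h + r - t, ord=ord)

def transd(h, hp, r, rp, t, tp, ord=1):
    """TransE distance after projecting entities with (I + r_p e_p^T)."""
    h_proj = h + rp * np.dot(hp, h)  # (I + r_p h_p^T) h
    t_proj = t + rp * np.dot(tp, t)  # (I + r_p t_p^T) t
    return np.linalg.norm(h_proj + r - t_proj, ord=ord)

def distmult(h, r, t):
    """Trilinear product sum_i h_i r_i t_i; higher means more plausible."""
    return np.sum(h * r * t)

def complex_score(h, r, t):
    """Real part of the trilinear product with the conjugated tail."""
    return np.real(np.sum(h * r * np.conj(t)))

s_transe = transe(np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0]))
s_distmult = distmult(np.array([1.0, 2.0]), np.array([3.0, 4.0]), np.array([5.0, 6.0]))
s_complex = complex_score(np.array([1 + 1j]), np.array([1j]), np.array([1 - 1j]))
```

Swapping one of these functions for another changes only how a triple is scored; the sampling and update logic stays the same.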

6. Empirical Evaluation and Findings

Evaluation is conducted on standard knowledge base completion benchmarks:

  • FB15k-237: 237 relations, 14,541 entities, 272,115 train triples.
  • WN18: 18 relations, 40,943 entities, 141,442 train triples.
  • WN18RR: 11 relations, 40,943 entities, 86,835 train triples (removes inverse shortcuts).

Performance is measured by filtered Mean Reciprocal Rank (MRR) and Hits@10. The following table summarizes core results (filtered setting):

| Discriminator + Generator | Dataset | Pre-trained MRR / H@10 | KBGAN MRR / H@10 | ΔMRR |
|---|---|---|---|---|
| TransE + DistMult | FB15k-237 | 24.2 / 42.2 | 27.4 / 45.0 | +3.2 |
| TransE + ComplEx | FB15k-237 | 24.2 / 42.2 | 27.8 / 45.3 | +3.6 |
| TransD + DistMult | FB15k-237 | 24.5 / 42.7 | 27.8 / 45.8 | +3.3 |
| TransD + ComplEx | FB15k-237 | 24.5 / 42.7 | 27.7 / 45.8 | +3.2 |
| TransE + DistMult | WN18 | 43.3 / 91.5 | 71.0 / 94.9 | +27.7 |
| TransD + DistMult | WN18 | 49.4 / 92.8 | 77.2 / 94.8 | +27.8 |
| TransE + DistMult | WN18RR | 18.6 / 45.9 | 21.3 / 48.1 | +2.7 |
| TransD + ComplEx | WN18RR | 19.2 / 46.5 | 21.5 / 46.9 | +2.3 |

Across the (discriminator, generator) combinations and datasets in the table, KBGAN improves MRR by 2.3 to 27.8 points and Hits@10 by 0.4 to 3.4 points over the pre-trained discriminator. The most pronounced benefit occurs on WN18, where uniform negatives are particularly uninformative due to strong inverse relations; adversarial sampling compels finer discrimination (Cai et al., 2017).
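The filtered metrics used in this evaluation can be computed along the following lines. The sketch assumes distance-style scores (lower is better), as for the discriminator; the function names and the toy numbers are illustrative.

```python
import numpy as np

def filtered_rank(distances, true_idx, other_true_idx):
    """Rank of the true entity (1 = best) after masking other known answers."""
    d = np.array(distances, dtype=float)
    # Filtered setting: other correct entities must not push the target down.
    d[np.asarray(list(other_true_idx), dtype=int)] = np.inf
    return 1 + int(np.sum(d < d[true_idx]))

def mrr_and_hits(ranks, k=10):
    """Mean reciprocal rank and Hits@k as fractions in [0, 1]."""
    ranks = np.asarray(ranks, dtype=float)
    return float(np.mean(1.0 / ranks)), float(np.mean(ranks <= k))

# Entity 1 is the target; entity 2 is another known correct answer.
rank = filtered_rank([0.2, 0.5, 0.1, 0.9], true_idx=1, other_true_idx=[2])
mrr, hits3 = mrr_and_hits([2, 1, 4], k=3)
```

Without filtering, the target in this toy query would rank third; masking the other known answer moves it to second. Values such as those in the table are these fractions multiplied by 100.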

Qualitative analysis indicates that after adversarial training, the generator produces "hard negatives"—semantically related yet incorrect triples (e.g., selecting “bond_NN_6” for “meeting, hypernym, social_gathering”), unlike the random, type-irrelevant negatives of conventional sampling.

7. Conclusion and Implications

KBGAN demonstrates that adversarial sampling can be effectively transposed to KGE, yielding a flexible and architecture-agnostic method for constructing high-quality negatives through a policy-gradient-trained generator. This results in consistent improvement over baseline KGE models for knowledge base completion. The ability to mix and match mature embedding models as generator and discriminator components facilitates broad applicability. Empirical gains, particularly on benchmarks where traditional negatives are uninformative, highlight the critical role of hard negative sampling in KGE algorithm design (Cai et al., 2017).
