Temporal Knowledge Graph Completion: A Survey (2201.08236v1)

Published 16 Jan 2022 in cs.AI and cs.LG

Abstract: Knowledge graph completion (KGC) can predict missing links and is crucial for real-world knowledge graphs, which widely suffer from incompleteness. KGC methods assume a knowledge graph is static, but that may lead to inaccurate prediction results because many facts in the knowledge graphs change over time. Recently, emerging methods have shown improved predictive results by further incorporating the timestamps of facts; namely, temporal knowledge graph completion (TKGC). With this temporal information, TKGC methods can learn the dynamic evolution of the knowledge graph that KGC methods fail to capture. In this paper, for the first time, we summarize the recent advances in TKGC research. First, we detail the background of TKGC, including the problem definition, benchmark datasets, and evaluation metrics. Then, we summarize existing TKGC methods based on how timestamps of facts are used to capture the temporal dynamics. Finally, we conclude the paper and present future research directions of TKGC.

Authors (6)
  1. Borui Cai (5 papers)
  2. Yong Xiang (38 papers)
  3. Longxiang Gao (38 papers)
  4. He Zhang (236 papers)
  5. Yunfeng Li (14 papers)
  6. Jianxin Li (128 papers)
Citations (73)

Summary

Temporal Knowledge Graph Completion (TKGC) addresses the challenge of inferring missing facts in knowledge graphs (KGs) by incorporating the temporal dimension. Unlike traditional Knowledge Graph Completion (KGC) methods that treat KGs as static structures, TKGC recognizes that facts evolve over time. Real-world KGs, such as those used in search engines, recommender systems, or financial analysis, are inherently dynamic, with entities gaining or losing relations at specific times. Predicting when a fact holds true or which entity is involved at a given time requires models that can capture these temporal dynamics. This survey provides a comprehensive overview of existing TKGC methods, categorizing them based on how they integrate timestamp information.

A knowledge graph in the temporal context is represented as a collection of facts, each being a quadruple $\{h, r, t, \tau\}$, where $h$ is the head entity, $r$ is the relation, $t$ is the tail entity, and $\tau$ is the timestamp (either a time point or an interval). The goal of TKGC is to predict missing elements in these quadruples, most commonly the head or tail entity given $\{?, r, t, \tau\}$ or $\{h, r, ?, \tau\}$, but also potentially the relation $\{h, ?, t, \tau\}$ or the timestamp $\{h, r, t, ?\}$.
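As a concrete illustration of this data model, the sketch below represents a quadruple and a tail-entity completion query in Python (the class and field names are illustrative, not from the survey):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemporalFact:
    """A temporal fact (h, r, t, tau); tau may be a time point or an interval."""
    head: str
    relation: str
    tail: str
    timestamp: str  # e.g. "2012-06-01" or "2009-01-20/2017-01-20"

# A complete fact with an interval timestamp.
fact = TemporalFact("Barack Obama", "isPresidentOf", "USA", "2009-01-20/2017-01-20")

# A tail-entity completion query (h, r, ?, tau): the model ranks all candidate tails.
query = ("Barack Obama", "isPresidentOf", None, "2012-06-01")
```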

Training TKGC models typically involves learning low-dimensional embeddings for entities, relations, and potentially timestamps or time-aware transformations. These embeddings are used within a factual score function, $q(s)$, which measures the likelihood or correctness of a fact $s = \{h, r, t, \tau\}$. Models are trained by minimizing a loss function that encourages higher scores for true facts than for negative samples (corrupted versions of true facts). Common loss functions include Margin Ranking Loss, Cross Entropy Loss, and Binary Cross Entropy Loss, with the latter being favored for neural network-based methods due to its computational convenience.
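The following is a minimal sketch of this training setup, assuming some scoring model `q(h, r, t, tau)` already exists; the negative-sampling strategy and function names below are illustrative and not tied to any specific method in the survey:

```python
import torch
import torch.nn.functional as F

# Assumed: q(heads, rels, tails, times) -> tensor of scores, where higher
# scores mean "more plausible fact". Facts are LongTensors of shape (B, 4)
# holding (head, relation, tail, timestamp) indices.

def corrupt_tails(facts, n_entities):
    # Simple negative sampling: replace the tail entity with a random one.
    neg = facts.clone()
    neg[:, 2] = torch.randint(n_entities, (facts.size(0),))
    return neg

def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    # True facts should outscore their corrupted counterparts by at least `margin`.
    return torch.clamp(margin - pos_scores + neg_scores, min=0.0).mean()

def binary_cross_entropy_loss(pos_scores, neg_scores):
    # Treat scores as logits: true facts are labelled 1, negatives 0.
    scores = torch.cat([pos_scores, neg_scores])
    labels = torch.cat([torch.ones_like(pos_scores), torch.zeros_like(neg_scores)])
    return F.binary_cross_entropy_with_logits(scores, labels)
```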

Benchmark datasets for TKGC evaluation include ICEWS (Integrated Crisis Early Warning System), GDELT (Global Database of Events, Language, and Tone), YAGO15K, and WIKIDATA. ICEWS and GDELT provide event data with discrete time points, while YAGO15K and WIKIDATA use time intervals. Practical evaluation often employs time-aware filtering to ensure candidate entities are valid at the given timestamp. Metrics such as Hits@k, Mean Rank (MR), and Mean Reciprocal Rank (MRR) measure prediction accuracy based on the rank of the true answer among candidate entities. A key challenge is evaluating performance on unseen timestamps, including predicting future events or imputing missing times for existing facts.
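A minimal sketch of how these ranking metrics are typically computed with time-aware filtering (function names are illustrative; the exact filtering protocol varies between papers):

```python
import numpy as np

def filtered_rank(scores, true_idx, other_true_idx):
    """Rank of the true answer among all candidate entities (1 = best).
    `other_true_idx` lists candidates (excluding the true answer) that also
    form a valid fact at the query's timestamp; time-aware filtering ignores
    them so they do not unfairly lower the rank."""
    scores = scores.astype(float).copy()
    scores[other_true_idx] = -np.inf
    return 1 + int((scores > scores[true_idx]).sum())

def ranking_metrics(ranks, ks=(1, 3, 10)):
    ranks = np.asarray(ranks, dtype=float)
    metrics = {"MR": ranks.mean(), "MRR": (1.0 / ranks).mean()}
    for k in ks:
        metrics[f"Hits@{k}"] = (ranks <= k).mean()
    return metrics
```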

Existing TKGC methods can be broadly categorized by their approach to timestamp integration:

  1. Timestamp-included Tensor Decomposition: These methods view the temporal knowledge graph as a 4-way tensor (head, relation, tail, time). Standard tensor decomposition techniques like Canonical Polyadic (CP) decomposition or Tucker decomposition are extended to handle this fourth dimension.
    • Implementation: Entity, relation, and timestamp embeddings are learned as factor matrices. The factual score is typically the dot product of the corresponding embeddings (for CP) or involves a core tensor (for Tucker).
    • Practical Considerations: These methods are generally lightweight and easy to train. Extensions like using complex-valued or multivector embeddings can increase expressiveness, particularly for capturing asymmetric relations or complex temporal interactions. Temporal smoothness penalties can be added to encourage embeddings of adjacent timestamps to be similar, reflecting the gradual evolution of facts (a minimal CP-style scoring sketch with such a penalty appears after this list).
  2. Timestamp-based Transformation: These methods learn static entity and relation embeddings and use timestamps to transform these static representations into time-dependent ones.
    • Implementation:
      • Synthetic Time-dependent Relation: Timestamps are concatenated with relations to create new, time-specific relations (e.g., isPresidentOf:2020). This allows applying existing static KGC models. The timestamp-relation combination can be learned via simple fusion functions (like summation) or more complex sequence models (like LSTMs or attention mechanisms) to capture temporal patterns and potentially adapt to different time granularities.
      • Linear Transformation: Timestamps are modeled as transformations (e.g., projection onto hyperplanes or complex-space rotations) applied to static entity/relation embeddings. The factual score is calculated using these transformed embeddings. Models might use sequences of transformations (e.g., learned via GRUs) to capture dynamics or encode timestamps into structured vectors to handle various precisions.
    • Practical Considerations: This approach leverages existing KGC models and adapts them. Using sequence models for synthetic relations can handle variable timestamp formats. Linear transformations offer a potentially more continuous view of temporal change.
  3. Dynamic Embedding: These methods explicitly model the temporal evolution of entity and relation embeddings over time.
    • Implementation:
      • Representations as Functions of Timestamp: Embeddings are defined as mathematical functions of time, potentially decomposing into static, trend, and seasonal components. This allows embeddings to vary continuously with time. Some methods model these dynamics in non-Euclidean spaces like hyperbolic space to better capture hierarchical structures. Diachronic embeddings combine a static part with a time-varying part, often implemented as a neural network taking time as input (a minimal diachronic-embedding sketch appears after this list).
      • Representations as Hidden States of RNN: Recurrent Neural Networks (RNNs) or Gated Recurrent Units (GRUs) are used to model the sequence of entity or relation states over time. The hidden state at time $\tau$ becomes the dynamic embedding at that time, integrating information from past events. Some models combine structural encoders (like GNNs operating on graph snapshots) with temporal encoders (like RNNs) to capture both graph structure and temporal dynamics. Practical challenges like data sparsity and temporal heterogeneity are addressed using techniques like imputation or frequency-based gating.
    • Practical Considerations: Dynamic embedding models are powerful for capturing complex, non-linear temporal evolution. RNN-based methods are well-suited for sequential data but can be computationally expensive for long sequences. Handling sparsity (many entities/relations are inactive at many timesteps) is a key implementation challenge.
  4. Learning from Knowledge Graph Snapshots: This approach treats the temporal KG as a sequence of static graph snapshots, one for each timestamp. The dynamics are learned by modeling the transitions or dependencies between these snapshots.
    • Implementation:
      • Markov Process Models: The state of the KG at time $\tau$ is modeled as depending only on the state at $\tau-1$. This can involve learning transition matrices between snapshots or modeling entities/relations with probabilistic representations (like Gaussian distributions) that evolve over time. Training often involves recursive updates.
      • Autoregressive Models: Fact prediction at time $\tau$ depends on a window of previous snapshots ($\tau-m$ to $\tau-1$). Graph neural networks (GNNs) are often applied to individual snapshots to capture structural information, and recurrent components or attention mechanisms are used to aggregate information across the historical sequence. Continuous-time models (like Neural ODEs) can also be used to model smooth transitions between snapshots.
    • Practical Considerations: This view simplifies the problem into processing a sequence of static graphs. GNNs are effective for capturing structural patterns within snapshots. Autoregressive models can capture longer-term dependencies but require storing and processing historical snapshots. Handling the varying structure and entity/relation presence across snapshots is crucial.
  5. Reasoning with Historical Context: These methods leverage the chronological order of facts to perform explicit reasoning based on historical events related to a query.
    • Implementation:
      • Attention-based Relevance: Attention mechanisms are used to identify and weight relevant historical facts (e.g., those sharing entities with the query) to inform the prediction. This can involve expanding inference subgraphs or propagating attention scores through paths on the graph. Temporal displacement (time difference) features are often incorporated to capture temporal relevance.
      • Heuristic-based Relevance: External domain knowledge or observed patterns (e.g., recurring events, tendencies between entities/relations) are used to define heuristic measures of relevance for historical facts. Historical facts are then aggregated or used in specific modes (like a 'copy' mode for repeating events) based on these heuristics.
    • Practical Considerations: These methods can provide interpretability by highlighting the historical facts used for prediction. Attention mechanisms require careful design to handle sparse temporal data. Heuristics require domain knowledge or pattern analysis specific to the dataset.
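To make the first category concrete, here is a minimal sketch of a timestamp-included CP-style scorer with a temporal smoothness penalty. It uses real-valued embeddings for brevity (published models such as TComplEx work in the complex domain), and all class and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class CP4TKGC(nn.Module):
    """4-way CP decomposition: the score of (h, r, t, tau) is the sum over
    dimensions of the element-wise product of the four factor embeddings."""
    def __init__(self, n_ent, n_rel, n_time, dim=200):
        super().__init__()
        self.head = nn.Embedding(n_ent, dim)
        self.rel = nn.Embedding(n_rel, dim)
        self.tail = nn.Embedding(n_ent, dim)
        self.time = nn.Embedding(n_time, dim)

    def score(self, h, r, t, tau):
        return (self.head(h) * self.rel(r) * self.tail(t) * self.time(tau)).sum(-1)

    def temporal_smoothness(self):
        # Penalize large jumps between adjacent timestamp embeddings,
        # reflecting the gradual evolution of facts.
        diffs = self.time.weight[1:] - self.time.weight[:-1]
        return diffs.pow(2).sum(-1).mean()
```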
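Similarly, a minimal sketch of the diachronic-embedding idea from the third category: part of each entity vector is static, while the remaining features oscillate as a learned function of the timestamp (a simplified variant in the spirit of DE-SimplE; the dimensions, sine activation, and split ratio are illustrative):

```python
import torch
import torch.nn as nn

class DiachronicEntityEmbedding(nn.Module):
    """Entity embedding with a static part and a time-varying part whose
    features are a learned sinusoidal function of the timestamp."""
    def __init__(self, n_ent, dim=200, temporal_fraction=0.5):
        super().__init__()
        self.t_dim = int(dim * temporal_fraction)        # time-varying features
        self.static = nn.Embedding(n_ent, dim - self.t_dim)
        self.amplitude = nn.Embedding(n_ent, self.t_dim)
        self.frequency = nn.Embedding(n_ent, self.t_dim)
        self.phase = nn.Embedding(n_ent, self.t_dim)

    def forward(self, ent, tau):
        # ent: LongTensor of entity indices; tau: FloatTensor of timestamps
        # (e.g. days since the start of the dataset).
        temporal = self.amplitude(ent) * torch.sin(
            self.frequency(ent) * tau.unsqueeze(-1) + self.phase(ent))
        return torch.cat([self.static(ent), temporal], dim=-1)
```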

Despite recent progress, applying TKGC to real-world scenarios faces several limitations and opens future research directions:

  • Incorporating External Knowledge: Performance, especially on complex datasets like GDELT, is limited by data sparsity and long-tail distributions. Integrating semantic information (textual descriptions, entity types) from external sources or using pre-trained language models (like BERT) could enrich representations and improve prediction accuracy by providing additional context beyond the graph structure and timestamps.
  • Time-aware Negative Sampling: Effective negative sampling is crucial for representation learning but is more challenging in the temporal setting due to the complex interaction between facts and time. Research is needed to develop methods that generate realistic negative samples that respect temporal constraints.
  • Larger-scale Knowledge Graphs: Current methods struggle to scale to real-life KGs with billions of facts. Developing distributed training strategies that efficiently handle temporal data across multiple nodes and exploring parameter reduction techniques like compositional embeddings (representing entities/relations as compositions of shared features) are vital for practical deployment.
  • Evolutionary Knowledge Graphs: Real-world KGs are constantly updated. Training models from scratch for each update is impractical. TKGC needs to be framed as an incremental or continual learning problem. Techniques like experience replay, knowledge distillation, regularization, and progressive neural networks need further exploration to enable models to learn from streaming data without forgetting previously acquired knowledge.

In summary, TKGC is an essential task for working with dynamic knowledge graphs. The field has seen significant advancements by incorporating temporal information through various strategies, from extending tensor models to developing sophisticated dynamic embeddings and reasoning mechanisms. However, scalability, handling incomplete data, and adapting to continuously evolving graphs remain key challenges for widespread real-world adoption.