Yearly Knowledge Embeddings
- Yearly knowledge embeddings are temporal vector representations that encode evolving annual dynamics in entities and relations.
- The approach integrates dynamic tensor methods, continual learning, and incremental updates to effectively model changing knowledge graphs.
- Empirical results on datasets like ICEWS and GDELT demonstrate improved MRR and Hits@10, validating the method's scalability and accuracy.
Yearly knowledge embeddings are vector-based representations of entities and relations in knowledge graphs that are explicitly parameterized at an annual time granularity. The approach is designed to capture the temporal evolution of real-world facts, supporting both the retrieval and prediction of facts relevant to specific years and aggregation across years. The construction of yearly knowledge embeddings draws on methods from dynamic and temporal knowledge graph embedding, episodic and semantic memory modeling, and continual learning under growing or evolving graphs.
1. Formal Representation and Temporal Extension
Episodic knowledge graphs encode facts as quadruples $(s, p, o, t)$, with time $t$ denoting the year of factual validity. These are assembled into a 4-way tensor $\underline{X} \in \{0,1\}^{N_e \times N_r \times N_e \times N_t}$, where $x_{s,p,o,t} = 1$ if the triple $(s, p, o)$ is true at year $t$ and $0$ otherwise (Ma et al., 2018). Embedding models assign each entity and relation a latent vector, and analogously a vector $\mathbf{a}_t$ for each year $t$.
The temporal extension of static embedding models is achieved by appending the yearly embedding $\mathbf{a}_t$ to the arguments of the score function, enabling score functions such as
$$\theta_{s,p,o,t} = f(\mathbf{a}_s, \mathbf{a}_p, \mathbf{a}_o, \mathbf{a}_t),$$
where $f$ generalizes the classic tensor or compositional models to include time. This framework supports DistMult-T, HolE-T, ComplEx-T, RESCAL-T, and Tucker-T, with yearly time embeddings augmenting the original score functions.
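To make the temporal extension concrete, the following minimal NumPy sketch scores a quadruple under a DistMult-T-style model in which the year embedding enters multiplicatively. The table sizes, initialization, and names (`entity_emb`, `year_emb`, etc.) are illustrative assumptions, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_entities, n_relations, n_years = 64, 1000, 50, 10

# Hypothetical (randomly initialized) embedding tables; in practice these
# are learned by minimizing a ranking or cross-entropy loss over quadruples.
entity_emb   = rng.normal(scale=0.1, size=(n_entities, dim))
relation_emb = rng.normal(scale=0.1, size=(n_relations, dim))
year_emb     = rng.normal(scale=0.1, size=(n_years, dim))

def distmult_t_score(s, p, o, t):
    """DistMult-T-style score: multilinear product of subject, relation,
    object, and year embeddings, summed over latent dimensions."""
    return float(np.sum(entity_emb[s] * relation_emb[p]
                        * entity_emb[o] * year_emb[t]))

print(distmult_t_score(s=3, p=7, o=42, t=5))
```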
2. High-Capacity Tensor Methods and the ConT Model
To adequately capture temporal dependencies at yearly resolution, high-capacity tensor models are employed. The ConT model generalizes the Tucker decomposition by introducing a dedicated core tensor $\underline{G}^{(t)}$ for each year, yielding the score function (Ma et al., 2018)
$$\theta_{s,p,o,t} = \sum_{i,j,k} g^{(t)}_{ijk}\, a_{si}\, a_{pj}\, a_{ok},$$
or in vector notation,
$$\theta_{s,p,o,t} = \underline{G}^{(t)} \times_1 \mathbf{a}_s \times_2 \mathbf{a}_p \times_3 \mathbf{a}_o.$$
Parameter complexity is on the order of $N_t r^3 + (N_e + N_r) r$ for latent rank $r$, enabling per-year modeling of complex relational structure. Empirically, such models outperform lower-capacity methods when entity-relation patterns shift annually.
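The toy sketch below instantiates this scoring with a separate core tensor per year; shapes, names, and the random (untrained) parameters are hypothetical, and serve only to make the per-year capacity explicit.

```python
import numpy as np

rng = np.random.default_rng(1)
rank, n_entities, n_relations, n_years = 16, 500, 20, 10

ent = rng.normal(scale=0.1, size=(n_entities, rank))
rel = rng.normal(scale=0.1, size=(n_relations, rank))
# One rank x rank x rank core tensor per year: the N_t * rank^3 term that
# dominates the parameter count of this high-capacity model.
core = rng.normal(scale=0.1, size=(n_years, rank, rank, rank))

def cont_score(s, p, o, t):
    """Tucker-style trilinear form with a year-specific core tensor G^(t)."""
    return float(np.einsum("ijk,i,j,k->", core[t], ent[s], rel[p], ent[o]))

print(cont_score(s=2, p=5, o=9, t=3))
```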
3. Yearly Semantic Embeddings via Temporal Aggregation
The marginalization of episodic knowledge yields static (semantic) yearly embeddings. For models whose score $\theta_{s,p,o,t} = f(\mathbf{a}_s, \mathbf{a}_p, \mathbf{a}_o, \mathbf{a}_t)$ is multilinear in the time embedding, the time-projective embedding is
$$\bar{\mathbf{a}} = \sum_t \mathbf{a}_t, \qquad \theta_{s,p,o} = \sum_t \theta_{s,p,o,t} = f(\mathbf{a}_s, \mathbf{a}_p, \mathbf{a}_o, \bar{\mathbf{a}}),$$
aggregating over yearly contributions to recover a temporally agnostic score (Ma et al., 2018). For facts with explicit start and end years $t_{\mathrm{start}}$ and $t_{\mathrm{end}}$, the aggregate can be adjusted:
$$\bar{\mathbf{a}} = \sum_{t = t_{\mathrm{start}}}^{t_{\mathrm{end}}} \mathbf{a}_t,$$
ensuring expired events are excluded from current-year memory.
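A minimal sketch of this aggregation, assuming the multilinear (DistMult-T-style) factorization used above: the year factor is replaced by the sum of year embeddings over the fact's validity window. Function and variable names are illustrative.

```python
import numpy as np

def aggregated_year_embedding(year_emb, t_start, t_end):
    """Sum year embeddings over the validity window [t_start, t_end];
    years outside the window (e.g. expired events) contribute nothing."""
    return year_emb[t_start:t_end + 1].sum(axis=0)

def semantic_score(ent, rel, year_emb, s, p, o, t_start, t_end):
    """Time-marginalized (semantic) score under a multilinear factorization:
    the DistMult-T score with the year factor replaced by its aggregate."""
    a_bar = aggregated_year_embedding(year_emb, t_start, t_end)
    return float(np.sum(ent[s] * rel[p] * ent[o] * a_bar))

rng = np.random.default_rng(2)
ent, rel = rng.normal(size=(100, 8)), rng.normal(size=(10, 8))
years = rng.normal(size=(20, 8))
print(semantic_score(ent, rel, years, s=1, p=2, o=3, t_start=4, t_end=9))
```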
4. Incremental and Lifelong Embedding Update Mechanisms
Yearly knowledge graphs grow with new facts, entities, and relations added in each annual snapshot. Methods for maintaining up-to-date yearly embeddings without full retraining fall into two main families:
- Incremental Optimization (Wewer et al., 2021): Following each year's changes (added and removed facts, entities, and relations), analytic ("positive-only") or gradient-based initialization sets embeddings for the new elements. Training then interleaves "general" epochs (over all data) with "change"-focused epochs (over recent additions/removals), with learning rates and epoch counts scheduled to maintain stability; a sketch of this schedule follows the list below. For yearly graphs with up to 10% changes per year, roughly 20 "change" epochs per 180 "general" epochs and a learning-rate ratio of $1:5$ are reported as optimal. This yields at most about a 1% absolute drop in MRR compared to full retraining, at 15–30% of the runtime.
- Lifelong Masked Autoencoding and Transfer (Cui et al., 2022): Each year, new entity/relation embeddings are initialized via neighbor aggregation from prior-year embeddings. A masked autoencoder reconstructs masked entity/relation vectors from their (old+new) context, facilitating both update and transfer. The objective regularizes changes to old embeddings to avoid catastrophic forgetting, weighting by the frequency of appearances in the current snapshot. This approach produces embeddings that integrate new information while stably retaining past knowledge, suitable for decade-scale continuous KG growth.
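The sketch below illustrates the interleaved schedule from the incremental-optimization bullet above. The 180:20 epoch split and the 1:5 learning-rate ratio echo the reported settings, but `model.train_one_epoch` and the concrete rates are hypothetical placeholders, and which of the two rates should be the larger one is a tuning choice.

```python
def incremental_update(model, all_data, change_data,
                       general_epochs=180, change_epochs=20,
                       general_lr=0.001, change_lr=0.005):
    """Interleave 'general' epochs over the full graph with 'change' epochs
    over the current year's additions/removals (here 180:20, i.e. one change
    epoch per nine general epochs), using distinct learning rates at roughly
    a 1:5 ratio. `model.train_one_epoch(data, lr)` is a hypothetical hook
    for one optimization pass; the defaults are only suggestive."""
    total = general_epochs + change_epochs
    period = max(1, total // change_epochs)  # every `period`-th epoch is a change epoch
    for epoch in range(total):
        if change_epochs and epoch % period == period - 1:
            model.train_one_epoch(change_data, lr=change_lr)
        else:
            model.train_one_epoch(all_data, lr=general_lr)
    return model
```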
5. Adaptive Dimension and Capacity in Yearly Evolution
Graph growth can necessitate an adaptive embedding dimension. SAGE (Scale-Aware Gradual Evolution) proposes a principled mechanism for yearly adjustment (Li et al., 15 Aug 2025). The embedding dimension at year $t$ is set via a scale-to-parameter law relating the size of the yearly snapshot to the required embedding capacity.
Given confidence intervals around the dimension predicted by this law, a growth rule adapts the dimension year over year: capacity is expanded only when the current dimension falls below the predicted range, and retained otherwise.
This ensures sufficient capacity for increasingly complex yearly graphs while preventing overexpansion. Each expansion is accompanied by dynamic distillation: highly reliable historical embeddings are preserved, while more novel or changed entities/relations are optimized for current-year data. Stage-wise optimization minimizes margin loss over hard samples and a distillation loss penalized by footprint ratios, maintaining both knowledge acquisition and retention.
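A sketch of a grow-only dimension rule of the kind described, assuming the scale-to-parameter law predicts a target dimension with a confidence band; the functional form, coefficients, and margin here are hypothetical stand-ins for the regression-derived law and intervals used by SAGE.

```python
import math

def predicted_dimension(num_facts, alpha=0.3, beta=4.0):
    """Illustrative scale-to-parameter law: the required dimension grows
    sublinearly with the number of facts in the yearly snapshot.
    alpha and beta stand in for regression-fitted coefficients."""
    return beta * num_facts ** alpha

def adapt_dimension(d_current, num_facts, rel_margin=0.1):
    """Grow-only adaptation: expand the dimension when it falls below the
    lower confidence bound of the prediction; never shrink, never jump past
    the band. The +/- rel_margin band is a stand-in for the confidence
    interval described above."""
    d_pred = predicted_dimension(num_facts)
    lower = (1 - rel_margin) * d_pred
    if d_current < lower:
        return int(math.ceil(lower))  # grow up to the lower bound of the band
    return d_current

# Example: a snapshot of 2.5M facts with a current dimension of 32.
print(adapt_dimension(32, 2_500_000))
```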
6. Empirical Performance and Practical Guidelines
Empirical evaluation on public datasets—ICEWS (72 yearly steps, 320K quadruples) and GDELT (365 steps, 2.56M quadruples)—demonstrates the superiority of high-capacity, temporally aware embeddings for annual predictions. The table below summarizes representative results (Ma et al., 2018; Li et al., 15 Aug 2025):
| Model | Dataset | Entity MRR | Timestamp MRR | H@10 |
|---|---|---|---|---|
| DistMult-T | ICEWS | 0.222 | — | — |
| HolE-T | ICEWS | 0.229 | — | — |
| ComplEx-T | ICEWS | 0.229 | 0.354 | — |
| Tucker-T | ICEWS | 0.257 | 0.923 | — |
| ConT | ICEWS | 0.264 | 0.982 | — |
| SAGE | ENTITY | 0.280 | — | 0.477 |
SAGE achieves improvements of up to 1.4% in MRR and 1.6% in Hits@10 over previous continual KG embedding (CKGE) methods after five yearly updates. Incremental update methods (Wewer et al., 2021) remain within 1% absolute MRR of full retraining at a fraction of the runtime.
Best practices for construction and maintenance:
- Choose the embedding dimension per model family: Tucker/RESCAL-style models need smaller ranks (their parameter counts grow polynomially in the rank), while compositional models (DistMult, HolE, ComplEx) tolerate larger dimensions.
- Tune the margin, learning rate, step size, and regularization via annual validation.
- For dynamic capacity: update using regression-derived scaling laws.
- Monitor evaluation metrics (MRR, Hits@k) after each year; early-stop if no improvement.
- Combine yearly negative sampling (e.g., corrupting the subject or object entity while keeping relation and year fixed) with proper weighting for efficient training; see the sampling sketch after this list.
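As a sketch of the last point, the helper below draws negatives for a quadruple by corrupting the subject or object within the same year. The function name and the uniform corruption scheme are assumptions; filtering of accidentally true quadruples and per-sample weighting are omitted for brevity.

```python
import numpy as np

def yearly_negative_samples(quad, n_entities, k=10, rng=None):
    """Generate k negatives for a quadruple (s, p, o, t) by corrupting the
    subject or object while keeping the relation and year fixed."""
    rng = rng or np.random.default_rng()
    s, p, o, t = quad
    negatives = []
    for _ in range(k):
        e = int(rng.integers(n_entities))
        if rng.random() < 0.5:
            negatives.append((e, p, o, t))   # corrupt the subject
        else:
            negatives.append((s, p, e, t))   # corrupt the object
    return negatives

print(yearly_negative_samples((3, 7, 42, 5), n_entities=1000, k=3))
```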
7. Synthesis and Outlook
Yearly knowledge embeddings formalize the representation of evolving real-world knowledge by associating entities and relations with vectors parameterized per year, leveraging temporal extensions of standard embedding models, high-capacity tensor factorizations, continual learning, and adaptive capacity control (Ma et al., 2018, Cui et al., 2022, Wewer et al., 2021, Li et al., 15 Aug 2025). The resulting frameworks enable effective link prediction, knowledge retention, and scalable updates across large and continuously growing knowledge graphs. Empirical results on benchmark datasets affirm the necessity of temporal modeling and dynamic adaptation for practical knowledge graph maintenance. A plausible implication is that as knowledge graphs grow over years or decades, such yearly embedding methods will be central in applications requiring both historical and current-state reasoning, including question answering, event prediction, and longitudinal analysis.