Zep: Temporal Knowledge Graph Architecture

Updated 25 August 2025

Zep is a temporal knowledge graph architecture that organizes data into episodic, semantic, and community subgraphs to capture dynamic, time-sensitive relationships.
It utilizes a dual-timestamp model tracking event and ingestion times for precise temporal reasoning, updates, and auditability.
Zep achieves state-of-the-art retrieval performance and efficiency by dynamically integrating unstructured and structured data for adaptive AI memory.

A temporal knowledge graph (TKG) architecture encodes not only the factual structure of real-world knowledge graphs but also the temporal dynamics that underlie their evolution. Zep is a modern instantiation of such an architecture, uniquely designed as a memory layer for AI agents operating in enterprise-critical and dynamic environments. It centers on a temporally aware knowledge graph engine, Graphiti, which systematically integrates and maintains both unstructured conversational data streams and structured business information, while representing the temporal validity and provenance of every fact. Zep achieves state-of-the-art deep memory retrieval and complex temporal reasoning, as required by advanced LLM-based AI agents, through hierarchical subgraph organization, bitemporal modeling, and real-time update mechanisms.

1. Hierarchical Memory Graph Organization

Zep’s core, Graphiti, implements a multi-subgraph hierarchy optimized for real-world, temporally indexed agent memory:

Episode Subgraph: This subgraph records episodic memory, wherein each node represents a raw event or message annotated with the original event timestamp. Episodic data includes high-fidelity inputs such as JSON documents, conversation logs, and transactional snapshots, maintained as the ground truth corpus.
Semantic Entity Subgraph: Entities and facts are extracted from episodes into this subgraph. Each entity is embedded in a high-dimensional vector space (e.g., 1024 dimensions), so that semantic similarity can be computed with cosine similarity. Relations (edges) are not limited to binary links: a fact may be represented as a hyperedge that connects several entities together with contextual features.
Community Subgraph: Entities exhibiting strong connectivity and shared context are clustered inductively into communities. Dynamic label propagation algorithms update these clusters, propagating domain context and maintaining summary information for higher-level retrieval and reasoning.
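The three subgraphs can be pictured as plain record types. The following is a minimal Python sketch under stated assumptions, not Graphiti's actual schema: every class and field name is hypothetical, and the three-dimensional embeddings are toy stand-ins for the high-dimensional vectors described above.

```python
from dataclasses import dataclass, field
from datetime import datetime
import math


@dataclass
class Episode:
    """Episode subgraph node: raw input kept verbatim with its event timestamp."""
    episode_id: str
    content: str
    event_time: datetime


@dataclass
class Entity:
    """Semantic subgraph node extracted from one or more episodes."""
    name: str
    embedding: list[float]
    source_episodes: list[str] = field(default_factory=list)


@dataclass
class Fact:
    """Semantic subgraph edge; `entities` may name more than two nodes (hyperedge)."""
    statement: str
    entities: tuple[str, ...]
    source_episodes: list[str] = field(default_factory=list)


@dataclass
class Community:
    """Community subgraph node: a cluster of entities plus a summary."""
    members: set[str]
    summary: str


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example data: one episode yields two entities and one fact.
episode = Episode("ep1", "Alice joined Acme Corp as an engineer.", datetime(2024, 1, 15))
alice = Entity("Alice", [1.0, 0.0, 1.0], ["ep1"])
acme = Entity("Acme Corp", [1.0, 0.0, 0.0], ["ep1"])
fact = Fact("Alice joined Acme Corp as an engineer",
            ("Alice", "Acme Corp", "engineer"), ["ep1"])
community = Community({"Alice", "Acme Corp"}, "People and organisations around Acme Corp")
similarity = cosine_similarity(alice.embedding, acme.embedding)
```

Each entity and fact keeps a list of source episode ids, so a semantic-level record can always be traced back to the raw input it was extracted from.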

This hierarchical structure aligns with contemporary models of human and computational memory, distinguishing between raw episodic recall and the abstraction of semantic knowledge.

2. Temporal and Bitemporal Modeling

A distinguishing feature of Zep is its explicit representation of both event time (T) and ingestion time (T′) for every node and edge:

Event Time (T) records when a fact or event actually occurred, anchoring the factual timeline.
Ingestion Time (T′) tracks when information was observed or added to Zep’s memory, preserving the full transaction lineage.

This bitemporal approach enables precise reasoning over scenarios involving retroactive data, corrections, and updates, as well as the invalidation or supersession of facts. The knowledge graph engine can contextualize queries and responses with respect to both the original occurrence and the most recent information state. This modeling is critical for enterprise applications where regulatory, legal, and business requirements demand full temporal traceability and auditability of knowledge states.
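A retroactive correction shows why two timelines are needed. The sketch below is a minimal illustration, assuming each fact carries a validity interval on the event timeline (T) and a record interval on the ingestion timeline (T′); the class, field names, and example dates are hypothetical and not taken from Zep.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class BitemporalFact:
    statement: str
    valid_from: datetime                     # T: when the fact became true
    valid_to: Optional[datetime]             # T: when it stopped being true (None = still true)
    recorded_at: datetime                    # T': when the system learned this version
    retracted_at: Optional[datetime] = None  # T': when this version was superseded


def believed(fact: BitemporalFact, event_time: datetime, as_of: datetime) -> bool:
    """Did the system, as of `as_of`, believe `fact` held at `event_time`?"""
    on_record = fact.recorded_at <= as_of and (
        fact.retracted_at is None or as_of < fact.retracted_at)
    in_effect = fact.valid_from <= event_time and (
        fact.valid_to is None or event_time < fact.valid_to)
    return on_record and in_effect


# On 2024-06-10 the system learns that Alice had already left Acme on 2024-03-01.
# The open-ended version is retracted and a corrected version is recorded.
facts = [
    BitemporalFact("Alice works at Acme", datetime(2024, 1, 1), None,
                   datetime(2024, 1, 5), retracted_at=datetime(2024, 6, 10)),
    BitemporalFact("Alice works at Acme", datetime(2024, 1, 1), datetime(2024, 3, 1),
                   datetime(2024, 6, 10)),
]

april = datetime(2024, 4, 1)
# What the system believed in May about April (before the correction arrived).
before_correction = any(believed(f, april, datetime(2024, 5, 1)) for f in facts)
# What the system believes in July about April (after the correction).
after_correction = any(believed(f, april, datetime(2024, 7, 1)) for f in facts)
```

Here `before_correction` is true and `after_correction` is false: the same question about April receives different answers depending on the ingestion-time vantage point, and both answers remain reproducible for audit purposes.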

3. Dynamic Knowledge Integration and Update

Zep is designed to integrate data from ongoing, heterogeneous sources. Ingestion, entity extraction, semantic linking, and community clustering run continuously, and the raw source data is retained throughout:

Unstructured Data Ingestion: Conversational data streams are parsed and timestamped as episodes. No lossy transformation is performed at this layer, preserving the ground truth for subsequent extraction and contextual verification.
Structured Data Integration: Business databases and record systems stream updates into the semantic subgraph, where entity resolution and fact extraction are performed in real-time.
Edge Invalidation and Temporal Logic: As new facts are ingested, the semantic subgraph is updated in one of three ways: a new relation is inserted, the confidence or context of an existing edge is revised, or a prior edge is invalidated because it has been superseded. Temporal logic over the dual timeline determines which facts are valid at any query time.
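Edge invalidation can be sketched with a simple rule. The example below assumes that a relation such as `works_at` is single-valued, so two edges with the same subject and relation but different objects contradict each other; a production system would need a richer contradiction check. The `Edge` type and the `ingest_edge` and `valid_edges` helpers are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Edge:
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None


def ingest_edge(edges: list[Edge], new: Edge) -> None:
    """Add `new`, closing the validity interval of whichever conflicting edge is older."""
    for old in edges:
        conflicts = (old.subject == new.subject
                     and old.relation == new.relation
                     and old.obj != new.obj)
        if not conflicts:
            continue
        if old.valid_at <= new.valid_at:
            # The existing edge is superseded from the moment the new one takes effect.
            if old.invalid_at is None or new.valid_at < old.invalid_at:
                old.invalid_at = new.valid_at
        elif new.invalid_at is None or old.valid_at < new.invalid_at:
            # Out-of-order arrival: the new edge describes an earlier, already-ended state.
            new.invalid_at = old.valid_at
    edges.append(new)


def valid_edges(edges: list[Edge], at: datetime) -> list[Edge]:
    """Edges whose validity interval contains `at`."""
    return [e for e in edges
            if e.valid_at <= at and (e.invalid_at is None or at < e.invalid_at)]


edges: list[Edge] = []
ingest_edge(edges, Edge("Alice", "works_at", "Acme", datetime(2023, 1, 1)))
ingest_edge(edges, Edge("Alice", "works_at", "Globex", datetime(2024, 6, 1)))
# A late-arriving record about an earlier employer.
ingest_edge(edges, Edge("Alice", "works_at", "Initech", datetime(2022, 1, 1)))

employers_2022 = [e.obj for e in valid_edges(edges, datetime(2022, 6, 1))]
employers_2024 = [e.obj for e in valid_edges(edges, datetime(2024, 1, 1))]
employers_now = [e.obj for e in valid_edges(edges, datetime(2024, 7, 1))]
```

Invalidated edges are closed rather than deleted, so the history stays queryable: the three lookups return Initech, Acme, and Globex respectively.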

Multi-hop retrieval mechanisms, such as breadth-first search (BFS) rooted at semantically relevant entities, traverse neighborhoods and clusters while respecting temporal consistency and edge validity.
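A temporally filtered breadth-first search can be sketched as follows. This is an illustrative version only: it treats edges as undirected, assumes the seed entities have already been chosen by a semantic search, and uses hypothetical names (`TimedEdge`, `temporal_bfs`) and toy data.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class TimedEdge:
    source: str
    target: str
    fact: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None

    def holds_at(self, when: datetime) -> bool:
        return self.valid_at <= when and (self.invalid_at is None or when < self.invalid_at)


def temporal_bfs(edges: list[TimedEdge], seeds: list[str],
                 when: datetime, max_hops: int = 2) -> list[TimedEdge]:
    """Collect edges within `max_hops` of the seeds, using only edges valid at `when`."""
    adjacency: dict[str, list[tuple[str, TimedEdge]]] = {}
    for edge in edges:
        if edge.holds_at(when):
            adjacency.setdefault(edge.source, []).append((edge.target, edge))
            adjacency.setdefault(edge.target, []).append((edge.source, edge))

    visited = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    seen_edges: set[TimedEdge] = set()
    found: list[TimedEdge] = []
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour, edge in adjacency.get(node, []):
            if edge not in seen_edges:
                seen_edges.add(edge)
                found.append(edge)
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, depth + 1))
    return found


graph = [
    TimedEdge("Alice", "Acme", "Alice works at Acme",
              datetime(2023, 1, 1), datetime(2024, 6, 1)),
    TimedEdge("Alice", "Globex", "Alice works at Globex", datetime(2024, 6, 1)),
    TimedEdge("Acme", "Berlin", "Acme is located in Berlin", datetime(2020, 1, 1)),
    TimedEdge("Globex", "Paris", "Globex is located in Paris", datetime(2021, 1, 1)),
]

facts_early_2024 = [e.fact for e in temporal_bfs(graph, ["Alice"], datetime(2024, 1, 1))]
facts_mid_2024 = [e.fact for e in temporal_bfs(graph, ["Alice"], datetime(2024, 7, 1))]
```

Filtering edges before traversal means an invalidated edge never acts as a bridge: the January query reaches Berlin through Acme, while the July query reaches Paris through Globex.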

4. Performance and Benchmarking

Zep reports leading results on both the Deep Memory Retrieval (DMR) and LongMemEval benchmarks:

Benchmark (model)	Metric	Zep	MemGPT
DMR (gpt-4-turbo)	Accuracy (%)	94.8	93.4
DMR (gpt-4o-mini)	Accuracy (%)	98.2	–
LongMemEval	Accuracy gain (max, %)	+18.5	–
LongMemEval	Latency reduction (%)	90	Baseline

Substantial improvements are observed in cross-session reasoning, long-term context maintenance, and retrieval tasks requiring multi-hop, temporally conditioned inference: in particular, accuracy gains of up to 18.5% on LongMemEval and a reduction in response latency of up to 90%.

The retrieval pipeline executes as:

$f(\alpha) = \chi(\rho(\phi(\alpha))) = \beta$

where $\alpha$ is the query text, $\phi(\alpha)$ identifies candidate nodes/edges, $\rho(\cdot)$ re-ranks them based on semantic and temporal similarity, $\chi(\cdot)$ formats the memory context for the agent LLM, and $\beta$ is the resulting context string.
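The composition can be made concrete with three small functions. The sketch below is illustrative only: token overlap stands in for embedding-based semantic similarity, a simple age-based decay stands in for temporal scoring, and the weighting, output format, and all names are assumptions rather than Zep's actual search, re-ranking, or context-construction logic.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Candidate:
    fact: str
    valid_at: datetime


def tokens(text: str) -> set[str]:
    return set(text.lower().split())


def phi(query: str, store: list[Candidate]) -> list[Candidate]:
    """Search: keep facts that share at least one token with the query."""
    q = tokens(query)
    return [c for c in store if q & tokens(c.fact)]


def rho(query: str, candidates: list[Candidate], now: datetime,
        recency_weight: float = 0.3) -> list[Candidate]:
    """Re-rank: blend token overlap (semantic proxy) with recency (temporal proxy)."""
    q = tokens(query)

    def score(c: Candidate) -> float:
        t = tokens(c.fact)
        semantic = len(q & t) / len(q | t)            # Jaccard overlap
        age_days = max((now - c.valid_at).days, 0)
        temporal = 1.0 / (1.0 + age_days / 365.0)     # newer facts score higher
        return (1.0 - recency_weight) * semantic + recency_weight * temporal

    return sorted(candidates, key=score, reverse=True)


def chi(ranked: list[Candidate], limit: int = 3) -> str:
    """Construct: format the top facts as a context string for the agent LLM."""
    lines = [f"- {c.fact} (valid from {c.valid_at.date().isoformat()})"
             for c in ranked[:limit]]
    return "FACTS:\n" + "\n".join(lines)


def f(query: str, store: list[Candidate], now: datetime) -> str:
    """f(alpha) = chi(rho(phi(alpha))) = beta"""
    return chi(rho(query, phi(query, store), now))


store = [
    Candidate("Alice works at Acme", datetime(2023, 1, 1)),
    Candidate("Alice works at Globex", datetime(2024, 6, 1)),
    Candidate("Bob lives in Paris", datetime(2022, 1, 1)),
]
beta = f("where does Alice work", store, now=datetime(2025, 1, 1))
```

With this toy data both Alice facts match the query equally well on tokens, so the recency term decides the order and the Globex fact is listed first; the unrelated Bob fact is dropped at the search stage.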

5. Comparison to Static and Classical RAG Frameworks

Conventional retrieval-augmented generation (RAG) frameworks typically index static document corpora and support only embedding- or text-based retrieval, devoid of formal temporal modeling or dynamic knowledge updates. Such systems are limited in their ability to:

Reflect new, evolving conversational/business facts
Maintain long-horizon context, especially across multiple user–agent sessions
Reason about the temporality and succession/precedence of facts

Zep’s graph-centric and temporally aware engine surpasses these constraints by:

Supporting edge invalidation and update upon new evidence, while maintaining full provenance
Enabling efficient multi-hop, temporally filtered subgraph retrieval
Dynamically integrating both unstructured and structured data into a single memory architecture

These properties are particularly critical for complex enterprise use cases, such as cross-session synthesis, procedural task support, and compliance auditing.

6. Real-world Applications and Implications

Zep’s architecture is well-suited for a range of enterprise scenarios demanding real-time, contextually rich reasoning:

Cross-session Synthesis: Zep allows AI agents to link and recall user information, queries, and actions across multiple, temporally scattered interaction episodes, a task fundamental to persistent assistant memory and regulatory workflows.
Long-term Context: Through its hierarchical and temporal memory, Zep reduces the need to re-ingest redundant data, which improves efficiency and consistency in responses, and is particularly advantageous in multi-turn or longitudinal tasks.
Auditable Knowledge and Forensic Tracing: Dual-timeline modeling ensures every fact’s lineage is preserved, making Zep suitable for compliance, fraud detection, and explainability requirements in regulated industries.
Rapid Adaptation: By supporting streaming updates and new data modalities with low latency (the benchmark results above report up to 90% lower response latency than the baseline), Zep is deployable in environments where information freshness is mission-critical.

Future implications include the extension of Zep’s graph-centric memory to incorporate advanced entity and relation extraction models, domain-specific ontologies, and fine-grained provenance tracking, anticipating the growing complexity and velocity of knowledge integration in enterprise-grade AI deployments.

7. Outlook and Future Directions

Ongoing advances in temporal knowledge graph research, especially those integrating inductive meta-learning, neural ODE-based dynamic embeddings, and hybrid structural-semantic inference, are highly relevant to the continued evolution of architectures like Zep. Directions include:

Enhanced Temporal Reasoning: Incorporating fine-grained time interval logic, episodic-to-semantic projection, and variable-curvature geometric embeddings to improve representation of complex temporal and hierarchical patterns (Ma et al., 2018).
Dynamic Inference Adaptation: Leveraging approaches from multi-expert hybrid frameworks to differentiate reasoning over historical versus novel events (Deng et al., 17 Jun 2025).
Scalability and Efficiency: Adoption of neural architecture search and polynomial-based continuous time representations for efficient large-scale and arbitrary-time reasoning (Wang et al., 2022; Fang et al., 1 May 2024).

A plausible implication is that, as LLM systems become more deeply intertwined with temporal graph architectures, the operational boundary between episodic memory, semantic synthesis, and decision support will continue to erode. The result would be intelligent agents with robust, adaptive, and fully traceable memory spanning diverse sources, timelines, and reasoning modalities.