Tensor Memory Hypothesis Overview

Updated 24 April 2026

Tensor Memory Hypothesis is a unified framework that encodes perception, episodic memory, and semantic memory using high-order tensor decompositions.
It employs Tucker-type factorizations to combine continuous latent embeddings with discrete symbolic indices, facilitating memory and reasoning.
The framework offers biologically motivated and algorithmically effective insights into memory consolidation, long-range dependencies, and semantic decoding.

The Tensor Memory Hypothesis (TMH) is a mathematical, computational, and cognitive framework asserting that high-order (tensor) decompositions over latent representations constitute the core mechanism by which perception, episodic memory, and semantic memory are realized and interconnected. Declarative knowledge, sensory experience, and their generalizations are encoded, queried, and manipulated by multi-way tensor structures connecting distributed (subsymbolic) and symbolic representations across time, concepts, and relations. This paradigm unifies a broad range of neurocognitive, statistical, and dynamical phenomena, including memory formation, consolidation, and reasoning, within a compact factorization-based formalism (Tresp et al., 2017, Tresp et al., 2024, Tresp et al., 2021).

1. Formal Structure and Core Claims

In TMH, each discrete symbol (entity, predicate, or time index) is associated with a continuous latent embedding $\mathbf{a}_e\in\mathbb{R}^{\tilde r}$. Observed events are indexed as quadruples $(s,p,o,t)$ and represented in a sparse 4-way episodic tensor $X^e$, while time-independent semantic facts $(s,p,o)$ are stored in a 3-way semantic tensor $X^s$ (Tresp et al., 2017).

The key decompositions are Tucker-type factorizations:

Episodic memory:

$\theta_{s,p,o,t} = f^e(\mathbf a_{e_s},\mathbf a_{e_p},\mathbf a_{e_o},\mathbf a_{e_t}) = \sum_{r_1,r_2,r_3,r_4} a_{e_s,r_1}\;a_{e_p,r_2}\;a_{e_o,r_3}\;a_{e_t,r_4}\;g^e(r_1,r_2,r_3,r_4)$

Semantic memory (marginalized over time):

$\theta_{s,p,o} = f^s(\mathbf a_{e_s},\mathbf a_{e_p},\mathbf a_{e_o}) = \sum_{r_1,r_2,r_3} a_{e_s,r_1}\;a_{e_p,r_2}\;a_{e_o,r_3}\;g^s(r_1,r_2,r_3)$

Probabilities for observing specific facts/events are obtained via sigmoidal activations over these scores. Novel sensory episodes are encoded to latent vectors and bound to hippocampal indices, while semantic decoding during perception invokes sampling from the episodic tensor, generating candidate (subject, predicate, object) triples (Tresp et al., 2017).
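The two scoring functions can be sketched in NumPy as follows. This is a minimal illustration under assumptions that are not in the source: toy vocabulary sizes, a single shared rank for all modes, randomly initialized embeddings and cores, and the hypothetical names `A_ent`, `A_pred`, `A_time`, `G_e`, and `G_s`. It shows the contraction pattern only and should not be read as the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
rank = 4                                      # latent dimension (assumed equal for all modes)
n_entities, n_predicates, n_times = 5, 3, 6   # toy vocabulary sizes (assumed)

# Hypothetical embedding tables: one row per discrete symbol.
A_ent = rng.normal(size=(n_entities, rank))
A_pred = rng.normal(size=(n_predicates, rank))
A_time = rng.normal(size=(n_times, rank))

# Core tensors g^e (4-way) and g^s (3-way), scaled down to keep scores moderate.
G_e = 0.1 * rng.normal(size=(rank, rank, rank, rank))
G_s = 0.1 * rng.normal(size=(rank, rank, rank))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def episodic_score(s, p, o, t):
    """theta_{s,p,o,t}: contract four embeddings with the episodic core."""
    return np.einsum("a,b,c,d,abcd->",
                     A_ent[s], A_pred[p], A_ent[o], A_time[t], G_e)


def semantic_score(s, p, o):
    """theta_{s,p,o}: contract three embeddings with the semantic core."""
    return np.einsum("a,b,c,abc->", A_ent[s], A_pred[p], A_ent[o], G_s)


theta_e = episodic_score(0, 1, 2, 3)
theta_s = semantic_score(0, 1, 2)
prob_e = sigmoid(theta_e)   # probability of the event (s=0, p=1, o=2, t=3)
prob_s = sigmoid(theta_s)   # probability of the time-independent fact
```

Subjects and objects share the entity table `A_ent` here, which reflects the idea that a symbol has one embedding regardless of its role; separate tables would be an equally valid design choice for a sketch.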

A crucial mechanism is the top-down and bottom-up communication between distributed representation (representation layer/global workspace) and symbolic index layers, orchestrated via embeddings that form the "DNA" of concepts. This bidirectional mapping grounds both perception and symbolic cognition (Tresp et al., 2024, Tresp et al., 2021).

2. Biological and Computational Substrate

The TMH framework is biologically motivated by hippocampal indexing theory. Episodic memory formation occurs when a new index $e_t$ is allocated by the hippocampus and the corresponding embedding $\mathbf{a}_{e_t}$, originating from high-order cortical activity, is stored as an engram. Recall is achieved by reactivating $e_t$ and thus reinstating $\mathbf{a}_{e_t}$, which reconstructs distributed cortical states (Tresp et al., 2017).

The semantic system arises as a time-marginalized, cortex-local factorization, trained by marginalizing or replaying episodic traces. This mathematically realizes "multiple trace theory" and "complementary learning systems," where consolidation entails cortex learning by accumulation of general co-occurrence statistics or explicitly via replayed triples, rather than direct transfer of hippocampal indices (Tresp et al., 2017, Tresp et al., 2021).

Computationally, the representation layer (global workspace) maintains a high-dimensional activation vector that integrates feedforward sensory streams, recurrence, and top-down contributions from active symbolic indices. The index layer computes a distribution over symbols via a softmax driven by this activation vector, and the activated symbolic content in turn modifies the activation vector by projecting its embeddings back into the distributed representation, closing the loop between overt symbols and subsymbolic brain state (Tresp et al., 2024).
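One bottom-up and top-down cycle between the two layers might look like the sketch below. Several specifics are assumptions made for illustration: the activation vector is called `q`, index scores are taken to be inner products between `q` and the index embeddings (rows of a hypothetical matrix `A`), and the top-down step simply adds the selected embedding to `q`. The source describes the loop only qualitatively, so the actual update rule may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n_indices = 8, 5

# Row k holds the embedding of symbolic index k (hypothetical values).
A = rng.normal(size=(n_indices, dim))


def softmax(z):
    z = z - z.max()          # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def index_distribution(q):
    """Bottom-up: score each index by the inner product of its embedding with q."""
    return softmax(A @ q)


def top_down_update(q, k):
    """Top-down: project the selected index back by adding its embedding to q."""
    return q + A[k]


q = rng.normal(size=dim)                   # stand-in for a sensory-driven state
probs = index_distribution(q)              # distribution over symbolic indices
k = int(rng.choice(n_indices, p=probs))    # sample one symbolic index
q_new = top_down_update(q, k)              # workspace state after feedback
probs_after = index_distribution(q_new)    # distribution for the next decoding step
```

Repeating the last three steps yields a sequence of sampled indices, each conditioned on the symbols already fed back into the workspace.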

3. Algorithmic and Operational Modes

The TMH unifies episodic and semantic memory, perceptual decoding, and internal reasoning under a single Bilayer Tensor Network (BTN) model (Tresp et al., 2021). Different operational modes are realized via gating and index selection:

Perception Mode: Bottom-up sensory input dominates and embeds the causal context in the global workspace, from which symbolic content is decoded sequentially (subject → object → predicate).
Episodic Memory Mode: A particular time index is injected; perceptual drive is absent, and previously stored embeddings are sampled and decoded.
Semantic Memory Mode: The time index is replaced with a global semantic index, producing unconditional (timeless) triples that reflect consolidated world knowledge.

Oscillatory (e.g., theta-gamma) gating plausibly underlies alternation between these processing regimes, consistent with observed neural rhythms (Tresp et al., 2021).

Memory formation and consolidation are expressed as operations on the tensor parameter spaces—for example, marginalizing episodic tensors over time to update semantic cores:

$g^s(r_1,r_2,r_3) = \sum_{r_4} g^e(r_1,r_2,r_3,r_4)\,\sum_{t} a_{e_t,r_4}$

Triple replay and knowledge-graph storage are alternative, complementary mechanisms for semantization (Tresp et al., 2017).
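The marginalization route can be checked numerically. Because the episodic score is linear in the time embedding, summing it over $t$ gives a score of the semantic form whose core is the episodic core contracted with the summed time embeddings. The sketch below verifies this identity at the level of scores (before any sigmoid), using toy sizes and random parameters that are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
rank, n_ent, n_pred, n_time = 3, 4, 2, 5   # toy sizes (assumed)

A_ent = rng.normal(size=(n_ent, rank))
A_pred = rng.normal(size=(n_pred, rank))
A_time = rng.normal(size=(n_time, rank))
G_e = rng.normal(size=(rank, rank, rank, rank))

# Full episodic score tensor theta_e[s, p, o, t].
theta_e = np.einsum("sa,pb,oc,td,abcd->spot",
                    A_ent, A_pred, A_ent, A_time, G_e)

# Semantic core: contract the fourth mode of g^e with the summed time embeddings.
time_sum = A_time.sum(axis=0)
G_s = np.einsum("abcd,d->abc", G_e, time_sum)

# Semantic scores computed from the marginalized core.
theta_s = np.einsum("sa,pb,oc,abc->spo", A_ent, A_pred, A_ent, G_s)

# Largest discrepancy between the two routes.
max_gap = np.abs(theta_s - theta_e.sum(axis=3)).max()
```

`max_gap` should sit at floating-point rounding level, since both routes compute the same multilinear expression in a different order.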

4. Dynamics, Learning, and Tensor-Power Models

The TMH provides a rationale for the critical role of high-order (tensor) interactions in generating long-range dependencies, both in the brain and in recurrent artificial networks (Qiu et al., 2021). In tensor-power recurrent models, the recurrence at each step involves a $p$-fold product over current and previous states, where $p$ here denotes the tensor degree. A tractable memory coefficient characterizes the degree to which inputs from earlier time steps affect the current state.

A large tensor degree $p$ is necessary to achieve slow decay of memory and escape the short-memory regime. However, this is countered by instability for large integer $p$, prompting continuous relaxations and CP-decomposition-based parametrizations for stability and learnability (Qiu et al., 2021). The result is a nuanced refinement: high-order tensor interactions support long memory, but their effective exponent must be adaptively calibrated.
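To make the role of the tensor degree concrete, the following sketch implements a recurrence whose weight tensor is stored in CP form, so that a degree-`degree` product of the state with itself never requires the full tensor. The architecture details are assumptions for illustration (the state is the concatenation of the previous hidden vector, the input, and a constant 1; the nonlinearity is `tanh`; the degree is a fixed integer), and the continuous relaxation of the degree mentioned above is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
hidden, n_in, cp_rank, degree = 4, 2, 6, 3
state_dim = hidden + n_in + 1      # state s_t = [h_{t-1}, x_t, 1]

# CP factors: one (cp_rank x state_dim) matrix per tensor mode, plus an output map.
U = rng.normal(scale=0.5, size=(degree, cp_rank, state_dim))
V = rng.normal(scale=0.5, size=(hidden, cp_rank))


def step(h_prev, x):
    """One recurrence step: degree-fold product of the state, in CP form."""
    s = np.concatenate([h_prev, x, [1.0]])
    proj = U @ s                               # shape (degree, cp_rank)
    return np.tanh(V @ proj.prod(axis=0))      # multiply across modes, then mix


def run(xs):
    """Run the recurrence over a sequence, starting from a zero hidden state."""
    h = np.zeros(hidden)
    for x in xs:
        h = step(h, x)
    return h


xs = rng.normal(size=(10, n_in))   # toy input sequence
h_final = run(xs)
```

The cost per step is linear in `degree` with this parametrization, whereas the explicit weight tensor would have `hidden * state_dim**degree` entries.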

Empirical validation confirms competitive or superior performance on long-memory benchmarks when using adaptive tensor-power models, demonstrating that the TMH is not merely a neurocognitive proposal but is also algorithmically effective in practical sequence learning (Qiu et al., 2021).

5. Extensions: Markov Chains, Hypergraphs, and Statistical Physics

A generalized, formal TMH emerges in stochastic processes with memory, where higher-order Markov chains with finite memory depth are exactly characterized via even-order paired transition tensors. The Einstein-product action of such a tensor on joint history tensors reproduces Chapman-Kolmogorov updates and captures full memory dependence. Under mild irreducibility, Perron-Frobenius theory for nonnegative tensors yields unique steady-state memory distributions.

Mean-field closures map these high-dimensional tensor systems to nonlinear eigenvalue problems (Z- or H-eigenvectors), producing global convergence under higher-order detailed balance. This formalism extends to random walks on hypergraphs, where group structure and time-dependent effects naturally induce memory, again unified by tensor factorization (Cui et al., 8 Apr 2026).

A summary of the tensor-based Markov chain paradigm:

Domain	Memory Representation	Stationarity Condition
Markov chain	Even-order paired transition tensor	Joint history distribution left unchanged by the Einstein-product update
Hypergraph walk	Higher-order adjacency tensor	Steady-state distribution of the induced tensor walk

These topics broaden the TMH from neurocognitive architectures to statistical physics, network science, and random processes.
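As a small worked instance of the Markov-chain case, the sketch below treats a memory-depth-2 chain on three states. The transition tensor and the update are written as a plain `einsum` contraction rather than in the paired even-order form, and the tensor entries are random positive values chosen for illustration; both are simplifying assumptions. Iterating the update on the joint distribution of two consecutive states converges to a stationary joint history.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3   # number of states (toy size)

# P[i, j, k] = Pr(next = k | previous = i, current = j); strictly positive entries.
P = rng.uniform(0.1, 1.0, size=(n, n, n))
P /= P.sum(axis=2, keepdims=True)


def update(joint):
    """Map Pr(X_{t-1}=i, X_t=j) to Pr(X_t=j, X_{t+1}=k)."""
    return np.einsum("ij,ijk->jk", joint, P)


# Start from the uniform joint history and iterate to a fixed point.
joint = np.full((n, n), 1.0 / n**2)
for _ in range(2000):
    joint = update(joint)

marginal = joint.sum(axis=0)   # stationary distribution of a single state
```

Strict positivity of `P` makes the lifted chain on state pairs irreducible and aperiodic, which is why plain iteration suffices here; sparse or periodic transition tensors would need more care.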

6. Semantic Decoding, Reasoning, and Inner Language

The TMH-based models explain semantic decoding in perception as sequential sampling of symbolic Subject-Predicate-Object (SPO) triples via shared embeddings. Knowledge graphs, adjacency tensors, and embedding-based generative models unify neural and symbolic representations. This supports compositional reasoning, chaining, generative “inner language,” and alignment of perceptual and memory-driven inference (Tresp et al., 2020, Tresp et al., 2021).

A four-layer decoder—sensory memory buffer, representation blackboard, index layer, and working memory—implements sequential selection, updating, and broadcasting of symbolic content. The architecture is constrained to avoid excessive multiplication, in line with biological costs (Tresp et al., 2020).

Bayesian interpretations recast semantic memory as the prior for perception, with the posterior derived via shared embedding structures, leveraging mathematical tractability from conjugate models (Tresp et al., 2020).

7. Implications, Controversies, and Outlook

The TMH advances several implications:

Declarative memory, perception, and reasoning are expressions of the same high-order embedding factorization.
Subsymbolic (continuous) and symbolic (discrete) computations are unified by bidirectional mapping via embeddings.
Memory consolidation occurs not by literal transfer but by marginalization and replay, refining the semantic core tensor over time.
Self-supervised learning mechanisms embedded within the TMH continuously update embeddings and structure as new data is encountered (Tresp et al., 2021, Tresp et al., 2024).
Symbolic reasoning and generalization—e.g., concept chains, analogies—emerge from tensor-based sampling or attention over embedding distributions.

Controversies and boundary cases are addressed explicitly. For instance, in general relativity, the "pure Tensor Memory Hypothesis" asserts that only tensor perturbations contribute to gravitational-wave memory effects, but in more general scalar-tensor theories, scalar contributions must be included, with distinct redshifting and observability signatures (Gorji et al., 2022).

The TMH continues to inform neurobiological theories, recurrent neural architecture, cognitive modeling, and higher-order network science, substantiating its role as a unifying principle for memory-dependent systems (Tresp et al., 2017, Tresp et al., 2024, Cui et al., 8 Apr 2026).