
Entropy-Memorization Law Overview

Updated 18 March 2026
  • The Entropy-Memorization Law is a principle that quantifies how system constraints imprint irreversible entropy, linking memory and disorder across physical and computational domains.
  • It is derived from thermodynamics and extended to quantum measurements and neural network training, delineating the trade-off between entropy gain and recoverable information.
  • The law underpins empirical scaling laws in statistical mechanics and machine learning, offering actionable insights for designing systems and addressing data privacy challenges.

The Entropy-Memorization Law (EM-Law) encompasses a set of rigorous, quantitative relationships that govern the interplay between entropy, structure, information retention, and memorization. Originally derived in statistical thermodynamics to clarify ambiguities around residual entropy and the third law, the EM-Law framework has emerged across physics, quantum information, neural computation, and large-scale machine learning. In all domains, the law formalizes how constraints, representations, or data regularities are “memorized” as enduring entropy features—whether manifested as irreducible disorder in materials, information loss in quantum measurements, scaling laws in model training, or sharp boundary effects in neural generative decoding.

1. Formal Foundations and Thermodynamic Origin

The EM-Law was first introduced as a refinement to the thermodynamic third law to resolve long-standing paradoxes concerning residual entropy in systems with internal constraints, such as glasses, random alloys, or defect crystals (Shirai, 2018). Each equilibrium state is characterized by a set of thermodynamic coordinates $\vec{q}=(q_1,\dots,q_m)$, with entropy a well-defined function $S = S(\vec{q})$. Internal constraints (e.g., rigid barriers, frozen atomic sites) fix additional “frozen” coordinates $\hat{r}$, partitioning systems into distinct thermodynamic classes $\mathcal{C}_A$, $\mathcal{C}_B$. Classes separated by frozen coordinates lack a one-to-one mapping of all state variables.

Within a single class, the zero of entropy at $T\to 0$ is unique and shared. However, across classes separated by a frozen coordinate, the entropy offset $S_0^{AB}$ (residual entropy) is the memorized difference, encoded irreversibly when constraints are lifted:

$$S_0^{AB} = S_0^{A} - S_0^{B} = s_r(r_A) - s_r(0)$$

where $s_r(r)$ is the entropy contribution of the frozen coordinate. The process of unfreezing $r$ irreversibly reconstructs the entropy origin, observed as residual entropy, and constitutes the law’s eponymous “memorization” of system history.

The EM-Law rigorously connects all residual entropy phenomena (configurational, orientational, and defect-based) to a unified, constraint-driven mechanism, eliminating the need to invoke metastability or kinetic irreversibility. For example, in a binary alloy $A_xB_{1-x}$, the mixing entropy per site at $T\to 0$,

$$S_{\mathrm{mix}} = -k_B\left[x\ln x + (1-x)\ln(1-x)\right]$$

is the signature of class transition and memorization of composition labels (Shirai, 2018).
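
As a rough numerical illustration of the binary-alloy example, the sketch below evaluates the standard ideal mixing entropy per site, $-k_B[x\ln x + (1-x)\ln(1-x)]$, for an alloy $A_xB_{1-x}$. The function name `mixing_entropy_per_site` and the sample compositions are assumptions made for this example and do not come from Shirai (2018).

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def mixing_entropy_per_site(x: float, k_b: float = K_B) -> float:
    """Ideal configurational entropy per site of a binary alloy A_x B_(1-x)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("composition x must lie in [0, 1]")
    # p*ln(p) -> 0 as p -> 0, so a pure composition carries no mixing entropy.
    return k_b * sum(-p * math.log(p) for p in (x, 1.0 - x) if p > 0.0)


if __name__ == "__main__":
    for x in (0.0, 0.1, 0.25, 0.5):
        # Work in units of k_B so the values can be compared with ln 2.
        s = mixing_entropy_per_site(x, k_b=1.0)
        print(f"x = {x:4.2f}  S_mix / k_B = {s:.4f}")
```

The entropy is largest at the equimolar composition $x = 1/2$, where it equals $k_B\ln 2$ per site, and vanishes for a pure element.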

2. Information-Theoretic and Quantum Generalizations

Quantum information theory makes the EM-Law explicit as a trade-off between entropy gain and retrievable information during measurements (Wang, 2019). For a quantum state with density matrix $\rho$ subject to a measurement (or any irreversible channel) inducing a transition $\rho \to \rho'$, the von Neumann entropy increases:

$$\Delta S = S(\rho') - S(\rho) \ge 0, \qquad S(\rho) = -\mathrm{Tr}(\rho\ln\rho)$$

Writing the entropy gain as $\Delta S$ (in nats), the retrievable (unlost) fraction of the original information is $R = e^{-\Delta S}$, while the irretrievable fraction is $1 - R$. This yields the core EM-Law identity for quantum systems:

$$\Delta S = -\ln R$$

In pure-to-mixed measurement (e.g., an $n$-qubit system projected onto basis states), the entropy gain is $\Delta S = n\ln 2$ and $R = 2^{-n}$, signifying exponential decay of recoverable information with entropy production.

The law applies equally to entangled states: for a maximally entangled Bell pair, local measurement produces entropy $\Delta S = \ln 2$ and $R = 1/2$, independent of the chosen basis. In multipartite generalizations (GHZ and W states), the fractional information loss upon partial measurement follows precisely from the corresponding entropy gain, giving a universal characterization of quantum information destruction and retrieval (Wang, 2019).
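
The basis independence of the Bell-pair entropy gain can be checked numerically. The sketch below is a minimal illustration and is not code from Wang (2019); the function name `local_measurement_entropy` and the angle parametrization of the measurement basis are assumptions. It measures qubit A of a two-qubit pure state in an arbitrary orthonormal basis. Because the initial state is pure and each outcome leaves an orthogonal pure state, the post-measurement von Neumann entropy equals the Shannon entropy of the outcome probabilities.

```python
import cmath
import math


def local_measurement_entropy(state, theta=0.0, phi=0.0):
    """Entropy gain (nats) when qubit A of a two-qubit pure state is measured.

    `state` lists the amplitudes of |00>, |01>, |10>, |11> (qubit A first).
    The measurement basis for qubit A is
        |b0> =  cos(theta/2)|0> + exp(i*phi) sin(theta/2)|1>
        |b1> = -exp(-i*phi) sin(theta/2)|0> + cos(theta/2)|1>
    """
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    basis = [
        (c, cmath.exp(1j * phi) * s),
        (-cmath.exp(-1j * phi) * s, c),
    ]
    probs = []
    for b in basis:
        # Unnormalised state of qubit B after projecting qubit A onto <b|.
        b_amp = [
            b[0].conjugate() * state[j] + b[1].conjugate() * state[2 + j]
            for j in range(2)
        ]
        probs.append(sum(abs(a) ** 2 for a in b_amp))
    # Shannon entropy of the outcome probabilities = entropy of the mixture.
    return -sum(p * math.log(p) for p in probs if p > 1e-15)


if __name__ == "__main__":
    bell = [1 / math.sqrt(2), 0.0, 0.0, 1 / math.sqrt(2)]
    print(local_measurement_entropy(bell))                      # computational basis
    print(local_measurement_entropy(bell, theta=1.1, phi=0.7))  # rotated basis
    print(math.log(2))                                          # reference value
```

For the Bell state both calls agree with $\ln 2$ up to floating-point error, while a product state such as $|00\rangle$ measured in the computational basis gives zero entropy gain.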

3. Statistical, Dynamical, and Constraint-Induced Memorization

In first-principles statistical mechanics, the EM-Law is further sharpened: entropy is not a monotonic function but a stochastic variable described by a probability distribution $P(S)$. The system's constraints $\lambda$ (geometry, energy, access rules) shape this long-time entropy distribution $P(S)$ by modulating the set of accessible macrostate volumes $\{\Omega_i\}$ (Peng, 17 Feb 2026).

The law’s formal statement:

  • The entropy distribution $P(S)$ encodes a lasting memory of constraints via $\{\Omega_i\}$.
  • Any change in $\lambda$ that modifies $\{\Omega_i\}$ non-uniformly alters the shape of $P(S)$; if all volumes scale by a common factor, the only effect is a translation in $S$.
  • This “memory” is permanent: even under time-reversal invariant dynamics, the shape of $P(S)$ evidences past constraints.

Illustrative examples include gas partitioning (memory of wall presence/absence) and double-well potentials (memory of barrier height). Only uniform rescaling of accessible phase-space volumes preserves the shape of $P(S)$; all other constraint changes imprint persistent statistical structure on the entropy distribution (Peng, 17 Feb 2026).
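
The uniform-rescaling statement can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not Peng's construction: each macrostate $i$ is assigned the Boltzmann entropy $S_i = \ln\Omega_i$ (with $k_B = 1$), and its long-time occupation probability is taken to be proportional to its volume $\Omega_i$. The volume lists and all function names are hypothetical.

```python
import math


def entropy_distribution(volumes):
    """Return [(S_i, P_i)] with S_i = ln(Omega_i) and P_i = Omega_i / sum(Omega)."""
    total = sum(volumes)
    return [(math.log(v), v / total) for v in volumes]


def shape(dist):
    """Measure entropies from the mean so that a pure shift in S drops out."""
    mean_s = sum(s * p for s, p in dist)
    return [(s - mean_s, p) for s, p in dist]


def same_shape(d1, d2, tol=1e-9):
    """True if two distributions differ at most by a translation in S."""
    return len(d1) == len(d2) and all(
        abs(s1 - s2) < tol and abs(p1 - p2) < tol
        for (s1, p1), (s2, p2) in zip(shape(d1), shape(d2))
    )


base = [1.0, 4.0, 16.0]             # hypothetical macrostate volumes
uniform = [3.0 * v for v in base]   # every volume scaled by the same factor
partial = [1.0, 4.0, 48.0]          # only one macrostate enlarged

print(same_shape(entropy_distribution(base), entropy_distribution(uniform)))  # True
print(same_shape(entropy_distribution(base), entropy_distribution(partial)))  # False
```

Scaling every volume by 3 shifts each $S_i$ by $\ln 3$ and leaves the probabilities unchanged, so the shape is preserved. Enlarging a single macrostate changes both the spacing and the weights.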

4. Neural Representation, Memorization, and Generalization

In neural networks and LLMs, the EM-Law governs both the statistical cost of memorization and the generalization capabilities of learned representations:

  • Representation entropy $H(R)$ (Shannon or matrix-based) controls the generalization gap: the bound depends on $N$, the number of training samples, and on $H(R)$, which quantifies the internal representation entropy (Yu, 13 May 2025).

  • Alternating cycles of memorization (lowering cross-entropy, often increasing $H(R)$) and compression (minimizing $H(R)$ at the cost of cross-entropy slack) emerge naturally during training. Gated-Phase Transition (GAPT) algorithms can explicitly orchestrate these cycles, achieving improved test loss, out-of-distribution performance, and disentanglement of conflicting memories (Yu, 13 May 2025).

Information-theoretic analyses reveal a dichotomy in memorization patterns: shortcut (heuristic) memorization leads to low entropy and high mutual information between neural activations, while example-level memorization manifests as high entropy and low inter-neuron mutual information (Bansal et al., 2022). Monitoring activation entropy and MI allows robust, unlabeled detection of memorization regimes and supports more reliable model selection.
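
A minimal sketch of the two monitored quantities follows, assuming activations are discretized into equal-width bins and that entropy and mutual information are estimated from empirical frequencies. The binning scheme, function names, and toy activation lists are assumptions made for illustration; they are not the estimator used by Bansal et al. (2022).

```python
import math
from collections import Counter


def discretize(values, n_bins=4):
    """Map real-valued activations to equal-width bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Empirical Shannon entropy in nats."""
    n = len(symbols)
    return -sum(c / n * math.log(c / n) for c in Counter(symbols).values())


def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) in nats, from paired discrete samples."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


# Toy "shortcut" pattern: two neurons fire identically and rarely vary.
shortcut_a = discretize([0.1] * 14 + [0.9] * 2)
shortcut_b = list(shortcut_a)

# Toy "example-level" pattern: activations spread out and unrelated.
spread_a = discretize([i // 4 for i in range(16)])
spread_b = discretize([i % 4 for i in range(16)])

print(entropy(shortcut_a), mutual_information(shortcut_a, shortcut_b))
print(entropy(spread_a), mutual_information(spread_a, spread_b))
```

In this toy data the first pair shows low entropy with mutual information equal to that entropy, and the second pair shows entropy $\ln 4$ with mutual information close to zero, mirroring the dichotomy described above.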

5. Entropy–Memorization Boundary Effects in Generative Models

Recent work demonstrates that in generative LLMs, memorized and unmemorized output segments are sharply separated by a discontinuity in decoding entropy (Chen et al., 2024). Formally:

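  • For each token position $t$, the decoding entropy of the next-token distribution over the vocabulary $V$ is

$$H_t = -\sum_{v\in V} p(v \mid x_{<t})\,\ln p(v \mid x_{<t})$$

  • At the memorization boundary $t^*$, a significant entropy jump $\Delta H$ (in nats) is observed:

$$\Delta H = H_{t^*} - H_{t^*-1}$$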

  • This sharp transition is stable across model scales and provides an empirical law to detect memorized (training-data) continuations versus novel generation.

Practically, low-entropy plateaus in generated text signal verbatim memorization and are exploitable for privacy or IP risk monitoring. Model-induced entropy regularization can be applied to mitigate unintended memorization without sacrificing generative diversity (Chen et al., 2024).
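
One way such a boundary might be located is sketched below, assuming access to the model's next-token probability distribution at every decoding step. The function names and the toy distributions are hypothetical, and taking the single largest increase as the boundary is a simplification for illustration rather than the detection procedure of Chen et al. (2024).

```python
import math


def token_entropy(probs):
    """Shannon entropy (nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def largest_entropy_jump(step_distributions):
    """Return (t_star, jump) for the largest increase H_t - H_{t-1}."""
    entropies = [token_entropy(p) for p in step_distributions]
    jumps = [entropies[t] - entropies[t - 1] for t in range(1, len(entropies))]
    t_star = max(range(len(jumps)), key=jumps.__getitem__) + 1
    return t_star, jumps[t_star - 1]


# Toy decoding trace: three sharply peaked steps, then two flat ones.
peaked = [0.97, 0.01, 0.01, 0.01]
flat = [0.25, 0.25, 0.25, 0.25]
trace = [peaked, peaked, peaked, flat, flat]

t_star, jump = largest_entropy_jump(trace)
print(t_star, jump)
```

In the toy trace the boundary falls at step index 3, the first flat distribution. On real model outputs a threshold on the jump size would have to be calibrated empirically.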

6. Scaling Laws, Memorization Difficulty, and Data Privacy

Empirical studies reveal that the difficulty of data memorization in LLMs scales nearly linearly with sequence entropy (Huang et al., 8 Jul 2025). Using the token-level edit distance $d$ as a memorization score, the averaged entropy $\bar{H}(d)$ over all examples with distance $d$ fits:

$$\bar{H}(d) \approx a\,d + b$$

with slope $a$, intercept $b$, and Pearson correlation coefficient $r$ obtained from the fit. This law holds even in human-perceived “gibberish” strings; these exhibit lower token entropy than typical text and are more easily memorized by LLMs due to tokenization effects. The same entropy-memorization law enables dataset inference attacks (EMBEDI): by regressing the entropy-memorization line from model generations, one can accurately distinguish member and non-member datasets in an unsupervised fashion, exposing both privacy and provenance vulnerabilities (Huang et al., 8 Jul 2025).
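
The two ingredients of this analysis, a token-level edit distance and a straight-line fit of averaged entropy against that distance, can be sketched as follows. This is a generic illustration using Levenshtein distance and ordinary least squares. The helper names and the synthetic data points are assumptions, and the numbers are not results from Huang et al.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def linear_fit(xs, ys):
    """Ordinary least squares y = a*x + b, plus the Pearson correlation r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx, sxy / math.sqrt(sxx * syy)


# Memorization score: distance between a model continuation and the reference.
reference = ["the", "cat", "sat", "on", "the", "mat"]
generated = ["the", "cat", "sat", "on", "a", "mat"]
print(edit_distance(reference, generated))

# Synthetic (distance, mean entropy) pairs, invented only to exercise the fit.
distances = [0, 1, 2, 3]
mean_entropies = [1.0, 1.5, 2.0, 2.5]
a, b, r = linear_fit(distances, mean_entropies)
print(a, b, r)
```

With real data, each point would be the mean entropy of all examples that share a given edit distance, and the fitted line would then serve as the reference for member versus non-member comparisons.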

7. Memory Complexity and the Phase Transition in Entropy Estimation

The EM-Law additionally characterizes the scaling of memory requirements for entropy estimation in finite-state computational models observing i.i.d. sequences (Berg et al., 2024). The number of states required follows one scaling at moderate accuracy $\varepsilon$ and a different scaling as $\varepsilon \to 0$. Once fine accuracy is required, all distinct symbols must be essentially “memorized,” marking a sharp phase transition in estimator complexity. This law applies equally to mutual information estimation with a corresponding bivariate alphabet scaling.

This scaling law has far-reaching implications for stream data analytics, network monitoring, and online learning, as it precisely quantifies the tradeoff between memory resources, estimation accuracy, and the inherent necessity of memorizing data support at fine scales (Berg et al., 2024).
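
For intuition about why fine accuracy forces memorization of the support, consider a naive streaming plug-in estimator. The sketch below is a baseline written for illustration only and is not the finite-state machine analysed by Berg et al.: it keeps one unbounded counter per distinct symbol, so its storage grows with the alphabet size. The class and method names are hypothetical.

```python
import math
from collections import Counter


class PluginEntropyEstimator:
    """Streaming plug-in estimator that keeps one counter per distinct symbol."""

    def __init__(self):
        self.counts = Counter()
        self.n = 0

    def update(self, symbol):
        self.counts[symbol] += 1
        self.n += 1

    def estimate(self):
        """Entropy (nats) of the empirical distribution seen so far."""
        return -sum(
            c / self.n * math.log(c / self.n) for c in self.counts.values()
        )

    def memory_slots(self):
        """Number of counters held, i.e. distinct symbols memorized."""
        return len(self.counts)


est = PluginEntropyEstimator()
for i in range(1000):
    est.update(i % 8)  # toy stream cycling through 8 symbols

print(est.estimate(), math.log(8), est.memory_slots())
```

On this toy stream the estimate matches $\ln 8$ and the estimator holds eight counters, one per symbol. A memory-bounded estimator would have to trade some of that accuracy for fewer states.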


Across disciplines, the Entropy-Memorization Law provides a unifying principle for the quantification, detection, and management of memory, information loss, and constraint-induced structure—from physical systems and quantum measurement, through high-dimensional machine representations, to practical generative models and privacy diagnostics.
