Memory Capacity Neutrality

Updated 28 November 2025
  • Memory capacity neutrality is the property that a system's effective memory remains invariant despite variations in architecture, resource allocation, or input statistics.
  • It is pivotal in paging algorithms, neural networks, and reservoir computing, facilitating fair benchmarking and guiding robust algorithmic design.
  • Neutrality can break down under minor perturbations such as capacity fluctuations or induced correlations, highlighting the need for more discriminative, task-dependent metrics.

Memory capacity neutrality refers to the invariance or robustness of a system’s effective memory—according to a defined operational metric—with respect to variation in architecture, resource allocation, or fluctuating environmental parameters. It is a central analytical construct in paging algorithms, feed-forward and recurrent neural networks, and reservoir computing. Memory capacity neutrality identifies regimes where algorithmic or architectural details are asymptotically irrelevant for memory performance, as opposed to situations where small changes induce drastic departures from theoretical capacity bounds.

1. Formal Models and Definitions

Memory capacity neutrality has discipline-specific formalizations, rooted in operational metrics.

  • Paging with Dynamic Memory Capacity: Fluctuating main memory is modeled by a request sequence $\sigma$ over pages interleaved with “+” (growth) and “–” (shrink) operations. For a page-access sequence $T$, a dynamic capacity sequence $\mu = (m_1, m_2, \ldots)$, and maximal capacity $k$, an online algorithm $\mathrm{ALG}$ is dynamically $(h, k)$-competitive with ratio $\rho$ if

$$\mathrm{ALG}(T, \mu) \leq \rho \cdot \mathrm{OPT}(T, \lfloor (h/k)\mu \rfloor) + d,$$

where $d$ is an additive constant and $\mathrm{OPT}$ is a clairvoyant adversary operating with capacities scaled by $h/k$ (Peserico, 2013).

  • Feed-Forward Neural Networks: The Lossless Memory dimension $D_{\ell m}(|W|)$ and the MacKay dimension $D_{mk}(|W|)$ count the number of random-association bits a network with $|W|$ real parameters can memorize with zero error and with 50% probability of success, respectively. Memory neutrality is strict linear scaling: $D_{\ell m} = |W|$ and $D_{mk} = 2|W|$, independent of micro-architecture or nonlinearity details (Friedland et al., 2017).
  • Linear Echo State Networks (ESN): For $x_t = A x_{t-1} + M u_t$ with controllability matrix $K = [M, AM, \ldots, A^{N-1}M]$, the total memory capacity $MC_{\mathrm{total}} = \operatorname{rank} K$ is neutral to the choice of input mask $M$ whenever $K$ has full rank (Ballarin et al., 2023).
  • Nonlinear Recurrent Networks: For $x_t = \phi(A x_{t-1} + C z_t + \xi)$ with i.i.d. Gaussian input $z_t$, the total memory capacity $MC$ is “neutral” in the sense that, for fixed parameters, it can be forced to take any value in $[1, N]$ by scaling the input variance $\sigma$, completely decoupled from the spectral structure of $A$ or $C$ (Ballarin et al., 7 Feb 2025).
  • Reservoir Computing and Correlated Readouts: Memory-capacity neutrality would predict $MC(N) \propto N$, but the actual scaling is

$$MC(N) = \frac{N}{1 + (N-1)\rho},$$

where $\rho$ is the mean pairwise neuron correlation. Neutrality (linear scaling) therefore holds only in the limit $\rho \to 0$ (Takasu et al., 28 Apr 2025); a minimal numerical illustration follows this list.
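
The correlated-readout scaling admits a one-line numerical check. The sketch below evaluates the formula above for an arbitrary illustrative correlation value $\rho = 0.01$ and shows the saturation at $1/\rho$ that replaces linear, neutral growth.

```python
def mc_correlated(N, rho):
    """Memory capacity under mean pairwise neuron correlation rho:
    MC(N) = N / (1 + (N - 1) * rho); reduces to MC(N) = N when rho = 0."""
    return N / (1.0 + (N - 1) * rho)

rho = 0.01  # illustrative mean pairwise correlation
for N in [10, 100, 1000, 10000]:
    print(f"N = {N:5d}   MC = {mc_correlated(N, rho):8.2f}   (saturation bound 1/rho = {1/rho:.0f})")
```

For $\rho = 0.01$ the capacity plateaus near 100 regardless of how many further neurons are added, whereas the neutral $\rho = 0$ case would give $MC(N) = N$.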

2. Memory Capacity Neutrality in Paging and Online Algorithms

Classic online paging illustrates both the potential fragility and the robust manifestations of capacity neutrality.

  • Fragile Optimality: LFRU, a hybrid algorithm optimal under static capacity, can become arbitrarily suboptimal with minimal adversarial capacity fluctuations—a single “wobble” (e.g., alternation between $3m$ and $3m-1$ frames)—showing loss of neutrality (Peserico, 2013).
  • Robust Classic Policies: LFD (Longest-Forward-Distance) incurs the minimum number of page faults irrespective of dynamic changes in capacity: its performance is strictly capacity-neutral (Peserico, 2013). Online marking algorithms (LRU, FWF, MARK, RAND) and dynamically conservative policies (FIFO, CLOCK) remain near-neutral: their dynamic $(h, k)$-competitive ratios degrade by at most a factor $1 + 1/k$ over the classic static ratio. A minimal simulation sketch follows this list.
  • Consequences: These properties justify an “exokernel”-style design in which memory-quota allocation is separated from page replacement, so replacement policies can remain agnostic to quota fluctuations without sacrificing efficiency or predictability.
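
The near-neutrality of LRU-style policies can be probed with a toy simulator. The sketch below is a minimal illustration, not the formal model of Peserico (2013): page identifiers are arbitrary hashable values, and the tokens "+" and "-" in the request stream grow or shrink the frame budget by one, with LRU evictions applied whenever the budget drops below the current working set.

```python
from collections import OrderedDict

def lru_faults(requests, initial_capacity):
    """Simulate LRU paging where the frame budget can change mid-sequence.

    `requests` mixes page ids (hashable) with the tokens "+" (grow by one
    frame) and "-" (shrink by one frame, evicting least-recently-used pages
    if needed). Returns the number of page faults.
    """
    capacity = initial_capacity
    frames = OrderedDict()          # resident pages in LRU order (oldest first)
    faults = 0
    for req in requests:
        if req == "+":
            capacity += 1
            continue
        if req == "-":
            capacity = max(capacity - 1, 1)
            while len(frames) > capacity:
                frames.popitem(last=False)   # evict least recently used
            continue
        if req in frames:
            frames.move_to_end(req)          # hit: refresh recency
        else:
            faults += 1                      # miss: load the page
            if len(frames) >= capacity:
                frames.popitem(last=False)
            frames[req] = True
    return faults

# Example: a small trace with one capacity "wobble"
trace = [1, 2, 3, "+", 4, 1, "-", 2, 5, 1]
print("LRU faults:", lru_faults(trace, initial_capacity=3))
```

Replaying the same trace with different initial budgets or wobble patterns gives a quick empirical feel for how gently (or abruptly) fault counts respond to capacity fluctuations.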

3. Linear and Nonlinear Neural Networks: Scaling of Memory Capacity

The theory of memory-capacity neutrality in neural networks is anchored in the information-theoretic Shannon view.

  • Feed-Forward Networks: Treating the parameterized weights as a communication channel (encoder, noisy transmission, decoder) under random classification, each parameter can encode at most one bit of arbitrary association. This yields exact linear scaling, $D_{\ell m} = |W|$, for memorization, independent of the exact architecture or nonlinearity choice (Friedland et al., 2017).
  • Empirical Validity: Experimental results confirm linear growth of $D_{\ell m}$ and $D_{mk}$ with $|W|$ despite differences in optimizer, activation, or network size. Underfitting due to suboptimal learning algorithms slightly reduces the measured capacity but does not alter the scaling. A simple empirical probe is sketched below.
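
A crude way to reproduce this kind of measurement is to train a small network on random binary labels and record whether it fits a given dataset exactly, scanning the dataset size. The sketch below uses scikit-learn's MLPClassifier with an arbitrary 10-input, 8-hidden-unit architecture (97 parameters including biases); as noted above, optimizer underfitting can make the measured capacity fall short of the $|W|$-bit bound, so the result is a lower-bound probe rather than an exact determination of $D_{\ell m}$.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
d, hidden = 10, 8                                 # illustrative input dim and hidden width
n_params = (d + 1) * hidden + (hidden + 1) * 1    # |W|, counting biases

def memorizes(n_points):
    """Return True if the MLP fits n_points random binary labels with zero error."""
    X = rng.normal(size=(n_points, d))
    y = rng.integers(0, 2, size=n_points)
    clf = MLPClassifier(hidden_layer_sizes=(hidden,), activation="tanh",
                        max_iter=5000, tol=1e-6, random_state=0)
    clf.fit(X, y)
    return clf.score(X, y) == 1.0

print(f"|W| = {n_params} parameters")
for n in [20, 50, 80, 110, 140]:
    print(f"n = {n:4d}   memorized: {memorizes(n)}")
```

In practice one would repeat each dataset size over several random draws and report the transition region rather than a single run.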

4. Memory-Capacity Neutrality and the Input Mask in Linear Recurrent Networks

For linear echo state networks, memory capacity neutrality holds with respect to the choice of input mask.

  • Exact Theoretical Neutrality: If the controllability matrix $K$ has full rank, then for any choice of mask $M$ the total memory capacity equals the state dimension, $MC_{\mathrm{total}} = N$. The individual memory capacities $MC_\tau$ at each lag $\tau$ are likewise invariant to $M$, depending only on the spectrum of $A$ (Ballarin et al., 2023).
  • Numerical Implications: Standard estimation methods can violate neutrality due to numerical instability (Krylov-subspace squeezing). Subspace-based algorithms (OSM, OSM$^+$) preserve neutrality and recover the theoretical values to machine precision across all input-mask distributions. A rank-based check of mask invariance is sketched below.
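
The mask invariance itself can be verified directly at small scale by computing the rank of the controllability matrix for several mask distributions. The sketch below uses a plain rank computation rather than the OSM/OSM$^+$ estimators of Ballarin et al. (2023); at the small state dimension chosen here ($N = 8$, an arbitrary illustrative value) the naive computation is numerically safe, but it does not demonstrate the robust large-$N$ estimation the cited work addresses.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8                                     # illustrative state dimension

# Random reservoir matrix rescaled to spectral radius 0.9 (echo state condition)
A = rng.normal(size=(N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))

def total_memory_capacity(A, M):
    """MC_total = rank of the controllability matrix K = [M, AM, ..., A^{N-1} M]."""
    cols, v = [], M
    for _ in range(A.shape[0]):
        cols.append(v)
        v = A @ v
    return np.linalg.matrix_rank(np.column_stack(cols))

# Different input-mask distributions
masks = {
    "gaussian":  rng.normal(size=N),
    "uniform":   rng.uniform(-1.0, 1.0, size=N),
    "binary ±1": rng.choice([-1.0, 1.0], size=N),
}
for name, M in masks.items():
    print(f"{name:10s} mask -> MC_total = {total_memory_capacity(A, M)}   (N = {N})")
```

All three masks should report $MC_{\mathrm{total}} = 8$, in line with the full-rank condition above.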

5. Neutrality Failure: Nonlinearity, Input Scaling, and Membrane Correlations

Emergent dependencies break memory capacity neutrality under certain perturbations.

  • Nonlinear RNNs and Arbitrary Tuning: In random nonlinear ESNs, memory capacity is governed solely by the input scale $\sigma$, which can tune $MC$ arbitrarily from $1$ (full saturation: memoryless) to $N$ (fully linear regime). Consequently, $MC$ loses discriminative power as a network property, becoming solely a function of the input statistics (Ballarin et al., 7 Feb 2025); see the simulation sketch after the summary table below.
  • Correlations in Reservoirs: In practical reservoir computing, pairwise neuron correlations $\rho > 0$, induced by shared input or network mixing, cause $MC(N)$ to grow sublinearly and saturate at $1/\rho$ rather than scale neutrally with $N$. This breaks the “one neuron = one memory bit” intuition except when $\rho \approx 0$ (Takasu et al., 28 Apr 2025).

| Scenario | Capacity Growth | Key Parameter |
|---|---|---|
| Independent neurons ($\rho = 0$) | Linear ($MC(N) = N$) | None (neutrality achieved) |
| Correlated neurons ($\rho > 0$) | Sublinear, saturating | Pairwise correlation $\rho$ |
| Nonlinear ESN, high input noise | $MC = 1$ (min) | Input scale $\sigma$ |
| Nonlinear ESN, low input noise | $MC = N$ (max) | Input scale $\sigma$ |
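
The input-scale effect summarized in the last two table rows can be reproduced with a short simulation. The sketch below drives a random $\tanh$ reservoir with i.i.d. Gaussian input at several scales $\sigma$ and estimates total memory capacity as the sum over lags of the $R^2$ of a ridge-regularized linear readout reconstructing $z_{t-\tau}$ from $x_t$. Reservoir size, lag range, sequence length, and ridge penalty are arbitrary illustrative choices, the additive noise term $\xi$ is omitted, and this finite-sample estimator only approximates the capacity defined in the cited work (the low-$\sigma$ estimate can undershoot $N$ because of the same conditioning issues discussed in Section 4).

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, max_lag = 20, 5000, 30

A = rng.normal(size=(N, N))
A *= 0.9 / np.max(np.abs(np.linalg.eigvals(A)))    # echo state condition
C = rng.normal(size=N)

def estimate_mc(sigma, ridge=1e-6):
    """Estimate MC(sigma) as the sum over lags tau of the R^2 of a ridge
    readout reconstructing z_{t-tau} from the reservoir state x_t."""
    z = rng.normal(scale=sigma, size=T)
    x = np.zeros((T, N))
    for t in range(1, T):
        x[t] = np.tanh(A @ x[t - 1] + C * z[t])    # noise term xi omitted
    X = x[max_lag:]
    X = (X - X.mean(axis=0)) / X.std(axis=0)       # standardize reservoir states
    G = X.T @ X + ridge * len(X) * np.eye(N)       # regularized Gram matrix
    mc = 0.0
    for tau in range(1, max_lag + 1):
        target = z[max_lag - tau:T - tau]
        w = np.linalg.solve(G, X.T @ target)
        mc += np.corrcoef(X @ w, target)[0, 1] ** 2
    return mc

for sigma in [1e-3, 1e-1, 1.0, 10.0]:
    print(f"sigma = {sigma:6g}   estimated MC ≈ {estimate_mc(sigma):5.2f}   (N = {N})")
```

With these settings the estimate should drift from near $N$ at small $\sigma$ toward small values at large $\sigma$, illustrating how the input statistics alone tune $MC$.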

6. Implications and Applications

  • Performance Benchmarking: In both paging and neural computation, the neutral regime provides settings for fair benchmarking—paging policies or neural architectures can be compared with minimal parameter-induced bias.
  • Algorithmic Design: For online memory-limited systems, neutrality supports generic logic at the algorithmic layer, with high-level resource control layered above.
  • Experimental Design: In neural and reservoir settings, understanding the origin and breakdown of neutrality guides architecture selection, input preprocessing (control of input statistics/correlations), and robust memory measurement methodologies.
  • Metrics for Model Selection: The breakdown of neutrality in nonlinear or correlated regimes emphasizes the need for more sophisticated, task-dependent, and geometry-aware memory measures, as classic MC becomes insensitive or arbitrarily tunable (Ballarin et al., 7 Feb 2025).

7. Open Problems and Advanced Directions

Current capacity metrics exhibit neutrality even in settings where richer computational or biological phenomena are expected, suggesting their incompleteness.

  • Task-Specific and Geometry-Informed Capacities: Future directions include designing measures that reflect task-specific temporal dependencies, functional complexity, or state-space topology.
  • Decorrelation and Architectural Interventions: Techniques that suppress neuronal correlations (sparse connectivity, orthogonal weight matrices, independent input perturbations) can be systematically employed to restore or approach neutrality in large-scale reservoirs (Takasu et al., 28 Apr 2025); a small comparison sketch follows this list.
  • Beyond Standard Metrics: Moving from reconstructive capacity towards measures based on nonlinear state observability or information geometry may yield discriminative metrics robust to trivial tuning.
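
As a concrete starting point for such interventions, the sketch below drives two reservoirs with the same scalar input stream and the same input vector, one built from a dense Gaussian weight matrix and one from an orthogonal matrix at the same spectral radius, and compares the mean absolute pairwise correlation of their states. Sizes, scalings, and input statistics are arbitrary illustrative choices, and the sketch is a measurement recipe in the spirit of Takasu et al. (28 Apr 2025) rather than a guaranteed demonstration that one construction dominates the other.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 100, 4000
z = rng.normal(size=T)          # shared scalar input stream
c = rng.normal(size=N)          # shared input vector

def mean_abs_pairwise_corr(A):
    """Run x_t = tanh(A x_{t-1} + c z_t) and return the mean absolute
    correlation over all distinct pairs of reservoir units."""
    x = np.zeros((T, N))
    for t in range(1, T):
        x[t] = np.tanh(A @ x[t - 1] + c * z[t])
    R = np.corrcoef(x[200:].T)                     # unit-by-unit correlations
    return np.mean(np.abs(R[~np.eye(N, dtype=bool)]))

# Dense Gaussian reservoir rescaled to spectral radius 0.9
A_dense = rng.normal(size=(N, N))
A_dense *= 0.9 / np.max(np.abs(np.linalg.eigvals(A_dense)))

# Orthogonal reservoir (QR factor of a Gaussian matrix), scaled to radius 0.9
Q, _ = np.linalg.qr(rng.normal(size=(N, N)))
A_orth = 0.9 * Q

print("mean |rho|, dense Gaussian reservoir:", mean_abs_pairwise_corr(A_dense))
print("mean |rho|, orthogonal reservoir    :", mean_abs_pairwise_corr(A_orth))
```

Feeding the measured mean correlation into the scaling formula of Section 1 then predicts where the effective memory of each construction saturates.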

Memory capacity neutrality thus remains a central touchstone, both as a benchmark for robustness and as a signal for deeper metric refinement in machine learning, computational neuroscience, and online algorithmics.
