Fading Memory Property (FMP) in Dynamical Systems

Updated 7 January 2026
  • Fading Memory Property (FMP) is a characteristic of causal, time-invariant systems where the influence of past inputs decays over time, ensuring stability and robust performance.
  • It is formalized using weighted norms and continuity in the product topology, linking contraction dynamics with convolution representations in both linear and nonlinear systems.
  • FMP underpins key architectures in reservoir computing, recurrent neural networks, and physical models, facilitating universal approximation and efficient signal processing.

The fading memory property (FMP) is a rigorously formalized feature of causal, time-invariant systems—both in discrete and continuous time—characterizing systems whose current outputs depend on their past inputs in such a way that the influence of remote past inputs decays, often exponentially, with temporal distance. FMP underpins foundational results in reservoir computing, state-space modeling, recurrent neural network theory, signal processing, and certain classes of physical transport models. Conceptually, it links topological continuity, contraction dynamics, kernel representations, and universality in nonlinear approximation. In linear systems, FMP is equivalent to the existence of a convolution representation with an ℓ¹-summable kernel. In nonlinear systems, it often manifests as continuity in the product topology on infinite sequence spaces or as explicit contractivity conditions.

1. Formal Definitions and Core Mathematical Structure

The formal definition of FMP is context-dependent but shares a common theme across digital signal processing, RNNs, functional analysis, and control:

  • Weighted Norm Definition (Discrete Time): Consider left-infinite input sequences $z=(\dots, z_{-2}, z_{-1}, z_0) \in (\mathbb{R}^n)^{\mathbb{Z}_{-}}$ and a weighting sequence $w:\mathbb{N}\rightarrow(0,1]$, i.e., a strictly decreasing sequence with $\lim_{t\to\infty}w_t=0$. The associated norm is $\|z\|_w = \sup_{t\leq 0}\|z_t\|\, w_{-t}$. A causal, time-invariant filter $U$ has FMP if

$$U: ((\mathbb{R}^n)^{\mathbb{Z}_{-}}, \|\cdot\|_w) \rightarrow ((\mathbb{R}^N)^{\mathbb{Z}_{-}}, \|\cdot\|_w)$$

is continuous for some weighting sequence $w$ (Grigoryeva et al., 2018).

  • Topological Characterization: On bounded input sets, all such weighted norms induce the product topology, so FMP is equivalent to continuity in the product topology and does not depend on the precise norm (Grigoryeva et al., 2018, Ortega et al., 2024).
  • Operator-Theoretic/Finite Window View: For any $\epsilon > 0$ there exists $L < \infty$ so that, for two input sequences $u, u'$ agreeing on the most recent $L$ steps, the outputs satisfy $\|y_t - y_t'\| < \epsilon$ (Zancato et al., 2024); a numerical sketch of this finite-window check appears after this list.
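
As a purely illustrative check of the finite-window view (a minimal sketch; the filter, decay rate, and bound below are assumptions made for the example, not taken from the cited papers), consider the causal filter $y_t = \tanh\big(\sum_{k\geq 0} a^k u_{t-k}\big)$ with $0 < a < 1$: inputs older than $L$ steps can change the current output by at most $2a^L/(1-a)$, which can be made smaller than any $\epsilon$.

```python
import numpy as np

# Illustrative causal filter y_t = tanh(sum_k a**k * u_{t-k}), 0 < a < 1.
# If two inputs bounded by 1 agree on the most recent L steps, the outputs
# differ by at most 2 * a**L / (1 - a).

a, T, L = 0.7, 400, 40

def output_now(u):
    """Current output given the finite past u (u[-1] is the newest sample)."""
    k = np.arange(len(u))[::-1]        # k = 0 for the most recent input
    return float(np.tanh(np.sum(a ** k * u)))

rng = np.random.default_rng(0)
u1 = rng.uniform(-1, 1, size=T)
u2 = u1.copy()
u2[:-L] = rng.uniform(-1, 1, size=T - L)   # alter everything except the last L inputs

gap = abs(output_now(u1) - output_now(u2))
print(gap, 2 * a ** L / (1 - a))           # observed gap vs. the fading-memory bound
```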

FMP is central in the formal theory of RNNs, where it is equivalent (under compactness) to the echo-state property (ESP) and both state-forgetting and input-forgetting properties (Ortega et al., 26 Aug 2025). In kernel methods, FMP corresponds to continuity in a suitable weighted $L^p$ norm on the space of past inputs (Huo et al., 2024).

2. Analytical and Topological Hierarchies of Fading Memory

FMP admits a precise hierarchy:

  • Minimal Continuity: The system’s response to a localized impulse in the input is continuous (Ortega et al., 2024).
  • Minimal FMP: Truncating the input history in the far past yields convergence to the full output, i.e., “old” input entries can be neglected without affecting the result (Ortega et al., 2024).
  • Weighted-norm FMP: There exists a weighting sequence so that the output operator is continuous with respect to the weighted norm.
  • Product FMP: Strongest; continuity in the product topology, equivalent to fading memory in all weighted norms for bounded inputs.

In the linear, time-invariant case, these notions lead to a convolution theorem: FMP is necessary and sufficient for the system to admit a representation by a convolution kernel that is absolutely summable (on appropriate spaces). In particular, for finite-dimensional output spaces, linearity together with FMP is equivalent to the existence of a proper convolution representation with an ℓ¹-summable kernel (Ortega et al., 2024).
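
As a concrete, deliberately simple illustration of one direction of this equivalence, the sketch below builds a causal LTI filter with a geometrically decaying, hence ℓ¹-summable, kernel and checks numerically that perturbing only the remote past barely moves the current output. The kernel, truncation length, and input range are arbitrary choices for the example.

```python
import numpy as np

# Causal LTI filter with l^1-summable kernel h_k = rho**k, |rho| < 1,
# applied to a left-infinite input truncated to its most recent T samples.

rho, T = 0.8, 200
h = rho ** np.arange(T)                    # absolutely summable kernel

def filter_output(u):
    """y_0 = sum_k h_k * u_{-k}: the weight on u_{-k} decays like rho**k."""
    return float(h @ u[::-1][:T])          # u[-1] is the most recent sample

rng = np.random.default_rng(0)
u = rng.uniform(-1, 1, size=T)
u_perturbed = u.copy()
u_perturbed[:T // 2] = rng.uniform(-1, 1, size=T // 2)   # change only the remote past

# The outputs can differ by at most 2 * rho**(T/2) / (1 - rho), a tiny number here.
print(abs(filter_output(u) - filter_output(u_perturbed)))
```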

3. Dynamical Systems, Contractivity, and Echo-State Networks

FMP is fundamentally linked to contractivity in dynamical systems:

  • Reservoir Computing and ESNs: A discrete-time reservoir system $x_t = F(x_{t-1}, z_t)$, $y_t = h(x_t)$ exhibits FMP if the reservoir map is a contraction in the state variable. This implies existence and uniqueness of solutions (the ESP) and continuity in the product topology (Gonon et al., 2020, Grigoryeva et al., 2018, Ortega et al., 26 Aug 2025, Grigoryeva et al., 2019).
  • Universality: Echo State Networks (ESNs) with ESP and FMP form a universal approximating class for all discrete-time fading-memory filters on bounded input spaces: for any such filter and any $\epsilon > 0$, there exists an ESN whose induced filter is $\epsilon$-close in the supremum norm and has both ESP and FMP (Grigoryeva et al., 2018, Gonon et al., 2020).
  • Spectral Characterization: For linear state-space models $x_{t+1}=A x_t + B u_t$, FMP holds if and only if $\rho(A)<1$ (Schur stability) (Zancato et al., 2024); a small numerical illustration follows this list.
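
The following minimal sketch illustrates the contraction-to-FMP link numerically (assumptions: a random tanh reservoir whose recurrent matrix is rescaled so its largest singular value is below 1; sizes, scaling, and seeds are arbitrary): two different initial states driven by the same input sequence converge, i.e., the reservoir forgets its initial condition.

```python
import numpy as np

# Echo state network x_t = tanh(W x_{t-1} + W_in z_t). Rescaling W so that its
# largest singular value is 0.9 < 1 makes the state map a contraction (tanh is
# 1-Lipschitz), a sufficient condition for the echo-state / fading memory properties.

rng = np.random.default_rng(1)
n, m = 100, 3
W = rng.normal(size=(n, n))
W *= 0.9 / np.linalg.norm(W, 2)             # largest singular value -> 0.9
W_in = rng.normal(size=(n, m))

def run(x0, inputs):
    x = x0
    for z in inputs:
        x = np.tanh(W @ x + W_in @ z)
    return x

inputs = rng.uniform(-1, 1, size=(300, m))
x_a = run(rng.normal(size=n), inputs)        # two arbitrary initial states,
x_b = run(rng.normal(size=n), inputs)        # driven by the same inputs

# The gap shrinks at least like 0.9**t: the reservoir forgets its initial state
# (the state-forgetting face of the fading memory property).
print(np.linalg.norm(x_a - x_b))
```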

In recurrent neural architectures, the presence of FMP is controlled through the spectral radius of the residual Jacobian or recurrent matrices, with Lyapunov exponents providing a quantitative measure of memory retention or decay (Dubinin et al., 2023).

4. Stochastic and Physical Systems: Fading Memory in Decision, Reservoir, and PDE Models

  • Stochastic Sequential Decision Processes: FMP appears as exponential decay in memory traces (e.g., the agent's reward memory in reinforcement with fading memory, where past reward salience vanishes at an exponential rate), influencing optimal policy structure in the limit $\mu\to 0$ (large memory span) (Xu et al., 2019).
  • Quantum and Nonlinear Reservoirs: Open quantum reservoirs exhibit FMP when the trace distance between two output states associated with inputs differing only in the remote past decays uniformly to zero, enforced via contraction conditions on the family of input-dependent quantum channels (Götting et al., 26 Jan 2025).
  • Viscoelasticity and PDEs: In memory-driven PDEs such as Timoshenko beams, FMP is characterized by decay estimates on the memory kernel (e.g., $\mu(t+s) \leq C e^{-\delta t}\mu(s)$); this gives equivalence between FMP and exponential stability of the generated semigroup (Conti et al., 2013); a small numerical check of this decay condition for a sample kernel appears after this list. In contrast, non-fading memory kernels violate this property and lead to fundamentally different dynamical behaviors, such as oscillatory relaxation rather than monotone decay (L et al., 2019).
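
As a small sanity check of the decay condition (the kernel $\mu(s) = (1+s)e^{-s}$ and the constants $C$, $\delta$ below are illustrative choices, not taken from the cited work), the inequality can be verified numerically on a grid:

```python
import numpy as np

# Check mu(t + s) <= C * exp(-delta * t) * mu(s) on a grid for the sample
# kernel mu(s) = (1 + s) * exp(-s) with candidate constants delta = 0.5, C = 1.25.

mu = lambda s: (1.0 + s) * np.exp(-s)
delta, C = 0.5, 1.25

t = np.linspace(0.0, 50.0, 501)[:, None]    # decay variable (rows)
s = np.linspace(0.0, 50.0, 501)[None, :]    # kernel argument (columns)

print(bool(np.all(mu(t + s) <= C * np.exp(-delta * t) * mu(s))))   # True
```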

5. Nonparametric and Signature-Based Representations

Advances in kernel methods and rough path theory provide FMP-enforcing representations:

  • Kernel Regression Formulation: Memory functionals defined on spaces of past inputs with exponentially decaying weights admit universal kernel approximators (typically in RKHSs), provided the set of admissible pasts is compact in the weighted norm. Causality is enforced structurally by the choice of functional domain (Huo et al., 2024, Ortega et al., 2024); a toy version of this construction is sketched after this list.
  • Exponentially Fading Memory Signatures: The exponentially fading memory (EFM) signature defines a pathwise feature map, with each term formed as a weighted Stratonovich integral whose exponential weight ensures that the influence of the remote past is exponentially suppressed. The EFM-signature supports universal approximation: any continuous FMP functional can be uniformly approximated by a (finite) linear functional of the EFM-signature (Jaber et al., 4 Jul 2025).
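
A toy version of the kernel-regression formulation (all choices here, including the target functional, window length, decay rate, kernel, and regularization, are illustrative assumptions rather than the construction of the cited papers): the most recent inputs are collected into exponentially weighted windows, so remote inputs enter the feature map with exponentially small weight, and a Gaussian-kernel ridge regression is fit on top.

```python
import numpy as np

rng = np.random.default_rng(2)
L, lam, decay, gamma = 30, 1e-3, 0.8, 1.0

def target(u):
    """Example fading-memory functional: a squashed, exponentially discounted sum."""
    k = np.arange(len(u))[::-1]                 # k = 0 for the most recent input
    return float(np.tanh(np.sum(0.7 ** k * u)) ** 2)

def windows(u, L):
    """Feature map: phi_t[k] = decay**k * u_{t-k}, k = 0..L-1."""
    X = np.zeros((len(u) - L, L))
    for i in range(L, len(u)):
        X[i - L] = decay ** np.arange(L) * u[i - L + 1:i + 1][::-1]
    return X

u = rng.uniform(-1, 1, size=400)
X = windows(u, L)
y = np.array([target(u[:i + 1]) for i in range(L, len(u))])

# Gaussian-kernel ridge regression on the weighted windows.
K = np.exp(-gamma * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
print(float(np.abs(K @ alpha - y).mean()))      # small in-sample error on this toy task
```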

6. Architectural Realizations and Modulation of Memory Span

Contemporary architectures combine fading memory with other forms of memory to trade off efficiency and expressivity:

  • State Space Models (SSMs) and Hybrids: Systems such as B’MOJO use input-varying, Schur-stable state-transition matrices to realize FMP, ensuring exponentially decaying sensitivity to history. The span of the fading memory can be tuned by the spectrum of these matrices. Eidetic (non-fading) memory mechanisms are introduced to patch "important" tokens or subsequences, yielding hybrid architectures (Zancato et al., 2024); a sketch of the Schur-stability constraint appears after this list.
  • Gated KalmaNet (GKA): GKA uses test-time ridge regression over the full input history, imposing fading memory through adaptive regularization and input-dependent gating, allowing explicit and tunable control of the memory decay profile (Peng et al., 26 Nov 2025). Chebyshev iteration ensures scalability with provable bounds on the condition number and effective memory span.
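
A minimal sketch of the Schur-stability constraint behind such designs (assumptions: a randomly parameterized, input-linear transition matrix rescaled by its operator norm; this illustrates the general mechanism, not the B’MOJO or GKA implementation):

```python
import numpy as np

# Input-varying linear recurrence x_t = A(u_t) x_{t-1} + B u_t. Rescaling each
# A(u) so its largest singular value stays below rho_max < 1 is a simple
# sufficient condition for Schur stability and exponentially fading memory:
# the contribution of an input k steps in the past is bounded by rho_max**k * ||B u||.

rng = np.random.default_rng(3)
n, m, rho_max = 16, 4, 0.95

A_base = rng.normal(size=(n, n)) / np.sqrt(n)
A_mod = rng.normal(size=(m, n, n)) / np.sqrt(n)
B = rng.normal(size=(n, m))

def transition(u):
    """Input-dependent transition matrix, clamped to operator norm <= rho_max."""
    A = A_base + np.tensordot(u, A_mod, axes=1)    # A(u): affine in the input
    s = np.linalg.norm(A, 2)                       # largest singular value
    return A * (rho_max / s) if s > rho_max else A

x = np.zeros(n)
for _ in range(200):
    u = rng.uniform(-1, 1, size=m)
    x = transition(u) @ x + B @ u                  # bounded inputs -> bounded state

print(np.linalg.norm(x))                           # stays bounded as t grows
```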

A summary table of typical FMP criteria for select systems:

| Class | FMP Criterion (Abstract) | Reference |
| --- | --- | --- |
| Linear SSM (LTI) | Spectral radius $\rho(A) < 1$ | (Zancato et al., 2024) |
| Reservoir System | Reservoir map is a contraction in the state | (Grigoryeva et al., 2018, Grigoryeva et al., 2019) |
| Quantum Reservoir | Lipschitz contraction $L < 1$ for channels | (Götting et al., 26 Jan 2025) |
| Kernel Method | Continuity in weighted $L^2$ norm on input history | (Huo et al., 2024) |
| EFM-signature | Exponential suppression in tensorized rough-path integrals | (Jaber et al., 4 Jul 2025) |

7. Broader Implications, Applications, and Counterexamples

FMP serves as a foundational property enabling universal approximation by ESNs, kernel functionals, and signature-based representations, and it undergirds stability analysis in physical and quantum systems. Failure of FMP (e.g., in non-fading memory kernels) leads to non-decaying sensitivity to remote inputs and distinct system behavior, such as persistent oscillations (L et al., 2019).

In modern machine learning, explicit modulation of memory fading is now an architectural design axis: hybrid models balance SSM-style efficiency (guaranteeing FMP) with recency-independent mechanisms for long-range recall (Zancato et al., 2024, Peng et al., 26 Nov 2025). Extremely slow decay or loss of contractivity breaks FMP and, in recurrent networks, yields nonuniqueness, instability, or persistent memory, which is often undesirable in sequence processing tasks (Ortega et al., 26 Aug 2025).

FMP thus provides a universal mathematical language for analyzing, controlling, and designing systems across dynamical systems theory, signal processing, recurrent networks, quantum computing, and PDE modeling. Its equivalence to topological continuity under weak assumptions, connection to convolution representations, and compatibility with universal approximation theorems make it central to both theoretical and applied aspects of time-series analysis and recurrent computation.
