Threshold Power-Law RNN Dynamics
- The paper demonstrates that threshold power-law RNNs with p≠1 exhibit inherent scale invariance, so that dynamical behavior is identical up to amplitude rescaling regardless of coupling strength g.
- The methodology employs variable rescaling and dynamic mean field theory to systematically compare dynamics across threshold power-law, ReLU (p=1), and sigmoidal RNN models.
- The findings offer practical design guidelines for reservoir computing, ensuring consistent trainability by appropriately rescaling network inputs and outputs.
Threshold power-law recurrent neural networks (RNNs) constitute a class of dynamical neural network models characterized by a transfer function in which unit activations are zero below a specified threshold and rise as a power law above this threshold. These networks, relevant for both reservoir computing and broader recurrent neural modeling, display distinctive dynamical invariances and bifurcations depending on the power-law exponent and the scaling of their recurrent coupling. The qualitative independence of their dynamics and learning performance from coupling strength (with critical exceptions) distinguishes them from classical sigmoidal and rectified linear unit (ReLU) RNNs, thereby providing new perspectives and practical guarantees for the design of untrained reservoirs in machine learning and theoretical neuroscience (Nicola, 30 Nov 2025).
1. Model Formulation and Threshold Power-Law Transfer Functions
The canonical threshold power-law RNN is described for N units with internal states x_i(t), where the dynamics evolve as

dx_i/dt = -x_i + g Σ_j J_ij φ(x_j),

with g the global coupling constant and J_ij a random recurrent connectivity matrix. The transfer function is

φ(x) = [x - θ]_+^p,

with threshold θ and power p > 0. Discrete-time analogues incorporate a leak rate α:

x_i(t+1) = (1 - α) x_i(t) + α g Σ_j J_ij φ(x_j(t)).
The focus is primarily on the continuous-time model. This class encompasses sublinear (0 < p < 1), linear (ReLU, p = 1), and supralinear (p > 1) threshold nonlinearities, allowing systematic exploration of their dynamical effects (Nicola, 30 Nov 2025).
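As a concrete illustration, the continuous-time model can be integrated with a simple Euler scheme. The sketch below is illustrative only: the function names, network size, time step, and parameter values are choices made here, not taken from the paper.

```python
import numpy as np

def phi(x, theta=0.0, p=0.5):
    """Threshold power-law transfer function: zero below the threshold
    theta, power-law growth with exponent p above it."""
    return np.maximum(x - theta, 0.0) ** p

def simulate(J, g, x0, steps=200, dt=0.1, theta=0.0, p=0.5):
    """Euler-integrate dx_i/dt = -x_i + g * sum_j J_ij * phi(x_j)."""
    x = x0.copy()
    traj = np.empty((steps, x.size))
    for t in range(steps):
        x = x + dt * (-x + g * (J @ phi(x, theta, p)))
        traj[t] = x
    return traj

rng = np.random.default_rng(0)
N = 100
# Random Gaussian connectivity with variance 1/N (the standard scaling).
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
traj = simulate(J, g=1.5, x0=rng.normal(size=N))
```

With a sublinear exponent (p = 0.5 here), the activity remains bounded for any coupling strength, consistent with the scale-invariance results described below.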
2. The Coupling Constant as a Scale Parameter
For all exponents p ≠ 1, the coupling constant g is strictly a scale parameter for the system, not influencing qualitative network dynamics. This follows from the homogeneous scaling property of the threshold power-law function: under the change of variables y_i = g^{1/(p-1)} x_i (with the threshold rescaled as θ → g^{1/(p-1)} θ), the dynamics become

dy_i/dt = -y_i + Σ_j J_ij φ(y_j),

independent of g. As a result, all dynamical solutions (fixed points, periodic orbits, and chaotic attractors) are mapped onto each other by amplitude rescaling as g varies. The qualitative geometry in state space and all stability properties remain invariant across g (Nicola, 30 Nov 2025).
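This invariance can be verified numerically. In the sketch below (zero threshold, so no threshold rescaling is needed; all parameter values are illustrative), two networks with the same connectivity but different couplings g1 and g2 produce trajectories that are amplitude-rescaled copies of each other:

```python
import numpy as np

def phi(x, p):
    # Zero-threshold power-law nonlinearity.
    return np.maximum(x, 0.0) ** p

def step(x, J, g, p, dt=0.1):
    # One Euler step of dx/dt = -x + g * J @ phi(x).
    return x + dt * (-x + g * (J @ phi(x, p)))

rng = np.random.default_rng(1)
N, p = 100, 0.5
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))

g1, g2 = 0.5, 4.0
c = (g2 / g1) ** (1.0 / (1.0 - p))  # predicted amplitude ratio

x1 = rng.normal(size=N)
x2 = c * x1                         # amplitude-rescaled initial condition
for _ in range(100):
    x1 = step(x1, J, g1, p)
    x2 = step(x2, J, g2, p)

# The g2 trajectory is the g1 trajectory scaled by c = (g2/g1)^(1/(1-p)).
assert np.allclose(x2, c * x1, rtol=1e-6)
```

Because the Euler map inherits the homogeneity of the transfer function, the relation holds step by step, not just asymptotically.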
3. Dynamics, Chaos, and Absence of "Edge of Chaos" Tuning
Dynamic mean field theory (Kadmon & Sompolinsky, 2015; Omri et al., 2018) establishes that for p < 1, chaotic trajectories exist for arbitrarily small nonzero g in the limit N → ∞. In threshold power-law RNNs (p ≠ 1), due to scale invariance, any chaotic trajectory x_{g_0}(t) at some coupling g_0 can be mapped to a corresponding chaotic trajectory at any g > 0 by the relation

x_g(t) = (g/g_0)^{1/(1-p)} x_{g_0}(t).

The Lyapunov spectrum, including the maximal Lyapunov exponent λ_max, remains constant for all g > 0. There is consequently no "edge of chaos" bifurcation with respect to g, and chaos can be neither tamed nor forced by tuning g in this regime. All dynamical stability types are preserved up to amplitude scaling (Nicola, 30 Nov 2025).
4. The Singular ReLU Case: p = 1
When the power-law exponent p = 1, producing the ReLU transfer function φ(x) = [x - θ]_+, the scale transformation that removes g becomes singular (the rescaling exponent 1/(p-1) diverges). In this scenario, the system

dx_i/dt = -x_i + g Σ_j J_ij [x_j - θ]_+

explicitly retains the coupling strength, and tuning g induces genuine bifurcations between dynamical regimes. As g passes through critical values, the system transitions between quiescent states, periodic orbits, and chaos. Here, both the untrained network's dynamics and the convergence/performance of training algorithms are directly dependent on the value of g. The ReLU case is thus exceptional within the threshold power-law family (Nicola, 30 Nov 2025).
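The contrast can be observed numerically: with a ReLU nonlinearity, the same network decays to quiescence at weak coupling but sustains (and here grows) activity at strong coupling, so g changes the regime rather than merely the scale. A sketch with illustrative parameter values:

```python
import numpy as np

def relu_step(x, J, g, dt=0.1):
    # One Euler step of dx/dt = -x + g * J @ relu(x), zero threshold.
    return x + dt * (-x + g * (J @ np.maximum(x, 0.0)))

rng = np.random.default_rng(2)
N = 100
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
x0 = rng.normal(size=N)

x_low, x_high = x0.copy(), x0.copy()
for _ in range(150):
    x_low = relu_step(x_low, J, g=0.2)    # weak coupling: activity decays
    x_high = relu_step(x_high, J, g=3.0)  # strong coupling: activity persists

norm_low, norm_high = np.linalg.norm(x_low), np.linalg.norm(x_high)
# Unlike the p != 1 case, changing g changes the dynamical regime.
assert norm_low < 0.1 * np.linalg.norm(x0)
assert norm_high > 100 * norm_low
```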
5. Comparison with Classical Sigmoidal RNNs
Traditional RNNs with sigmoidal activations (e.g., φ(x) = tanh(x)) do not allow elimination of g via variable rescaling, as the activation is not homogeneous of any degree. The system,

dx_i/dt = -x_i + g Σ_j J_ij tanh(x_j),

exhibits qualitatively distinct dynamical regimes as g varies: for g < 1 the network is quiescent, for g > 1 it becomes chaotic, and the "edge of chaos" near g = 1 is empirically favored for reservoir computing. Consequently, training metrics, memory capacity, and algorithmic stability hinge nontrivially on precise tuning of g in sigmoidal networks, in marked contrast with the scale-invariant regime of threshold power-law RNNs for p ≠ 1 (Nicola, 30 Nov 2025).
6. Theoretical and Practical Implications for Training
A central result states that if a threshold power-law reservoir (p ≠ 1) can be successfully trained at a single coupling g to approximate any supervised signal (using arbitrary encoder and decoder), there exist rescaled encoders and decoders for any g′ > 0 guaranteeing identical accuracy. Explicitly, the mapping

E → (g′/g)^{1/(1-p)} E,  D → (g′/g)^{1/(p-1)} D

of encoder E and decoder D preserves the test loss. Thus, if chaos is tamed (training converges) for one g, it is universally possible for all g > 0, modulo trivial output/input scaling. This result guarantees "no-tuning" of g in sub-ReLU (p < 1) threshold power-law reservoir implementations. Empirical evidence from training FORCE-based networks on oscillator and chaotic targets corroborates this theoretical invariance, with test errors remaining constant across a wide range of g after rescaling (Nicola, 30 Nov 2025).
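A hedged sketch of this rescaling argument for a linear readout (zero threshold, ridge regression in place of FORCE training, all names and parameter values illustrative): a decoder trained on the g1 reservoir, rescaled by (g2/g1)^{1/(p-1)}, achieves the same loss on the g2 reservoir.

```python
import numpy as np

def phi(x, p=0.5):
    return np.maximum(x, 0.0) ** p    # zero-threshold power law

def run(J, g, x0, steps=150, dt=0.1, p=0.5):
    X, x = np.empty((steps, x0.size)), x0.copy()
    for t in range(steps):
        x = x + dt * (-x + g * (J @ phi(x, p)))
        X[t] = x
    return X

rng = np.random.default_rng(4)
N, p = 100, 0.5
J = rng.normal(0.0, 1.0 / np.sqrt(N), size=(N, N))
g1, g2 = 1.0, 4.0
c = (g2 / g1) ** (1.0 / (1.0 - p))   # state amplitude ratio between reservoirs

x0 = rng.normal(size=N)
X1 = run(J, g1, x0)                  # reservoir states at g1
X2 = run(J, g2, c * x0)              # rescaled "encoder": initial state scaled by c

target = np.sin(0.1 * np.arange(150))
# Ridge-regression readout trained only on the g1 reservoir.
w1 = np.linalg.solve(X1.T @ X1 + 1e-6 * np.eye(N), X1.T @ target)
w2 = w1 / c                          # rescaled decoder, w2 = (g2/g1)^(1/(p-1)) * w1

err1 = np.mean((X1 @ w1 - target) ** 2)
err2 = np.mean((X2 @ w2 - target) ** 2)
assert np.isclose(err1, err2, rtol=1e-3, atol=1e-8)  # same loss after rescaling
```

No retraining occurs at g2; the decoder is obtained purely by the amplitude rescaling, mirroring the invariance claim.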
7. Summary and Broader Significance
Threshold power-law RNNs with p ≠ 1 exhibit pure scale invariance in their coupling parameter g, leading to invariant dynamical structure and trainability under amplitude rescaling. This property sharply contrasts with both ReLU and sigmoidal RNNs, in which g controls stability, chaos, and computational capacity. These findings refine the theoretical understanding of non-sigmoidal RNNs and provide practical design guarantees, especially for reservoir computing, simplifying hyperparameter selection and removing the need to tune the global coupling strength except in the ReLU (p = 1) regime (Nicola, 30 Nov 2025). A plausible implication is substantial robustness and efficiency for sub-ReLU threshold power-law reservoir architectures in large-scale, untrained recurrent network applications.