Local E-Prop: Synapse-Local Learning
- Local E-Prop is a synapse-local learning rule that approximates true gradient descent by using eligibility traces for efficient credit assignment.
- It propagates learning signals forward in time and across layers, enabling online, real-time updates without global backward passes.
- By reducing memory and computational requirements compared to BPTT and RTRL, local E-Prop facilitates scalable training in deep recurrent architectures.
Local E-Prop (eligibility propagation) is an online, synapse-local learning rule for recurrent neural networks (RNNs) and deep recurrent architectures that approximates the true gradient as computed by Backpropagation Through Time (BPTT). Unlike BPTT, which requires global, memory-intensive, and biologically implausible backward passes through temporal (and, in deep stacks, hierarchical) structure, local E-prop propagates credit assignment information both forward in time and across depth using synapse-wise eligibility traces and local learning signals. This approach keeps per-step compute comparable to BPTT while avoiding BPTT's stored activation history, and it remains causal, scalable, and implementable in real time, supporting learning in deep recurrent systems without an explicit backward phase (Millidge, 30 Dec 2025).
1. Problem Formulation and Biological Rationale
The canonical recurrent learning problem is to optimize a loss $\mathcal{L}$ over network outputs generated by an RNN (or deep stack of RNN layers) across $T$ steps. BPTT, while exact, requires storage of all intermediate activations and a sequential backward replay, which is biologically implausible due to non-causality and high memory cost ($\mathcal{O}(TN)$, with $N$ the total state size). Real-Time Recurrent Learning (RTRL) avoids backward passes but at prohibitive compute per step ($\mathcal{O}(N^2 P)$ for $P$ parameters, i.e. $\mathcal{O}(N^4)$ with dense recurrence) by maintaining and updating high-order sensitivity tensors, making it impractical for large networks (Millidge, 30 Dec 2025). E-prop refines RTRL: it replaces total derivatives with partials, collapsing the sensitivity tensor into one scalar per parameter. This reduces complexity to $\mathcal{O}(P)$ per step, comparable to BPTT, while remaining fully online and local in both space and time.
2. Formal Framework: Eligibility Traces and Local Learning Signals
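Local E-prop introduces eligibility traces for every synapse, which integrate the influence of past network states on synaptic parameters using partial derivatives. For a network with layers indexed by $l$, neuron states $h^{l}_{t}$, weights $W^{l}$, and learning signals $L^{l}_{t}$, the recursion defining the eligibility trace $e^{l}_{t}$ can be written, in the notation used here, as:

$$e^{l}_{t} = \frac{\partial h^{l}_{t}}{\partial W^{l}} + \frac{\partial h^{l}_{t}}{\partial h^{l}_{t-1}}\, e^{l}_{t-1} + \frac{\partial h^{l}_{t}}{\partial h^{l-1}_{t}}\, e^{l-1}_{t}$$

The first term tracks direct feed-in sensitivity to the weight at time $t$; the second term propagates recurrent temporal credit within the layer; the third term propagates hierarchical credit from layer $l-1$ (Millidge, 30 Dec 2025).
In layerwise, matrix notation:

$$E^{l}_{t} = D^{l}_{t} + J^{l}_{t}\, E^{l}_{t-1} + H^{l}_{t}\, E^{l-1}_{t}$$

where $D^{l}_{t} = \partial h^{l}_{t} / \partial W^{l}$, $J^{l}_{t} = \partial h^{l}_{t} / \partial h^{l}_{t-1}$, and $H^{l}_{t} = \partial h^{l}_{t} / \partial h^{l-1}_{t}$.
Each synapse is updated at each time step using:

$$\Delta W^{l}_{ij} = -\eta\, L^{l}_{i,t}\, e^{l}_{ij,t}$$

where $\eta$ is the learning rate and $e^{l}_{ij,t}$ is the trace component for the synapse from presynaptic neuron $j$ to postsynaptic neuron $i$.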
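The trace recursion and per-step update can be sketched for a single layer. This is a minimal illustration, not the authors' implementation: it assumes a leaky tanh unit whose only local temporal Jacobian is the leak factor `alpha`, and the names `eprop_step`, `eprop_update`, `eps_W`, and `eps_U` are hypothetical. The learning signal is passed in as a plain vector.

```python
import numpy as np

def eprop_step(W, U, s, z, eps_W, eps_U, x, alpha=0.9):
    """One forward step of a leaky tanh layer plus its eligibility traces.

    s: internal state (n,), z: previous output (n,), x: current input (m,)
    eps_W: (n, n) filtered presynaptic recurrent activity
    eps_U: (n, m) filtered presynaptic input activity
    """
    # Assumed dynamics: s_t = alpha * s_{t-1} + W z_{t-1} + U x_t
    s_new = alpha * s + W @ z + U @ x
    z_new = np.tanh(s_new)
    # Temporal term (alpha * eps) plus direct term (presynaptic activity).
    eps_W = alpha * eps_W + z[None, :]
    eps_U = alpha * eps_U + x[None, :]
    # Eligibility trace: local output derivative times the filtered input.
    d = 1.0 - z_new ** 2
    e_W = d[:, None] * eps_W
    e_U = d[:, None] * eps_U
    return s_new, z_new, eps_W, eps_U, e_W, e_U

def eprop_update(W, e_W, learning_signal, lr=1e-2):
    """Three-factor update: dW_ij = -lr * L_i * e_ij."""
    return W - lr * learning_signal[:, None] * e_W
```

Only quantities from the current step are kept: the traces are overwritten in place of any stored activation history, which is what makes the rule online.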
3. Temporal and Hierarchical Credit Assignment
Classic E-prop algorithms focused on single recurrent layers, yielding eligibility traces that only carry temporal credit. Local E-prop generalizes to deep architectures by extending the eligibility trace recursion across both time and depth. In the unrolled (layer, time) lattice, the pathwise influence of each parameter on the loss is decomposed into direct partials and recurrences spanning adjacent layers (hierarchical) and timesteps (temporal) (Millidge, 30 Dec 2025). Unlike BPTT, which requires explicit backward traversal, local E-prop efficiently forwards information using traces, with no need for global activation storage or reversed computation.
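A toy example may help make the (layer, time) lattice concrete. The sketch below assumes a two-layer network with one scalar unit per layer; the function name `forward_traces` and the weights `w1`, `w2`, `v` are illustrative only. With a single unit per layer the forward recursion is exact, so the trace can be checked against a finite-difference gradient. In wider networks local E-prop keeps only the local partials and is therefore an approximation.

```python
import numpy as np

def forward_traces(w1, w2, v, xs):
    """Two-layer scalar RNN; carries d h1_t / d w1 and d h2_t / d w1 forward in time."""
    h1 = h2 = 0.0
    e1 = e2 = 0.0
    for x in xs:
        h1_new = np.tanh(w1 * h1 + x)
        d1 = 1.0 - h1_new ** 2
        # Layer 1: direct term (d1 * h1_{t-1}) + temporal term (d1 * w1 * e1_{t-1}).
        e1 = d1 * h1 + d1 * w1 * e1
        h2_new = np.tanh(w2 * h2 + v * h1_new)
        d2 = 1.0 - h2_new ** 2
        # Layer 2: temporal term within the layer + hierarchical term from layer 1.
        # There is no direct term because w1 is not a layer-2 weight.
        e2 = d2 * w2 * e2 + d2 * v * e1
        h1, h2 = h1_new, h2_new
    return h2, e2

xs = [0.5, -0.3, 0.8, 0.1]
h2_final, trace = forward_traces(0.7, -0.4, 1.2, xs)

# Central finite difference of the final layer-2 state with respect to w1.
step = 1e-6
fd = (forward_traces(0.7 + step, -0.4, 1.2, xs)[0]
      - forward_traces(0.7 - step, -0.4, 1.2, xs)[0]) / (2 * step)
print(abs(fd - trace) < 1e-6)
```

No activations are stored and nothing is replayed backward: the two scalars `e1` and `e2` are the only state carried between steps.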
4. Algorithmic Implementation and Synapse-local Computation
All computations required for local E-prop per parameter are strictly local. Every update at a synapse uses only:
- The local eligibility trace,
- The neuron’s own activation and local Jacobians,
- The hierarchical trace from the previous layer,
- The top-down learning signal.
No backward traversal or non-local communication is invoked beyond the one-layer neighborhood. All loops over time and (if computation is non-sequential) layers can be parallelized. The content of the per-step algorithm remains the same for generic RNNs, LSTM networks (with LSTM-specific gating eligibility derivatives), and deep stacked networks, with appropriate Jacobian evaluation (Millidge, 30 Dec 2025, Hoyer et al., 2022).
5. Comparison to BPTT, RTRL, Classical E-Prop, and Variants
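| Method | Time per Step | Memory | Locality |
|---|---|---|---|
| BPTT | $\mathcal{O}(P)$ | $\mathcal{O}(TN)$ | Non-local (global backward pass) |
| RTRL | $\mathcal{O}(N^2 P)$ | $\mathcal{O}(NP)$ | Local in time, not in space |
| E-prop/local | $\mathcal{O}(P)$ | $\mathcal{O}(P)$ | Local in time and space |

$P$: parameters, $N$: neurons, $T$: sequence length; entries assume dense recurrent connectivity, so $P \sim N^2$ (Millidge, 30 Dec 2025, Martín-Sánchez et al., 2022, Hoyer et al., 2022).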
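To give a feel for how these scalings diverge, the following sketch tabulates order-of-magnitude counts. It assumes the standard textbook scalings for a fully connected recurrent layer (so the parameter count is the square of the neuron count) and drops all constants; `per_step_costs` is a hypothetical helper, not something defined in the cited works.

```python
def per_step_costs(n_neurons, seq_len):
    """Order-of-magnitude counts for a dense recurrent layer (constants dropped)."""
    p = n_neurons ** 2  # recurrent parameters of a fully connected layer
    return {
        # BPTT: one pass over the weights per step, but stores every state.
        "BPTT": {"time": p, "memory": seq_len * n_neurons},
        # RTRL: updates an (N x P) sensitivity matrix at every step.
        "RTRL": {"time": n_neurons ** 2 * p, "memory": n_neurons * p},
        # E-prop: one scalar trace per parameter, independent of sequence length.
        "E-prop": {"time": p, "memory": p},
    }

costs = per_step_costs(n_neurons=100, seq_len=1000)
for method, c in costs.items():
    print(method, c["time"], c["memory"])
```

The point of the comparison is that only BPTT's memory grows with sequence length, while only RTRL's per-step time grows faster than the parameter count.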
E-prop (symmetrical variant) maintains the three-factor Hebbian structure (eligibility, learning signal, plasticity coefficient). In the most local form (“random feedback local E-prop”), the feedback for the learning signal at each unit is supplied by a fixed, random matrix rather than a symmetric transpose, further enhancing spatial locality (Liu et al., 7 Jun 2025).
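The difference between the symmetric and random-feedback variants comes down to which matrix carries the output error back to each unit. A minimal sketch, assuming a linear readout `y = W_out @ z` and an error vector `y - target`; the function name `learning_signal` and the argument `B_random` are illustrative:

```python
import numpy as np

def learning_signal(error, W_out, B_random=None):
    """Per-unit learning signal L = B @ error.

    Symmetric variant: B is the transpose of the readout matrix.
    Random-feedback variant: B is a fixed random matrix that is never trained.
    """
    B = W_out.T if B_random is None else B_random
    return B @ error

rng = np.random.default_rng(0)
n_hidden, n_out = 5, 2
W_out = rng.standard_normal((n_out, n_hidden))
B_fixed = rng.standard_normal((n_hidden, n_out))  # drawn once, then frozen
error = np.array([0.3, -0.1])

L_sym = learning_signal(error, W_out)
L_rand = learning_signal(error, W_out, B_random=B_fixed)
```

With random feedback a unit no longer needs access to the readout weights, which is why this variant is the more spatially local of the two.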
6. Empirical Performance and Biological Plausibility
On standard sequence learning benchmarks (e.g., sMNIST, permuted sMNIST, synthetic delayed tasks) local E-prop matches BPTT in final accuracy, though it typically requires more training iterations. Notably, it achieves this performance without retaining the memory footprint or backward-in-time passes of BPTT. For neurophysiological tasks (e.g., Mante 2013, Sussillo 2015), e-prop-trained models exhibit hidden-state geometries and dynamical signatures closely matching those of BPTT when matched for task accuracy. Empirical studies demonstrate that architectural details and parameter initializations have at least as much impact on neural similarity metrics (Procrustes distance, CKA, SVCCA, DSA) as the choice between e-prop and BPTT (Liu et al., 7 Jun 2025). This suggests that, given suitable initialization and model class, the local truncation of the gradient suffices for both task learning and the accurate reproduction of experimentally observed neural dynamics.
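As an illustration of one of the similarity metrics named above, here is a sketch of linear CKA between two sets of hidden states. It assumes the linear-kernel variant computed on column-centered activations; the cited study may use a different kernel or preprocessing, and `linear_cka` is a hypothetical helper name.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between activations X (samples, d1) and Y (samples, d2)."""
    # Center each feature so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
states_a = rng.standard_normal((200, 16))              # e.g. hidden states of one model
rotation, _ = np.linalg.qr(rng.standard_normal((16, 16)))
states_b = states_a @ rotation                         # same geometry, rotated basis

score_same = linear_cka(states_a, states_b)
score_other = linear_cka(states_a, rng.standard_normal((200, 16)))
```

Because the score is invariant to orthogonal changes of basis, two networks can be compared even when their individual units do not correspond one to one.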
7. Extensions to LSTM Architectures and Reinforcement Learning
Local E-prop generalizes to gated architectures such as LSTMs by factorizing gradients with respect to each gate and connection type and defining eligibility trace dynamics specific to each gate. The learning signal can be constructed from output-target discrepancies weighted by either a fixed random matrix (random e-prop) or the (not-learned) transpose of the readout matrix (symmetric e-prop). Further extensions include:
- Forget-gate bias initialization to boost early eligibility trace “carry-over”,
- Trace echo, adding a small, unsupervised Hebbian term to the gradient,
- Trace scaling to stabilize trace magnitudes across different classes of weights,
- Application to deep recurrent Q-learning in RL with a trace-based temporal-difference update.
On some long-delay or sparse-reward tasks, e-prop with these extensions can outperform BPTT, particularly where maintaining long-term dependencies is critical and memory constraints are severe (Hoyer et al., 2022).
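The effect of forget-gate bias initialization on trace carry-over can be illustrated with simple arithmetic. The sketch assumes the standard LSTM cell update, in which the derivative of the cell state with respect to its previous value is the forget gate, and further assumes the gate's pre-activation is dominated by its bias early in training. `trace_carry_over` is a hypothetical helper.

```python
import numpy as np

def trace_carry_over(forget_bias, n_steps):
    """Fraction of a cell-state trace surviving n_steps when the forget gate sits at sigmoid(bias)."""
    f = 1.0 / (1.0 + np.exp(-forget_bias))
    return f ** n_steps

# A zero bias halves the trace at every step; a positive bias retains more of it.
zero_bias = trace_carry_over(0.0, 10)
positive_bias = trace_carry_over(2.0, 10)
print(zero_bias, positive_bias)
```

Under these assumptions a positive forget-gate bias keeps eligibility information alive over more steps, which is consistent with its reported usefulness on long-delay tasks.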
References: (Millidge, 30 Dec 2025, Martín-Sánchez et al., 2022, Liu et al., 7 Jun 2025, Hoyer et al., 2022)