Global-guided Hebbian Learning (GHL)
- Global-guided Hebbian Learning (GHL) is a family of three-factor learning rules that blend local Hebbian updates with global modulatory signals for efficient credit assignment.
- It addresses traditional Hebbian limitations by incorporating error broadcasts, neuromodulatory signals, and curvature-based feedback to modulate synaptic plasticity.
- Empirical studies show that GHL achieves competitive accuracy in image classification, sequential tasks, and continual learning, closely rivaling backpropagation.
Global-guided Hebbian Learning (GHL) refers to a family of learning rules and algorithmic frameworks that integrate local Hebbian plasticity with global modulatory or error signals, thereby enabling scalable, effective, and biologically inspired credit assignment in both artificial and biological neural systems. Traditional Hebbian learning is limited by its strict locality, which impedes efficient credit assignment and continual learning in deep or recurrent networks. GHL addresses these deficits by introducing a “global” guidance component—typically in the form of neuromodulatory, sign-based, broadcast error, or curvature-derived signals—that modulates, gates, or directs local Hebbian updates. This architectural and algorithmic motif yields marked advantages in robustness, sample efficiency, memory retention, and scalability, and appears in diverse instantiations ranging from recurrent plastic networks and cortex-inspired context gating to modern deep convolutional models and attractor memory networks.
1. Theoretical Foundations and Mathematical Formulations
Global-guided Hebbian Learning is characterized by three-factor update schemes. The canonical form is Δw = η · G · H, where H is a local Hebbian term (often Oja’s rule), G is a global modulatory factor, and η is a learning rate. Instantiations vary in the global term:
- In deep CNNs, the global factor couples the local Hebbian magnitude with the sign of the backpropagated gradient, aligning the plasticity direction with global error minimization (Hua et al., 29 Jan 2026).
- In RNNs, the global factor is a learned, neuromodulator-like scalar that gates plasticity magnitude; the signal is computed from an internally generated loss, enabling unsupervised meta-learning (Duan et al., 2023).
- In memory networks with inhibition, a scalar parameter interpolates between targeted anti-Hebbian and global inhibitory terms, regulating the attractor landscape (Haga et al., 2018).
- In continual learning architectures, sluggish context units create a global decaying signal concatenated with local Hebbian context gating, driving orthogonalization via Oja’s rule (Flesch et al., 2022).
Global guidance can be obtained through explicit broadcast of error vectors, as in GEVB (Clark et al., 2021), or implicit geometry via regularization/curvature (Koplow et al., 23 May 2025, Deistler et al., 2018). These approaches all conceptualize GHL as a bridge between strict localism and wholly non-local, backpropagation-based credit assignment.
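The canonical three-factor form above can be sketched in a few lines. This is an illustrative minimal implementation, not any cited paper's exact rule: the local term is Oja's rule for a single linear unit, and the global modulator is an arbitrary scalar supplied by the caller.

```python
import numpy as np

def oja_update(w, x, eta=1.0):
    """Local Hebbian term with Oja's normalization: dw = eta * y * (x - y*w)."""
    y = w @ x
    return eta * y * (x - y * w)

def ghl_update(w, x, global_signal, eta=0.01):
    """Three-factor update: local Hebbian eligibility gated by a global modulator G."""
    local = oja_update(w, x)              # H: local pre/post eligibility
    return eta * global_signal * local    # Δw = η · G · H

rng = np.random.default_rng(0)
w = rng.normal(size=4)
x = rng.normal(size=4)
dw = ghl_update(w, x, global_signal=0.5)
```

When the global signal is zero, plasticity is fully suppressed, which is exactly the gating behavior the three-factor framing requires.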
2. Algorithmic Structures and Practical Implementation
GHL algorithms combine Hebbian eligibility traces with global broadcasts or sign-modulation. A representative procedure in deep networks is as follows (Hua et al., 29 Jan 2026):
- Compute the conventional local Hebbian update ΔH (Oja’s rule with competitive soft winner-take-all via softmax activation).
- Compute the loss and extract the sign of its gradient for each synapse.
- Apply the update: Δw = −η · sign(∂L/∂w) · |ΔH|.
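The sign-guided step above can be sketched as follows; the function name and array shapes are illustrative, and the Hebbian eligibility `hebb_delta` is assumed to have been computed by whatever local rule the layer uses.

```python
import numpy as np

def ghl_sign_update(hebb_delta, grad, eta=0.01):
    """Sign-guided GHL step: magnitude from the local Hebbian eligibility,
    direction from the global error signal (descend, hence -sign(grad))."""
    return -eta * np.sign(grad) * np.abs(hebb_delta)

h = np.array([0.2, -0.5, 0.1])   # local Hebbian update for three synapses
g = np.array([1.0, -2.0, 3.0])   # backprop gradient (only its sign is used)
dw = ghl_sign_update(h, g, eta=1.0)
```

Only one bit of global information per synapse is consumed, which is what makes the scheme cheaper to broadcast than full gradient magnitudes.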
In RNN frameworks (Duan et al., 2023), the inner loop computes an internal, self-generated loss (no labels), calculates gradients with respect to the plastic parameters, and uses a global neuromodulatory rate to interpolate between memory retention and rapid adaptation. All global signals in these systems are cheap to implement (sign masks, small vector broadcasts, or means of neural activity), providing scalable end-to-end training without weight transport or layer-wise weight symmetry.
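The retention-versus-adaptation interpolation can be sketched as a convex combination controlled by the neuromodulatory scalar; the function and variable names here are hypothetical, not Duan et al.'s notation.

```python
import numpy as np

def plastic_weight_update(w_plastic, hebb_trace, lam):
    """Neuromodulator-like scalar lam in [0, 1] interpolates between
    retaining the existing plastic memory (lam=0) and overwriting it
    with the freshly computed Hebbian trace (lam=1)."""
    lam = float(np.clip(lam, 0.0, 1.0))
    return (1.0 - lam) * w_plastic + lam * hebb_trace

w_old = np.array([1.0, 2.0])
trace = np.array([3.0, 4.0])
w_new = plastic_weight_update(w_old, trace, lam=0.5)
```

A learned, state-dependent `lam` is what turns this fixed interpolation into the meta-learned gating described above.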
In recurrent attractor models, inhibitory plasticity is split into pattern-specific and non-specific (global) terms, with continuous interpolation by a scalar parameter, leading to tunable temporal association spans and stability under sequential loading (Haga et al., 2018).
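A toy version of the split inhibition can be written as a weighted sum of a pattern-specific term and a uniform global term; this is a schematic construction under our own simplifying assumptions, not Haga et al.'s exact equations.

```python
import numpy as np

def inhibitory_weights(patterns, alpha, g=1.0):
    """Interpolate between pattern-specific inhibition (alpha=1) and
    uniform global inhibition (alpha=0) with a scalar alpha in [0, 1].
    patterns: (P, n) array of P stored patterns over n units."""
    n = patterns.shape[1]
    specific = -patterns.T @ patterns / patterns.shape[0]  # targeted term
    uniform = -np.ones((n, n)) / n                         # global term
    return g * (alpha * specific + (1.0 - alpha) * uniform)

patterns = np.array([[1.0, 0.0, 1.0, 0.0],
                     [0.0, 1.0, 0.0, 1.0]])
W_inh = inhibitory_weights(patterns, alpha=0.5)
```

Sweeping `alpha` continuously reshapes the attractor landscape, which is the tunability the text describes.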
3. Empirical Results and Benchmark Performance
Global-guided Hebbian Learning methods exhibit state-of-the-art results within biologically motivated learning rule categories and frequently approach conventional backpropagation performance in deep networks:
- On CIFAR-10 and CIFAR-100, GHL with the DeepHebb architecture achieved 86.41% and 62.40% accuracy respectively, reducing the gap to end-to-end backpropagation to less than 1.2% (Hua et al., 29 Jan 2026).
- On ImageNet, GHL on ResNet-50 closed the Top-1 accuracy gap with backpropagation to 2.85% (Hua et al., 29 Jan 2026).
- In RNNs for sequential memory and few-shot tasks, GHL frameworks outperformed purely local plasticity and non-plastic baselines. For example, RNN + gradient-based plasticity achieved 50.6% on CIFAR-FS few-shot classification (ResNet-12 encoder), compared to 41.5% for non-plastic networks (Duan et al., 2023).
- GEVB learning (error-vector broadcast) trained nonnegative vectorized networks and convolutional architectures on MNIST/CIFAR, matching or outperforming direct feedback alignment and achieving test error within 0.1–1% of backpropagation (Clark et al., 2021).
- In continual human-like learning models, GHL replicated human behavioral data on task switching, producing accurate interference-free orthogonal task representations (Flesch et al., 2022).
- In associative memory networks, tuning the strength of global inhibition robustly increased the span of temporal associations (Haga et al., 2018).
4. Stability, Convergence, and Biological Plausibility
Stability in GHL is achieved via several mechanisms:
- Oja’s normalization constrains Hebbian weight growth.
- Competitive updates (SWTA) limit co-adaptation.
- Sign-based global signals prevent weight-drift without requiring precise gradient magnitudes (Hua et al., 29 Jan 2026).
- Empirically tuned clipping and decay parameters (e.g., gradient-clipping thresholds and decay factors) avert instability in online updates (Duan et al., 2023).
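The first stabilizing mechanism can be verified numerically: under repeated Oja updates the weight norm stays bounded (approaching 1 as the weight aligns with the principal input direction), whereas a plain Hebbian rule would diverge. A minimal simulation, with an arbitrary synthetic input distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=3) * 0.1            # small random initial weights
principal = np.array([1.0, 0.5, 0.0])   # dominant input direction

for _ in range(5000):
    # inputs concentrated along `principal` plus small isotropic noise
    x = principal * rng.normal() + 0.05 * rng.normal(size=3)
    y = w @ x
    # Oja's rule: Hebbian growth y*x balanced by the decay term y^2 * w
    w += 0.01 * y * (x - y * w)

norm = np.linalg.norm(w)
```

Without the `- y * w` decay term the same loop blows up, which is precisely the runaway growth Oja's normalization is there to prevent.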
Biologically, GHL maps onto three-factor rules: local pre/post correlations modulated by global neuromodulators (e.g., dopamine acting as a binary or graded switch on plasticity polarity). Global signals can plausibly be realized via diffuse neuromodulatory projections or summed activity signals. The avoidance of weight-transport aligns with observed lack of precise backward connectivity in cortex (Hua et al., 29 Jan 2026, Clark et al., 2021).
5. Connections to Optimization and Network Geometry
Recent work formalizes that global-guided Hebbian dynamics can emerge from, or approximate, conventional optimization procedures:
- SGD with weight decay aligns gradient directions with local Hebbian updates near convergence (Koplow et al., 23 May 2025).
- The Fisher information matrix provides global, curvature-aware learning rates, which can be estimated locally via spike-statistics and weight magnitude (Deistler et al., 2018).
- Sign-based guidance (as in GHL) achieves correct gradient directionality in high-dimensional, infinite-width limits (Hua et al., 29 Jan 2026, Clark et al., 2021).
- Regularization and noise can flip Hebbian to anti-Hebbian alignment, further demonstrating the equivalence of global-modulated local learning to non-local error propagation (Koplow et al., 23 May 2025).
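The weight-decay point admits a one-line stationarity argument (a sketch in our own notation, not the cited derivation):

```latex
\nabla_w \Big[ L(w) + \tfrac{\lambda}{2}\|w\|^2 \Big] = 0
\;\Longrightarrow\;
-\nabla_w L(w^{*}) = \lambda\, w^{*}.
```

Near a stationary point of the regularized objective, the descent direction on the task loss is parallel to the weight vector itself, so each synapse is pushed in proportion to its current value, the same self-reinforcing form a Hebbian update takes.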
These findings bridge the conceptual gap between local Hebbian plasticity and meta-optimization, indicating that “global guidance” may, in effect, be a requirement for scalable, content-aware plasticity in both biological and artificial systems.
6. Application Domains and Extensions
Global-guided Hebbian Learning rules are broadly applicable:
- Deep convolutional neural networks and very deep residual architectures (up to 1202 layers) for image classification on CIFAR and ImageNet (Hua et al., 29 Jan 2026).
- Recurrent neural networks and LSTMs for rapid memory formation, sequence processing, and few-shot learning (Duan et al., 2023).
- Attractor networks implementing long-range sequence binding and associative memory (Haga et al., 2018).
- Human-like continual learning and context-dependent task switching via context gating and slow task signaling (Flesch et al., 2022).
- Convolutional and vectorized nonnegative networks trained with broadcast error signals (GEVB) (Clark et al., 2021).
Potential extensions include surrogate or reward-based global signals, application to non-canonical architectures (Transformers, spiking neural nets) (Hua et al., 29 Jan 2026), and further formal elucidation of convergence and hardware implementability.
7. Limitations and Open Questions
Despite their strengths, GHL frameworks entail several technical trade-offs:
- All current variants require computation of the global signal (typically through a backward pass or activity aggregation), precluding fully backward-pass-free hardware implementations (Hua et al., 29 Jan 2026).
- Convergence guarantees in non-convex or highly non-stationary regimes remain open, with stability empirically ensured but formally unproven for sign modulation (Hua et al., 29 Jan 2026).
- Most proposed global signals are not strictly local, requiring neuromodulatory broadcast or central controller mechanisms; the biological plausibility of precise sign broadcasting at scale warrants investigation (Clark et al., 2021).
- Comparisons of GHL-learned representations with those from backpropagation, and direct probing against neural data, are ongoing areas of research (Clark et al., 2021).
- Extensions to continual learning regimes with richer context, gating architectures, and replay or regularization strategies are under exploration (Flesch et al., 2022).
The documented flexibility of global-guided Hebbian plasticity suggests a foundational role for such rules in both computational neuroscience and scalable machine learning. Future investigations may elucidate further biological substrates and continue to reduce the performance and scalability gap with backpropagation-based training.