- The paper presents the Engram Neural Network (ENN), which integrates Hebbian plasticity with RNNs for improved memory-trace encoding.
- It employs a differentiable memory matrix updated via Hebbian learning and a sparse attention mechanism to enable selective memory retrieval.
- Evaluation on datasets like MNIST, CIFAR-10, and WikiText-103 demonstrates competitive accuracy alongside enhanced interpretability of memory dynamics.
Hebbian Memory-Augmented Recurrent Networks: Engram Neurons in Deep Learning
Introduction
The paper "Hebbian Memory-Augmented Recurrent Networks: Engram Neurons in Deep Learning" presents the Engram Neural Network (ENN), which incorporates elements from neuroscience to enhance the interpretability and performance of recurrent neural networks (RNNs). By integrating Hebbian plasticity and engram-style memory encoding, the ENN aims to fill the gap between biological memory systems and artificial recurrent models, thus improving transparency in sequential tasks.
Approach
The ENN extends traditional RNN architectures with an explicit memory matrix that is updated through Hebbian plasticity and read through a sparse, attention-based retrieval mechanism. Combining a differentiable memory matrix with a recurrent integration process is intended to align the architecture more closely with biological memory systems; a minimal sketch of one read/update step follows the component list below.
Key Components of the ENN:
- Memory Matrix (M): An explicit memory structure that allows for content-based addressing, supporting the storage of engram-like trace patterns.
- Hebbian Trace (H): A dynamic trace updated using Hebbian learning rules to simulate synaptic reinforcement, augmenting memory retrieval processes.
- Sparse Attention Mechanism: Ensures selective activation of memory traces using adjustable temperature scaling to control sparsity.
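To make the interplay of these components concrete, here is a minimal TensorFlow sketch of a single memory read and Hebbian update. The function name, tensor shapes, and the exact decay/learning-rate form are illustrative assumptions rather than the paper's implementation; only the general pattern (content-based addressing, temperature-scaled attention, outer-product Hebbian reinforcement) is taken from the description above.

```python
import tensorflow as tf

def hebbian_retrieve_and_update(h, M, H, eta=0.1, decay=0.9, temperature=0.5):
    """One illustrative step of engram-style memory access (assumed, not the paper's code).

    h: (batch, hidden_dim) current hidden state
    M: (batch, memory_size, hidden_dim) explicit memory matrix
    H: (batch, memory_size, hidden_dim) Hebbian trace
    """
    # Content-based addressing: similarity between the hidden state and each memory slot
    scores = tf.einsum('bd,bmd->bm', h, M + H)            # (batch, memory_size)
    # Temperature-scaled softmax; lower temperature yields sparser attention weights
    attn = tf.nn.softmax(scores / temperature, axis=-1)
    # Read: attention-weighted sum over memory rows
    read = tf.einsum('bm,bmd->bd', attn, M + H)           # (batch, hidden_dim)
    # Hebbian update: decay the trace and reinforce the attention/hidden-state outer product
    H = decay * H + eta * tf.einsum('bm,bd->bmd', attn, h)
    return read, H
```

Lowering the temperature concentrates the attention weights on fewer memory slots, which is how a sparsity control such as the `sparsity_strength` parameter described below would bias retrieval toward a small set of engram traces.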
Implementation
The ENN is implemented in TensorFlow 2.19 using the Keras API to construct modular, scalable models. The architecture supports a range of sequence modeling tasks, providing flexibility for both image and text data, while GPU-accelerated training pipelines and mixed-precision computation keep training efficient enough for large-scale applications.
A typical usage example for sequential MNIST classification:

```python
import tensorflow as tf
from tensorflow_engram.models import EngramClassifier

# Load MNIST and scale pixel values to [0, 1]; each 28x28 image is treated as a
# 28-step sequence of 28-dimensional row vectors.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Engram classifier with a 64-slot memory and sparse attention over memory traces
model = EngramClassifier(input_shape=(28, 28), num_classes=10, hidden_dim=128,
                         memory_size=64, sparsity_strength=0.1)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=64, epochs=10, validation_split=0.1)
```
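The implementation notes above mention mixed-precision computation; in standard Keras (not an ENN-specific API) this is typically enabled globally before the model is built, roughly as follows. Whether every ENN layer is mixed-precision safe is not stated in the paper.

```python
import tensorflow as tf

# Compute in float16 while keeping variables in float32; on recent GPUs this
# usually speeds up training with little accuracy impact.
tf.keras.mixed_precision.set_global_policy('mixed_float16')
```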
Evaluation
The ENN's performance was compared against RNN, GRU, and LSTM models across benchmarks like MNIST, CIFAR-10, and WikiText-103. Results demonstrate that the ENN achieves competitive accuracy while offering interpretability through visualized memory dynamics.
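For reference, a comparable LSTM baseline for the sequential-MNIST setting can be built with standard Keras layers as sketched below; this is a plausible baseline configuration assumed for illustration, not the exact models used in the paper.

```python
import tensorflow as tf

# Same row-sequence MNIST setup as in the implementation example above
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

# LSTM baseline with a hidden size matching the ENN's hidden_dim=128
baseline = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(10, activation='softmax'),
])
baseline.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                 metrics=['accuracy'])
baseline.fit(x_train, y_train, batch_size=64, epochs=10, validation_split=0.1)
```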
Figures:
Figure 1: Training and validation loss curves for the ENN on dummy sequence data compared to a standard RNN.
Figure 2: Training and validation loss/accuracy curves for the ENN on MNIST.
Figure 3: Confusion matrix for ENN predictions on the MNIST test set.
Figure 4: Heatmap of classification metrics (precision, recall, F1-score) by class for ENN.
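Figures 3 and 4 report per-class behavior; metrics of this kind can be reproduced from any trained Keras classifier with scikit-learn, assuming the `model`, `x_test`, and `y_test` objects from the implementation example above.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Predicted class = argmax over the softmax outputs of the trained classifier
y_pred = np.argmax(model.predict(x_test), axis=-1)

# Confusion matrix (as in Figure 3) and per-class precision/recall/F1 (as in Figure 4)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, digits=3))
```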
Discussion
The ENN architecture combines biological principles with artificial neural networks to enhance interpretability without compromising performance. By monitoring Hebbian trace dynamics, the model offers insights into memory formation and selective recall that are absent from standard recurrent models. Challenges remain, however, notably computational overhead and sensitivity to hyperparameter tuning.
Conclusion
The ENN exemplifies how neurobiological insights can inform the design of adaptable, interpretable neural architectures for sequence learning. While maintaining competitive performance, the model's transparent memory mechanism allows for insightful analysis of memory function in sequential tasks. Future work will explore extensions like adaptive memory dynamics and hybrid models integrating Transformer architectures.
In summary, the ENN offers a compelling alternative to traditional RNNs, particularly in applications prioritizing interpretability and biologically inspired reasoning.