
SpikeATE SNN for Aspect Term Extraction

Updated 17 January 2026
  • SpikeATE is a spiking neural network framework for aspect term extraction that uses event-driven computation and ternary spiking neurons to reduce energy consumption.
  • It employs convolutional spike-encoding layers and surrogate gradient-based backpropagation to effectively model temporal dependencies in sequence labeling tasks.
  • SpikeATE demonstrates 30–40× lower energy use compared to traditional deep neural networks while achieving competitive F1 scores on benchmark datasets.

SpikeATE is a spiking-neural-network (SNN) framework for the task of aspect term extraction (ATE), designed to offer significantly lower energy consumption than conventional deep neural network (DNN) approaches while achieving competitive performance on standard natural language processing benchmarks. SpikeATE employs event-driven computation, ternary spiking neurons, and direct spike-based backpropagation with surrogate gradients to efficiently model temporal dependencies in sequence labeling tasks relevant to sentiment analysis (Mishra et al., 10 Jan 2026).

1. Model Architecture

SpikeATE processes pre-tokenized sentences with a layer-wise architecture optimized for SNN computation. The pipeline is as follows:

  • Input Encoding: Sentences of length $R$ are embedded as $S \in \mathbb{R}^{B \times R \times E}$, where $B$ is the batch size and $E$ is the embedding dimension.
  • Convolutional Spike-Encoding Layer: A trainable 1D convolution (filters $W^{enc} \in \mathbb{R}^{C \times E \times K}$, bias $b^{enc} \in \mathbb{R}^{C}$) projects embeddings into $P \in \mathbb{R}^{B \times R \times C}$ via

$$P_{i,j,k} = \sum_{\ell=0}^{E-1} \sum_{m=0}^{K-1} S_{i,j+m,\ell} \, W^{enc}_{k,\ell,m} + b^{enc}_k$$

Conversion to spike trains over $T$ discrete time-steps employs Leaky Integrate-and-Fire (LIF) dynamics.

  • Spiking-Convolutional Layers: Three configurable SNN conv layers, each with kernels $W^{L}$ and biases $b^{L}$:

$$f_{conv}(P(Spk_t))_{i,j,k} = \sum_{\ell,m} P(Spk_t)_{i,j+m,\ell} \, W^{L}_{k,\ell,m} + b^{L}_k$$

Each layer applies LIF updates.

  • Non-Spiking Decoder: At each time-step, the spikes $Spk_t^{L-1}$ from the final spiking layer are transformed into logits, and class probabilities are accumulated across time:

$$logits_t = W^{nspk} \cdot Spk_t^{L-1} + b^{nspk}$$

$$Prob_{class} = \sum_{t=1}^{T} \mathrm{Softmax}(logits_t)$$

Token-level class probabilities correspond to the aspect categories (O, B, I).
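The LIF encoding and time-summed softmax readout above can be sketched in pure Python. All sizes, weights, and pre-activation values below are illustrative stand-ins, not the paper's trained parameters; only $T = 6$, $V_{thr} = 0.1$, and the 0.1 decay follow the reported hyperparameters.

```python
import math

# Toy readout for a single token position: constant conv pre-activations are
# LIF-encoded into binary spike trains over T steps, then a non-spiking
# decoder sums softmaxed logits across time. All weights are illustrative.

T = 6          # simulation time-steps (paper hyperparameter)
V_THR = 0.1    # firing threshold V_thr
V_DECAY = 0.1  # membrane decay W_vd
C = 2          # toy channel count

def lif_encode(drive, steps=T):
    """Encode a constant input current into a binary spike train via LIF."""
    v, spikes = 0.0, []
    for _ in range(steps):
        v = V_DECAY * v + drive  # leaky integration of the input current
        if v >= V_THR:           # fire and hard-reset
            spikes.append(1)
            v = 0.0
        else:
            spikes.append(0)
    return spikes

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical conv pre-activations for C channels at one token position.
P = [0.12, 0.03]
spike_trains = [lif_encode(p) for p in P]  # C x T binary spikes

# Non-spiking decoder: logits_t = W_nspk . Spk_t + b_nspk, summed softmax.
W_nspk = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]  # 3 classes (O, B, I) x C
b_nspk = [0.0, 0.0, 0.0]
prob_class = [0.0, 0.0, 0.0]
for t in range(T):
    spk_t = [spike_trains[c][t] for c in range(C)]
    logits_t = [sum(W_nspk[k][c] * spk_t[c] for c in range(C)) + b_nspk[k]
                for k in range(3)]
    for k, p in enumerate(softmax(logits_t)):
        prob_class[k] += p

predicted = max(range(3), key=lambda k: prob_class[k])  # index into (O, B, I)
```

Because the softmax is applied per time-step before summation, each step contributes a full probability simplex, so the accumulated scores sum to $T$.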

2. Ternary Spiking Neuron Dynamics

SpikeATE introduces a ternary spiking neuron model in which each unit emits spikes in $\{-1, 0, 1\}$. Neuron state variables include the synaptic current $Isc_t^L$, membrane potential $V_t^L$, and output spike $Spk_t^L$.

  • Synaptic Integration and Voltage Update:

$$Isc^L_t = W_{scd}^L \, Isc^L_{t-1} + W_{fv}^{+} \, Spk^{L-1,+}_t + W_{fv}^{-} \, Spk^{L-1,-}_t$$

$$V^L_t = W_{vd}^L \, V^L_{t-1} + Isc^L_t$$

where $Spk^{L-1,+}_t$ and $Spk^{L-1,-}_t$ denote the positive and negative components of the incoming spikes, each routed through its own weight matrix.

  • Spike Generation and Reset:

$$Spk^L_t = \begin{cases} +1 & \text{if } V^L_t \geq V_{thr} \\ -1 & \text{if } V^L_t \leq -V_{thr} \\ 0 & \text{otherwise} \end{cases}$$

If $|Spk^L_t| = 1$, the membrane potential is hard-reset: $V^L_t \gets 0$.

This model allows for richer feature propagation (positive and negative spikes) compared to binary SNNs. Trainable parameters per layer include current and voltage decay rates and signed feedback weights.
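A minimal single-neuron, single-synapse sketch of these dynamics, with scalar weights standing in for the per-layer matrices (all values illustrative except $V_{thr} = 0.1$ and the 0.1 decays):

```python
def ternary_lif_step(spk_in, isc_prev, v_prev,
                     w_scd=0.1, w_vd=0.1,
                     w_fv_pos=0.5, w_fv_neg=0.5, v_thr=0.1):
    """One ternary LIF update; spk_in is an incoming spike in {-1, 0, +1}.

    Positive and negative incoming spikes are routed through separate
    weights, mirroring the signed feedback weights of the ternary model.
    Returns (output spike, new synaptic current, new membrane potential).
    """
    spk_pos = spk_in if spk_in > 0 else 0  # +Spk component
    spk_neg = spk_in if spk_in < 0 else 0  # -Spk component
    isc = w_scd * isc_prev + w_fv_pos * spk_pos + w_fv_neg * spk_neg
    v = w_vd * v_prev + isc                # leaky membrane update
    if v >= v_thr:
        return +1, isc, 0.0                # positive spike, hard reset
    if v <= -v_thr:
        return -1, isc, 0.0                # negative spike, hard reset
    return 0, isc, v                       # sub-threshold: carry voltage
```

A positive input spike from rest drives the neuron over $V_{thr}$ and emits $+1$; a negative one symmetrically emits $-1$, which is the extra expressiveness binary SNNs lack.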

3. Training Methodology

  • Supervised Spike-Based Learning: The loss is mean token-level cross-entropy:

$$\mathcal{L}_{ce} = -\frac{1}{N} \sum_{i=1}^{N} \frac{1}{R_i} \sum_{j=1}^{R_i} \sum_{c=0}^{2} \mathbf{1}(y_{i,j}=c)\, \log\bigl(Prob_{class}(i,j,c)\bigr)$$

  • Surrogate Gradients: To handle the non-differentiable Heaviside firing, SpikeATE uses an arctangent-based pseudo-gradient:

$$G_{ps}(V) = \frac{\alpha}{2} \cdot \frac{1}{1 + \left(\frac{\pi\alpha V}{2}\right)^2}, \quad \alpha = 2$$

Gradients are propagated through time and space (backpropagation through time), with parameters updated for each time-step.

  • Hyperparameters: Batch size 8, learning rate $1\mathrm{e}{-4}$, $V_{thr} = 0.1$, decay parameters $W_{scd} = W_{vd} = 0.1$, simulation time-steps $T = 6$, trained for 30–50 epochs with early stopping.
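The surrogate is simple enough to write directly; $G_{ps}$ is exactly the derivative of the smooth surrogate $\frac{1}{\pi}\arctan\bigl(\frac{\pi\alpha V}{2}\bigr)$, which during backpropagation stands in for the zero-almost-everywhere derivative of the hard threshold:

```python
import math

def arctan_surrogate_grad(v, alpha=2.0):
    """Arctangent pseudo-gradient G_ps(V) = (alpha/2) / (1 + (pi*alpha*V/2)^2).

    Used in place of the non-differentiable Heaviside derivative when
    backpropagating through spike generation.
    """
    return (alpha / 2.0) / (1.0 + (math.pi * alpha * v / 2.0) ** 2)
```

With $\alpha = 2$ the pseudo-gradient peaks at 1.0 at the origin and decays symmetrically, concentrating the learning signal on neurons whose membrane potential sits near the firing boundary.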

4. Temporal Processing and Sequence Modeling

SpikeATE’s convolutional spike encoder preserves local word order via its kernel width while mapping embeddings into event-based spike trains. LIF neuron dynamics propagate temporal dependencies through decayed membrane potentials and synaptic currents, effectively modeling short-term context in token sequences. No explicit time-windowing beyond the $T$ simulation steps is used; the model dynamics, spike timing, and recurrent voltage accumulation capture the relevant temporal patterns. This design aligns with SNNs' biological inspiration while enabling practical sequence labeling.

5. Energy Efficiency Analysis

The key claim of SpikeATE is its dramatic reduction in computational energy relative to standard DNNs. Energy consumption per layer is modeled as follows:

  • SNN Layer $L$ Energy: $Power_{SNN}(L) = 77\,\mathrm{fJ} \times SOPs(L)$, where $SOPs(L) = T \cdot \gamma \cdot FLOPs(L)$ and $\gamma = N_{Spk}/T$ is the mean firing rate.
  • DNN Layer $L$ Energy: $Power_{DNN}(L) = 12.5\,\mathrm{pJ} \times FLOPs(L)$.
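Under this model the energy ratio between SNN and DNN layers of equal FLOPs reduces to $\frac{12.5\,\mathrm{pJ}}{77\,\mathrm{fJ} \cdot T\gamma}$. A small calculator illustrates the arithmetic; the 1 GFLOP layer, $T = 6$, and $\gamma = 0.05$ below are illustrative inputs, not the paper's measurements:

```python
FJ, PJ = 1e-15, 1e-12  # femtojoules, picojoules

def snn_energy_mj(flops, T, gamma):
    """SNN layer energy in mJ: 77 fJ per SOP, with SOPs = T * gamma * FLOPs."""
    sops = T * gamma * flops
    return 77 * FJ * sops * 1e3  # joules -> millijoules

def dnn_energy_mj(flops):
    """DNN layer energy in mJ: 12.5 pJ per FLOP."""
    return 12.5 * PJ * flops * 1e3

# Illustrative layer: 1e9 FLOPs, T = 6 time-steps, 5% mean firing rate.
e_snn = snn_energy_mj(1e9, T=6, gamma=0.05)
e_dnn = dnn_energy_mj(1e9)
ratio = e_dnn / e_snn  # how many times cheaper the SNN layer is
```

The sparsity term $T\gamma$ is what drives the savings: lower firing rates shrink SOPs directly, on top of the per-operation gap between 77 fJ and 12.5 pJ.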

The empirical per-inference energy consumption is tabulated below (Lap14+Res14 average):

| Model | FLOPs/SOPs (10⁹) | Energy (mJ) |
|---|---|---|
| BERT-RC / BERT-PT / Self-Train | 7.6 (FLOPs only) | 95.56 |
| GPT-3.5 | $3.14 \times 10^{5}$ | $3.92 \times 10^{6}$ |
| SpikeATE (binary) | 0.115 / 0.0059 | 1.89 |
| SpikeATE (ternary) | 0.115 / 0.0149 | 2.59 |

SpikeATE delivers roughly 30–40× lower energy use than BERT-like models and orders of magnitude less than GPT-3.5 (Mishra et al., 10 Jan 2026).

6. Experimental Results

SpikeATE has been evaluated on four SemEval ATE datasets: Lap14, Res14, Res15, and Res16, with key performance results summarized as F1-scores:

| Method | Lap14 | Res14 | Res15 | Res16 |
|---|---|---|---|---|
| SoftProtoE | 83.2 | 87.4 | 73.3 | 77.0 |
| BERT-PT | 84.2 | 86.3 | 73.9 | 78.3 |
| Self-Training | 86.9 | 88.8 | 75.8 | 82.6 |
| GPT-3.5 | 83.8 | 83.8 | – | – |
| SpikeATE (binary) | 81.4 | 84.9 | 70.0 | 75.3 |
| SpikeATE (ternary) | 84.0 | 86.5 | 72.3 | 78.2 |

The ternary variant achieves F1 scores within 1–2 points of leading DNNs and matches BERT-PT and SoftProtoE on multiple benchmarks.

Ablation studies confirm that ternary spikes consistently improve F1 by 2–5 points over binary spikes across all tasks and time-step choices. Performance peaks at three spiking-conv layers; additional layers yield diminishing or negative returns. Error analyses indicate robust performance on typical aspect terms; boundary failures occur primarily for rare, long multi-token aspects (over 3 tokens).

7. Deployment Characteristics and Extensions

SpikeATE exhibits inference latency around 1.4 ms/sentence (CPU+GPU mixed), and its event-driven model is amenable to deployment on neuromorphic hardware (e.g., Intel Loihi, BrainChip Akida, or FPGAs), offering potential for even lower energy and improved latency.

Future directions outlined for the framework include on-chip training, online test-time adaptation, integration of attention-like mechanisms to improve long-span dependency modeling, augmentation strategies for long aspect phrases, and joint training for aspect plus sentiment extraction. The architecture’s low power and event-based characteristics position it as a strong candidate for sustainable, scalable NLP services (Mishra et al., 10 Jan 2026).

