
Event-Based LSTM Models

Updated 14 November 2025
  • Event-based LSTM models are recurrent neural architectures designed to handle sparse, irregular data through modified gating mechanisms and event encoding.
  • They incorporate diverse variants like Branched ConvLSTM, Phased LSTM, and Spiking LSTM to address challenges in video event detection, financial forecasting, and neuromorphic computing.
  • Their specialized training and optimization approaches yield high accuracy, faster convergence, and enhanced energy efficiency across various application domains.

An Event-Based Long Short-Term Memory (LSTM) model is a recurrent or spatiotemporal neural network architecture designed to operate on sequences of sparse, irregular events or to detect and characterize meaningful discrete occurrences ("events") within continuous data streams. These models extend conventional LSTM or ConvLSTM mechanisms through architectural modifications, gating mechanisms, and/or event-driven neural encoding tailored to domains such as asynchronous sensors, video event detection, financial time-series, and spiking neuromorphic computation.

1. Architectural Principles of Event-Based LSTM Models

Event-based LSTM models span a variety of architectures, each attuned to handling sparse, temporally irregular, or domain-specific event cues.

  • Branched ConvLSTM for Unsupervised Event Detection: As presented in Phan et al. (Phan et al., 2017), the model comprises three convolutional LSTM branches:
    • Encoding branch (E): Learns regular dynamic patterns from input video frames.
    • Event-detection branch (Ev): Models rare, unpredictable events using a pair of forward and backward ConvLSTMs, whose outputs are merged and post-processed (down-sampled, softmax, max-pooled, up-sampled).
    • Reconstruction branch (R): Combines encoding and event features to reconstruct future frames for unsupervised learning.
  • Phased LSTM: Introduced by Neil et al. (Neil et al., 2016), each unit integrates a learnable oscillatory time gate, enabling updates only during specific "open" windows aligned to events in continuous time. This approach decouples computation intervals from fixed timesteps and facilitates direct ingestion of asynchronous or multi-rate inputs.
  • Event-Driven LSTM for Time-Series Forecasting: In (Qi et al., 2021), the LSTM operates on feature vectors constructed via explicit event extraction (e.g., ZigZag price pivots, moving average crossovers), focusing learning on periods of regime change.
  • Spiking LSTM Variants: Event-based architectures such as Spiking-LSTM (Rezaabad et al., 2020) and LSTM-LIF (Zhang et al., 2023) marry spiking neuron models with LSTM-style long-term memory and gating. Spiking-LSTM retains explicit forget/input/output gating expressed with hard-threshold activations, whereas LSTM-LIF leverages coupled dendritic and somatic compartments, eschewing classical LSTM gates.

A summary of distinct event-based LSTM model variants:

| Model Type | Key Mechanism/Extension | Data Type |
|---|---|---|
| Branched ConvLSTM | Unsupervised event branch + encoding branch | Video |
| Phased LSTM | Oscillatory time gate for sparse updates | Asynchronous |
| Event-Driven LSTM | Feature engineering for event-based windows | Time-series |
| Spiking LSTM | All-or-none spike gating, surrogate-gradient training | Spike trains |
| LSTM-LIF (two-compartment) | Memory via dendritic/somatic compartments | Spiking/neuromorphic |
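
To ground the table, the following is a minimal sketch of a single ConvLSTM cell in PyTorch, the recurrent building block shared by the encoding, event-detection, and reconstruction branches above. Channel counts, kernel size, and the omission of the peephole terms ($W_{c\cdot}\circ c$) are illustrative simplifications, not the configuration used by Phan et al.

```python
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Minimal ConvLSTM cell: 2D convolutions replace the matrix products
    of a standard LSTM, so the gates retain spatial structure.  Peephole
    terms (W_c . c) from the full formulation are omitted for brevity."""

    def __init__(self, in_channels: int, hidden_channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # One convolution produces all four gate pre-activations at once.
        self.conv = nn.Conv2d(
            in_channels + hidden_channels,
            4 * hidden_channels,
            kernel_size,
            padding=padding,
        )
        self.hidden_channels = hidden_channels

    def forward(self, x, state):
        h_prev, c_prev = state
        gates = self.conv(torch.cat([x, h_prev], dim=1))
        i, f, g, o = torch.chunk(gates, 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c_prev + i * torch.tanh(g)   # cell-state update
        h = o * torch.tanh(c)                # hidden state
        return h, (h, c)


if __name__ == "__main__":
    cell = ConvLSTMCell(in_channels=1, hidden_channels=8)
    x = torch.randn(2, 1, 64, 64)            # (batch, channels, H, W)
    h0 = torch.zeros(2, 8, 64, 64)
    c0 = torch.zeros(2, 8, 64, 64)
    h, (h1, c1) = cell(x, (h0, c0))
    print(h.shape)                            # torch.Size([2, 8, 64, 64])
```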

2. Formalization, Gating Dynamics, and Event Encoding

The mathematical formulations underpinning event-based LSTM models generalize standard LSTM recurrences with mechanisms to align updates with events:

  • ConvLSTM Gating (Phan et al.): For each time step $t$:

$$
\begin{aligned}
i_t &= \sigma(W_{xi}*x_t + W_{hi}*h_{t-1} + W_{ci}\circ c_{t-1} + b_i) \\
f_t &= \sigma(W_{xf}*x_t + W_{hf}*h_{t-1} + W_{cf}\circ c_{t-1} + b_f) \\
c_t &= f_t\circ c_{t-1} + i_t \circ \tanh(W_{xc}*x_t + W_{hc}*h_{t-1} + b_c) \\
o_t &= \sigma(W_{xo}*x_t + W_{ho}*h_{t-1} + W_{co}\circ c_t + b_o) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
$$

where $*$ denotes 2D convolution and $\circ$ the Hadamard product.

  • Phased LSTM Time-Gate: Each unit samples a continuous phase $\phi_t$ and time-gate value $k_t$:

$$
\phi_t = \frac{(t-s) \bmod \tau}{\tau}, \qquad
k_t = \begin{cases}
\dfrac{2\phi_t}{r_{\mathrm{on}}} & 0 \leq \phi_t < r_{\mathrm{on}}/2 \\
2 - \dfrac{2\phi_t}{r_{\mathrm{on}}} & r_{\mathrm{on}}/2 \leq \phi_t < r_{\mathrm{on}} \\
\epsilon & r_{\mathrm{on}} \leq \phi_t < 1
\end{cases}
$$

The cell and hidden states are updated in proportion to $k_t$, so substantial updates occur only while the gate is open; outside the open phase, the small leak $\epsilon$ keeps the states nearly constant while preserving gradient flow (a minimal sketch of this gate appears at the end of this section).

  • Event-feature Engineering: In event-driven forecasting, only windows aligned to detected events (ZigZag pivots, MA crossovers, etc.) are considered, which reduces sequence noise and data volume and focuses the model on rare but meaningful transitions.
  • Spiking LSTM Event Encoding: Inputs are Poisson spike trains; LSTM gating is implemented via a hard-threshold nonlinearity:

$$
s_n(t) = \begin{cases} 1, & u_n(t) \geq \theta, \\ 0, & \text{otherwise,} \end{cases}
$$

and, after thresholding, cell-state updates are propagated multiplicatively gate by gate as in the analog LSTM.
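
As referenced above, here is a minimal sketch of the Phased LSTM time gate, assuming the leaky proportional state blend of Neil et al. (2016); the function names and toy parameter values are illustrative only.

```python
import torch

def time_gate(t, tau, s, r_on, eps=1e-3):
    """Piecewise-linear openness k_t of a Phased LSTM unit at time t.
    tau: oscillation period, s: phase shift, r_on: open ratio, eps: leak."""
    phi = torch.remainder(t - s, tau) / tau          # phase in [0, 1)
    k = torch.where(
        phi < r_on / 2,
        2.0 * phi / r_on,                            # opening ramp
        torch.where(
            phi < r_on,
            2.0 - 2.0 * phi / r_on,                  # closing ramp
            torch.full_like(phi, eps),               # closed phase: small leak
        ),
    )
    return k

def gated_update(k, c_proposed, c_prev):
    """Blend the proposed cell state with the previous one in proportion to
    the gate: when k ~ eps the state is left almost unchanged."""
    return k * c_proposed + (1.0 - k) * c_prev

if __name__ == "__main__":
    t = torch.tensor([0.0, 0.5, 1.2, 3.7])           # irregular event timestamps
    k = time_gate(t, tau=torch.tensor(2.0), s=torch.tensor(0.0), r_on=0.1)
    print(k)
```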

3. Training Methodologies and Optimization Procedures

Event-based LSTM models implement specialized training regimes to accommodate their architectures:

  • Unsupervised ConvLSTM Model: Trained with per-pixel cross-entropy reconstruction loss of future video frames; event detection is entirely unsupervised, using only raw videos and data augmentation (random flips, rotations). RMSProp optimizer with learning rate $10^{-3}$, Xavier weight initialization.
  • Phased LSTM: Standard backpropagation through time, but with time-gate masking. Updates occur only within open windows; the periods $\tau$, open ratios $r_{\mathrm{on}}$, and phase shifts $s$ are learned. The time-gate leak $\epsilon$ ensures nonzero gradients everywhere.
  • Event-Driven LSTM for Financial Data: Supervised regression to the future retracement price, with MSE as the training objective and RMSE, MAE, and MAPE reported as error metrics. Event-aligned sliding-window extraction sharply reduces noise and data redundancy. Adam optimizer ($\beta_1=0.9$, $\beta_2=0.999$, learning rate $10^{-3}$).
  • Spiking LSTM and LSTM-LIF: Trained with BPTT using surrogate gradients to handle non-differentiable spike functions. Surrogates are Gaussian with spreads $\alpha_1, \alpha_2$ matched to the analog LSTM derivatives. All parameters, including compartment couplings and thresholds, are learned.
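
A minimal sketch of the surrogate-gradient mechanism used to train the spiking variants: the forward pass applies the hard threshold from Section 2, while the backward pass substitutes a Gaussian surrogate. The spread value and class name here are illustrative and not matched to the papers' per-gate $\alpha_1, \alpha_2$ settings.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Forward: hard threshold s = 1[u >= theta].
    Backward: Gaussian surrogate gradient centered at the threshold."""

    @staticmethod
    def forward(ctx, u, theta=1.0, alpha=0.3):
        ctx.save_for_backward(u)
        ctx.theta, ctx.alpha = theta, alpha
        return (u >= theta).float()

    @staticmethod
    def backward(ctx, grad_out):
        (u,) = ctx.saved_tensors
        theta, alpha = ctx.theta, ctx.alpha
        # ds/du approximated by a Gaussian bump around the threshold.
        surrogate = torch.exp(-0.5 * ((u - theta) / alpha) ** 2) / (
            alpha * (2 * torch.pi) ** 0.5
        )
        return grad_out * surrogate, None, None


if __name__ == "__main__":
    u = torch.randn(5, requires_grad=True)
    s = SpikeFn.apply(u)      # all-or-none spikes on the forward pass
    s.sum().backward()        # gradients flow through the surrogate
    print(s, u.grad)
```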

4. Empirical Evaluation and Performance Benchmarks

Event-based LSTM variants demonstrate competitive or superior accuracy, faster convergence, and energy efficiency on a range of tasks:

  • Branched ConvLSTM (Phan et al.): On cell-division event detection (BAEC phase-contrast videos), unsupervised ConvLSTM achieved an $F_1$-score of $0.735$ ($\pm 3$-frame tolerance), outperforming or matching supervised HCRF baselines and approaching fully supervised ConvLSTM ($0.765$). Generalizes well cross-video without retraining.
  • Phased LSTM: On the N-MNIST event-based vision task, achieves $97.3\%$ accuracy (CNN baseline $95.0\%$), using $\sim 5\%$ of the updates per time series compared to a standard LSTM. Converges to $>90\%$ accuracy in one epoch for frequency discrimination tasks. Audio-visual lipreading converges faster and yields higher accuracy than conventional LSTM.
  • Event-Driven LSTM (Forex): Best model (EUR/GBP, $n=30$) achieved MSE $0.006\times10^{-3}$, RMSE $2.407\times10^{-3}$, and MAPE $0.194\%$, exceeding standard RNNs and other sequence models on event-driven windows.
  • Spiking LSTM: On sequential MNIST, Spiking-LSTM attains $98.23\% \pm 0.07\%$ vs. $99.10\%$ for analog LSTM, and $83.75\% \pm 0.15\%$ on EMNIST. Word-level and character-level language modeling tasks yield perplexity close to conventional RNNs.
  • LSTM-LIF: Achieves $98.8\%$ on S-MNIST (vs. $<90\%$ for LIF SNNs and $95.6\%$ for ALIF) and $94.1\%$ on GSC. Inference passes are $\sim 100\times$ more energy-efficient than analog LSTM.

5. Application Domains and Adaptation Strategies

Event-based LSTM models are adapted for diverse domains where sparsity, temporal irregularity, or rare events preclude the use of standard LSTMs:

  • Unsupervised Video Event Detection: Three-branch ConvLSTM models can be transferred directly to other spatiotemporal domains by re-tuning spatial/temporal windowing and reconstruction granularity. The event-detection branch can be supervised if labels are available.
  • Asynchronous Sensor Fusion: Phased LSTM enables direct integration of multi-rate sensor data with minimal synchronization overhead, thus well-suited for neuromorphic vision, wearable sensors, and robotics.
  • Sparse, Event-Driven Forecasting: Feature engineering (ZigZag pivots, crossovers) generalizes to any time series with domain-specific event markers (e.g., traffic spikes, telemetry outliers); a minimal windowing sketch follows this list. The same two-layer LSTM or GRU structure is usable as a baseline for such tasks.
  • Spiking Neuromorphic Computing: Both Spiking-LSTM and LSTM-LIF allow deployment on neuromorphic hardware. The latter, with two-compartment state, provides enhanced memory without introducing explicit digital gates, and can be implemented with minimal energy overhead.
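
As noted above, here is a minimal sketch of event-aligned window extraction using moving-average crossovers; the MA spans, window length, and synthetic price series are hypothetical choices, not those of the cited forecasting study.

```python
import numpy as np
import pandas as pd

def crossover_events(prices: pd.Series, fast: int = 5, slow: int = 20) -> pd.Series:
    """Boolean series that is True wherever the fast MA crosses the slow MA."""
    ma_fast = prices.rolling(fast).mean()
    ma_slow = prices.rolling(slow).mean()
    above = ma_fast > ma_slow
    return above != above.shift(fill_value=False)    # sign change => crossover

def event_windows(prices: pd.Series, events: pd.Series, length: int = 30):
    """Collect fixed-length input windows that end at each detected event."""
    windows = []
    for idx in np.flatnonzero(events.to_numpy()):
        if idx + 1 >= length:
            windows.append(prices.iloc[idx + 1 - length : idx + 1].to_numpy())
    return np.stack(windows) if windows else np.empty((0, length))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prices = pd.Series(np.cumsum(rng.normal(size=500)) + 100.0)  # synthetic series
    ev = crossover_events(prices)
    X = event_windows(prices, ev, length=30)
    print(X.shape)                                   # (n_events, 30)
```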

6. Limitations, Open Questions, and Future Directions

Event-based LSTM models exhibit several constraints and avenues for further investigation:

  • Time-Gate Hyperparameters: Careful tuning of oscillation periods and open ratios is essential. Extremely low open ratios ($r_{\mathrm{on}} \ll 1$) risk starvation of updates.
  • Supervision Levels: Unsupervised event-detection models require post hoc heuristic mapping of event classes; availability of labels enables direct supervision but alters learning dynamics.
  • Scalability: While event-driven models drastically reduce computational load on sparse sequences, their advantage diminishes on fully dense, regularly sampled data.
  • Gradient Propagation: For spiking variants, surrogate gradients must be designed to match LSTM-like temporal dynamics. Two-compartment SNNs provably mitigate vanishing gradients, but may require additional hyperparameter sweeps for optimal coupling and reset magnitude.
  • Generalization to Unlabeled Modalities: Direct adaptation to new domains presumes the existence of reliably detectable events or transitions; the effectiveness of unsupervised event-branched ConvLSTM in unstructured environments is an open problem.

Event-based LSTM models represent a broad class of temporal neural architectures tailored to sparse, irregular, or content-driven sequences, encompassing unsupervised ConvLSTM for rare event detection (Phan et al., 2017), oscillatory time-gated Phased LSTM for continuous-time event processing (Neil et al., 2016), event-driven feature engineering for forecasting (Qi et al., 2021), and spiking-neuron LSTM hybrids for neuromorphic computing (Rezaabad et al., 2020, Zhang et al., 2023). Each variant demonstrates unique strengths across precision, convergence, energy efficiency, and adaptability to asynchronous or low-resource environments.
