Parallel Hybrid LSTM-GNN for Stock Prediction
- The paper demonstrates the model's superior prediction accuracy, achieving a 10.6% reduction in mean squared error relative to standalone LSTM networks.
- It combines parallel LSTM sequence modeling with GNN-based relational encoding to fuse temporal trends and inter-stock dependencies.
- Extensive experiments on ten large-cap stocks show robust convergence in 40–50 epochs and effective performance in real-time trading scenarios.
A parallel hybrid LSTM-GNN architecture is a composite deep learning system that simultaneously processes time-series data and relational data streams for enhanced stock price prediction accuracy. The architecture integrates Long Short-Term Memory (LSTM) networks—well-suited for modeling sequential dependencies in price movements—and Graph Neural Networks (GNNs), which encode polyadic relationships and nonlinear dependencies among stocks, as defined by correlation and association rule mining. The parallel design allows both analytic streams to operate concurrently, enabling direct fusion of temporal and relational embeddings in a downstream prediction head. Extensive experiments on historical stock datasets indicate a measurable performance advantage over both standalone LSTM networks and traditional neural benchmarks (Sonani et al., 19 Feb 2025).
1. LSTM Branch: Temporal Sequence Modeling
The LSTM component utilizes canonical gating mechanisms to learn representations of each stock's price history. Each time-step input $x_t$ passes through the following update equations (a hand-rolled code version follows the list):
- Input gate: $i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i)$
- Forget gate: $f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f)$
- Output gate: $o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o)$
- Candidate update: $\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)$
- Final cell and hidden state: $c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$, $\quad h_t = o_t \odot \tanh(c_t)$
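For concreteness, the gate equations above can be implemented directly. Below is a minimal PyTorch sketch of a single LSTM step; the stacked-weight layout and names are illustrative assumptions, not taken from the paper:

```python
import torch

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One canonical LSTM step. W, U, b stack the four gates
    (input, forget, candidate, output) along dim 0."""
    gates = x_t @ W.T + h_prev @ U.T + b          # (batch, 4*hidden)
    i, f, g, o = gates.chunk(4, dim=-1)
    i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
    g = torch.tanh(g)                             # candidate update
    c_t = f * c_prev + i * g                      # new cell state
    h_t = o * torch.tanh(c_t)                     # new hidden state
    return h_t, c_t
```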
Using daily closing prices $p_t$ from 2005–2023 per stock, the raw time series is normalized by min–max scaling: $\tilde{p}_t = (p_t - p_{\min}) / (p_{\max} - p_{\min})$. Sliding windows of sequence length $L$ (tested: $L = 11, 21$) are extracted and fed into stacked LSTM layers (2–3 layers), followed by a dense projection layer that yields temporal embeddings $h^{\text{LSTM}}$.
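A minimal sketch of this branch in PyTorch; layer widths, the embedding size, and names such as `LSTMBranch` are assumptions for illustration, not specifications from the paper:

```python
import torch
import torch.nn as nn

class LSTMBranch(nn.Module):
    """Temporal branch: stacked LSTM over a min-max-scaled price window,
    followed by a dense projection to an embedding vector."""
    def __init__(self, input_dim=1, hidden_dim=64, num_layers=2, embed_dim=32):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        self.proj = nn.Linear(hidden_dim, embed_dim)

    def forward(self, windows):                  # windows: (batch, L, 1)
        _, (h_n, _) = self.lstm(windows)
        return self.proj(h_n[-1])                # (batch, embed_dim)

def minmax_scale(prices):
    """Min-max normalization over the training series."""
    lo, hi = prices.min(), prices.max()
    return (prices - lo) / (hi - lo)

def sliding_windows(series, L):
    """Extract overlapping length-L windows; targets are next-day prices."""
    xs = torch.stack([series[i:i + L] for i in range(len(series) - L)])
    ys = series[L:]
    return xs.unsqueeze(-1), ys                  # (N, L, 1), (N,)
```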
2. GNN Branch: Relational Encoding
A stock correlation graph is constructed over the $N$ stocks (nodes). Edges are defined using both of the following criteria (a correlation-based construction is sketched after this list):
- Pearson correlation of daily returns $r_i, r_j$: $\rho_{ij} = \dfrac{\operatorname{cov}(r_i, r_j)}{\sigma_{r_i}\,\sigma_{r_j}}$. An edge $(i, j)$ is added if $|\rho_{ij}|$ exceeds a correlation threshold $\tau$.
- Apriori association rules: edges are added if a rule's lift exceeds the required threshold, subject to minimum support and confidence requirements.
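A minimal sketch of the correlation-based edge construction; the threshold value and helper name are assumptions, and the Apriori rule mining that the paper layers on top is omitted here:

```python
import numpy as np

def build_correlation_edges(returns, tau=0.5):
    """Build an undirected edge list from pairwise Pearson correlations.

    returns: (T, N) array of daily returns for N stocks.
    tau: correlation-magnitude threshold (illustrative value).
    """
    rho = np.corrcoef(returns.T)                 # (N, N) correlation matrix
    n = rho.shape[0]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if abs(rho[i, j]) >= tau]
    return edges
```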
GNN propagation uses a standard Graph Convolutional Network (GCN) with symmetric normalization: $H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2}\,\tilde{A}\,\tilde{D}^{-1/2}\,H^{(l)} W^{(l)}\right)$, where $\tilde{A} = A + I$, $\tilde{D}$ is the degree matrix of $\tilde{A}$, and $\sigma$ denotes ReLU in hidden layers.
Initial node features combine the normalized closing price, 50- and 200-day moving averages, recent daily returns, and normalized trading volume. After two GCN layers with dropout of 0.5, the final node embeddings $h^{\text{GNN}}$ are generated.
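A minimal sketch of this branch using PyTorch Geometric; the feature and embedding dimensions are illustrative assumptions:

```python
import torch.nn.functional as F
from torch import nn
from torch_geometric.nn import GCNConv

class GNNBranch(nn.Module):
    """Relational branch: two GCN layers with symmetric normalization
    (handled internally by GCNConv) and 0.5 dropout between them."""
    def __init__(self, in_dim=5, hidden_dim=64, embed_dim=32, p_drop=0.5):
        super().__init__()
        # in_dim=5 assumes: normalized close, MA50, MA200, return, volume.
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, embed_dim)
        self.p_drop = p_drop

    def forward(self, x, edge_index):            # x: (N, in_dim)
        h = F.relu(self.conv1(x, edge_index))
        h = F.dropout(h, p=self.p_drop, training=self.training)
        return self.conv2(h, edge_index)         # (N, embed_dim)
```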
3. Fusion Strategy: Embedding Concatenation and Prediction
The hybrid model maintains parallel branches:
- Branch 1: the LSTM encodes the temporal trajectory of each stock $i$, producing $h_i^{\text{LSTM}}$.
- Branch 2: the GNN encodes the stock graph, resulting in $h_i^{\text{GNN}}$.
Fusion occurs via vector concatenation: $z_i = \left[\, h_i^{\text{LSTM}} \,\Vert\, h_i^{\text{GNN}} \,\right]$. This fused representation passes through two fully connected layers with ReLU, then a linear readout layer for the next-day price prediction $\hat{p}_{t+1}$. Gradient flow from the MSE loss propagates through both branches simultaneously, ensuring end-to-end learning.
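A minimal sketch of the fusion and prediction head, reusing the `LSTMBranch` and `GNNBranch` modules sketched above; layer widths and the class name `HybridModel` are assumptions:

```python
import torch
from torch import nn

class HybridModel(nn.Module):
    """Parallel hybrid: concatenate per-stock LSTM and GNN embeddings,
    then predict the next-day price through a small MLP head."""
    def __init__(self, lstm_branch, gnn_branch, embed_dim=32, fc_dim=64):
        super().__init__()
        self.lstm_branch = lstm_branch
        self.gnn_branch = gnn_branch
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, fc_dim), nn.ReLU(),
            nn.Linear(fc_dim, fc_dim), nn.ReLU(),
            nn.Linear(fc_dim, 1),                # linear readout
        )

    def forward(self, windows, x, edge_index, stock_ids):
        h_t = self.lstm_branch(windows)          # (batch, embed_dim)
        h_g = self.gnn_branch(x, edge_index)     # (N, embed_dim)
        z = torch.cat([h_t, h_g[stock_ids]], dim=-1)
        return self.head(z).squeeze(-1)          # next-day price per sample
```

Because the loss is computed on the head's output, backpropagation updates both branches jointly, which is the end-to-end property the paper emphasizes.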
4. Training Protocol and Optimization
An expanding window validation protocol is employed:
- Initial training uses two years of data.
- Prediction is conducted one day at a time for a subsequent 50-day out-of-sample period.
- After each prediction, the observed price is appended to the training set and the model is retrained or fine-tuned before the next prediction. This emulates a real-time trading environment and prevents data leakage (the loop is sketched below).
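A minimal sketch of this expanding-window loop; `train_model` and `predict_next` are placeholders for the fitting and inference steps, not APIs from the paper:

```python
def expanding_window_eval(series, initial_days, horizon=50):
    """Walk-forward evaluation: retrain, predict one day ahead, then fold
    the observed price back into the training set before the next step."""
    predictions = []
    train = list(series[:initial_days])
    for t in range(initial_days, initial_days + horizon):
        model = train_model(train)               # retrain/fine-tune on all data so far
        predictions.append(predict_next(model, train))
        train.append(series[t])                  # append observed price: no leakage
    return predictions
```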
Key training specifications:
- Loss function: Mean Squared Error (MSE)
- LSTM batch size: 11
- Optimizer: Adam, learning rate 0.005
- Epochs: 10–50, with early stopping (patience 5, minimum-improvement criterion; a minimal skeleton follows this list)
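A minimal training skeleton consistent with these settings; `min_delta` is a placeholder because the paper's minimum-improvement value is unspecified, and the model, loaders, and `evaluate` routine are caller-supplied:

```python
import torch

def train_with_early_stopping(model, train_loader, val_loader, evaluate,
                              max_epochs=50, patience=5, min_delta=1e-5):
    """Adam + MSE training loop with the paper's early-stopping settings."""
    optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
    loss_fn = torch.nn.MSELoss()
    best, bad = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        for xb, yb in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
        val_loss = evaluate(model, val_loader)   # caller-supplied validation metric
        if val_loss < best - min_delta:
            best, bad = val_loss, 0
        else:
            bad += 1
            if bad >= patience:                  # patience 5, per the paper
                break
    return model
```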
Implementation utilizes PyTorch and PyTorch Geometric on a single NVIDIA GTX 1080 GPU and an Intel i7 CPU, without explicit multi-GPU or model/data parallelism. Computations for the LSTM and GNN branches run in parallel per minibatch.
5. Empirical Performance and Comparative Analysis
Experimental results on ten large-cap stocks (2005–2023) reveal the hybrid model's performance advantage:
| Model | MSE | Hybrid's Relative MSE Reduction |
|---|---|---|
| Hybrid LSTM–GNN | 0.00144 | — |
| Standalone LSTM | 0.00161 | 10.6% |
| Linear Regression | 0.00224 | — |
| CNN | 0.00302 | — |
| DNN | 0.00335 | — |
Convergence is typically achieved within 40–50 epochs. Early stopping mitigates overfitting and computational waste. The hybrid architecture demonstrates robustness except on specific volatile dates (Nov 10 and Nov 30, 2022).
Figures in (Sonani et al., 19 Feb 2025) substantiate accuracy improvements across all monitored stocks, with bar charts and heatmaps illustrating prediction precision.
6. Parallelism and Scalability Considerations
Although the described implementation does not utilize explicit multi-GPU parallelism, the separation of LSTM and GNN into distinct analytic branches naturally facilitates model-parallel strategies. In a scalable setting, LSTM and GNN computations could be assigned to separate GPUs, with embedding fusion performed via cross-device communication. End-to-end training remains feasible and may benefit from increased resource allocation. This suggests practical extensibility to larger stock universes and continual learning scenarios.
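A hedged sketch of such a model-parallel assignment in PyTorch, reusing the branch modules sketched earlier; the device placement goes beyond the paper's single-GPU implementation and is purely illustrative:

```python
import torch
from torch import nn

# Hypothetical two-GPU placement (not part of the paper's setup).
dev0, dev1 = torch.device("cuda:0"), torch.device("cuda:1")
lstm_branch = LSTMBranch().to(dev0)              # temporal branch on GPU 0
gnn_branch = GNNBranch().to(dev1)                # relational branch on GPU 1
head = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1)).to(dev0)

def fused_forward(windows, x, edge_index, stock_ids):
    h_t = lstm_branch(windows.to(dev0))                # runs on GPU 0
    h_g = gnn_branch(x.to(dev1), edge_index.to(dev1))  # runs on GPU 1
    # Cross-device fusion: bring the GNN embeddings over to the head's device.
    h_g_sel = h_g[stock_ids.to(dev1)].to(dev0)
    return head(torch.cat([h_t, h_g_sel], dim=-1))
```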
7. Significance and Applicability
The parallel hybrid LSTM-GNN architecture, as rigorously evaluated in (Sonani et al., 19 Feb 2025), demonstrates empirically improved accuracy and robustness in next-day stock price prediction over established baselines. By fusing temporal and relational analysis, the model addresses both per-stock sequential dynamics and nonlinear inter-stock dependencies, offering an adaptable tool for real-time financial analytics. A plausible implication is that similar architectures may generalize to other domains where temporal and relational phenomena are coupled, such as traffic forecasting and resource optimization.