This paper, "Temporal Relational Ranking for Stock Prediction" (Feng et al., 2018), addresses the problem of stock prediction by formulating it as a ranking task rather than a traditional classification or regression problem. The authors argue that existing methods that predict price movements or values for individual stocks are suboptimal for the actual goal of investment: selecting the best-performing stocks from a group. Furthermore, these methods often ignore the rich and dynamic relationships between stocks and companies. To overcome these limitations, the paper proposes the Relational Stock Ranking (RSR) framework, which integrates temporal modeling of individual stock data with relation-aware processing using a novel Temporal Graph Convolution (TGC) component.
The core idea is to predict a ranked list of stocks based on their expected return ratio for the next trading day. This contrasts with predicting the exact price or the direction of movement (up/down). The input to the model on day $t$ consists of historical time series for $N$ stocks over the past $S$ days, represented as a tensor $\mathcal{X}^t \in \mathbb{R}^{N \times S \times D}$, where $D$ is the feature dimension per day (e.g., price, moving averages). Additionally, it utilizes a tensor $\mathcal{A} \in \mathbb{R}^{N \times N \times K}$ encoding $K$ different types of pairwise relations between stocks.
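As an illustration, the per-stock feature matrix $X_i^t \in \mathbb{R}^{S \times D}$ might be built from closing prices and moving averages as sketched below. The specific window sizes and the normalization by the last closing price are assumptions for this sketch, not details confirmed by the paper.

```python
import numpy as np

def build_features(close: np.ndarray, windows=(5, 10, 20, 30)) -> np.ndarray:
    """Build a per-day feature matrix from a 1-D array of closing prices.

    Features per day: the closing price plus one trailing moving average per
    window, all divided by the final closing price so features are scale-free.
    Window sizes are illustrative assumptions.
    """
    T = len(close)
    feats = [close]
    for w in windows:
        # Trailing moving average; the warm-up region uses the running mean.
        ma = np.array([close[max(0, t - w + 1): t + 1].mean() for t in range(T)])
        feats.append(ma)
    X = np.stack(feats, axis=1)   # shape (T, 1 + len(windows)) = (S, D)
    return X / close[-1]          # normalize by the latest closing price
```

Stacking these matrices across all $N$ stocks yields the input tensor for one trading day.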
The RSR framework consists of three main layers:
- Sequential Embedding Layer: This layer processes the historical time series of each stock independently to capture temporal dependencies. The authors choose Long Short-Term Memory (LSTM) networks for this purpose, owing to their ability to model long-term dependencies. For each stock $i$, the historical data $X_i^t \in \mathbb{R}^{S \times D}$ is fed into an LSTM, and the final hidden state $e_i^t \in \mathbb{R}^U$ is taken as the sequential embedding, where $U$ is the embedding size. The output of this layer is the matrix $E^t \in \mathbb{R}^{N \times U}$.
Implementation Note: A separate LSTM instance could be trained for each stock, or parameters could be shared across all stocks. The paper implies parameter sharing by stating "LSTM cells... depicted in the same layer share the same parameters." This approach is more computationally efficient and allows the model to learn general temporal patterns applicable across different stocks.
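A minimal NumPy sketch of the shared-parameter design is below: one set of LSTM weights encodes every stock's sequence into its sequential embedding. This is an illustrative implementation of a standard LSTM cell, not the authors' code; class and variable names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class SharedLSTM:
    """Minimal LSTM whose parameters are shared across all stocks, mirroring
    the paper's note that cells in the same layer share weights. A sketch,
    not a production implementation."""

    def __init__(self, input_dim: int, hidden_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix for the four gates (input, forget, cell, output).
        self.W = rng.normal(0, 0.1, (4 * hidden_dim, input_dim + hidden_dim))
        self.b = np.zeros(4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def encode(self, X: np.ndarray) -> np.ndarray:
        """X: (num_stocks, seq_len, input_dim) -> (num_stocks, hidden_dim).
        The same W and b are applied to every stock's sequence."""
        N, S, _ = X.shape
        h = np.zeros((N, self.hidden_dim))
        c = np.zeros((N, self.hidden_dim))
        for t in range(S):
            z = np.concatenate([X[:, t, :], h], axis=1) @ self.W.T + self.b
            i, f, g, o = np.split(z, 4, axis=1)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h  # final hidden state = sequential embedding e_i^t
```

Sharing weights keeps the parameter count independent of the number of stocks, which matters when ranking thousands of tickers.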
- Relational Embedding Layer: This is where stock relations are incorporated in a time-sensitive manner using the proposed Temporal Graph Convolution (TGC). The goal is to revise the sequential embeddings $E^t$ into relational embeddings $\bar{E}^t$ that account for the influence of related stocks. The core mechanism is a weighted embedding propagation process:

$$\bar{e}_i^t = \sum_{\{j \,\mid\, \mathrm{sum}(a_{ji}) > 0\}} \frac{g(a_{ji}, e_i^t, e_j^t)}{d_j} \, e_j^t,$$

where $a_{ji} \in \{0,1\}^K$ is the multi-hot binary vector encoding relations from stock $j$ to stock $i$, $d_j$ is a normalization term (e.g., the degree of stock $j$), and $g(\cdot)$ is a time-sensitive relation-strength function. This function estimates the strength of the connection between stocks $i$ and $j$ at time $t$, taking into account their sequential embeddings and the relation vector $a_{ji}$. Two variants of $g(\cdot)$ are proposed:
- Explicit Modeling: $g(a_{ji}, e_i^t, e_j^t) = (e_i^t)^{\top} e_j^t \times \phi(w^{\top} a_{ji} + b)$, with learned parameters $w$ and $b$. This combines a similarity measure between the two embeddings (inner product) with a learned importance score for the relation type.
- Implicit Modeling: $g(a_{ji}, e_i^t, e_j^t) = \phi\big(w^{\top} \big[(e_i^t)^{\top}, (e_j^t)^{\top}, a_{ji}^{\top}\big]^{\top} + b\big)$. This uses a feed-forward mapping over the concatenation of the two embeddings and the relation vector to learn the interaction strength. The outputs of $g(\cdot)$ are normalized (e.g., with a softmax over each stock's neighbors) before weighting the propagation. TGC generalizes standard Graph Convolutional Networks (GCNs) by allowing the influence strength between nodes to be dynamic and dependent on the node features (stock embeddings) at each time step, making it suitable for volatile markets.
- Prediction Layer: The sequential embedding $e_i^t$ and the relational embedding $\bar{e}_i^t$ of each stock are concatenated and fed into a fully connected (FC) layer to predict the ranking score $\hat{r}_i^{t+1}$.
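The relational embedding layer with the explicit relation-strength function can be sketched as one vectorized propagation step. This is a simplified sketch under assumed shapes (softmax normalization over neighbors, a single scalar bias); it is not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def tgc_explicit(E: np.ndarray, A: np.ndarray, w: np.ndarray, b: float = 0.0) -> np.ndarray:
    """One Temporal Graph Convolution step with the explicit strength function
    g = (e_i . e_j) * relation_importance(a_ji).

    E: (N, U) sequential embeddings; A: (N, N, K) multi-hot relation tensor;
    w: (K,) learned relation-importance weights. Returns (N, U) relational
    embeddings; stocks with no relations get a zero vector."""
    sim = E @ E.T                       # (N, N) pairwise inner products
    importance = A @ w + b              # (N, N) learned relation score
    mask = A.sum(axis=2) > 0            # propagate only along existing relations
    g = np.where(mask, sim * importance, -1e30)   # mask out non-neighbors
    weights = softmax(g, axis=1) * mask           # normalize over neighbors
    return weights @ E                            # weighted embedding propagation
```

Because the strengths depend on the current embeddings $E^t$, the same static relation tensor yields different propagation weights on different days, which is the time-sensitivity the paper emphasizes.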
The model is trained using a combined loss function that includes both a pointwise mean squared error (MSE) term and a pairwise max-margin ranking term:

$$l = \sum_{i=1}^{N} \big(\hat{r}_i^{t+1} - r_i^{t+1}\big)^2 + \alpha \sum_{i=1}^{N} \sum_{j=1}^{N} \max\Big(0,\, -\big(\hat{r}_i^{t+1} - \hat{r}_j^{t+1}\big)\big(r_i^{t+1} - r_j^{t+1}\big)\Big).$$

The MSE term forces the predicted score $\hat{r}_i^{t+1}$ to be close to the ground-truth return ratio $r_i^{t+1}$, while the pairwise term enforces the correct relative order of predicted scores for pairs of stocks, penalizing cases where the predicted order contradicts the ground-truth order. The ground truth is the 1-day return ratio $r_i^{t+1} = (p_i^{t+1} - p_i^t)/p_i^t$, where $p_i^t$ is the closing price of stock $i$ on day $t$.
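The combined objective above can be written compactly with broadcasting, as in this sketch (the balancing weight `alpha` corresponds to $\alpha$; averaging instead of summing over pairs is a simplification):

```python
import numpy as np

def ranking_loss(pred: np.ndarray, truth: np.ndarray, alpha: float = 1.0) -> float:
    """Pointwise MSE plus pairwise max-margin ranking loss over all stock pairs.
    pred, truth: (N,) predicted scores and ground-truth return ratios."""
    mse = np.mean((pred - truth) ** 2)
    # Pairwise term: penalize pairs whose predicted order contradicts the truth.
    dp = pred[:, None] - pred[None, :]    # predicted score differences
    dt = truth[:, None] - truth[None, :]  # ground-truth return differences
    pairwise = np.maximum(0.0, -dp * dt).mean()
    return mse + alpha * pairwise
```

When a pair is ordered correctly, `dp * dt > 0` and the hinge term vanishes; a contradicted pair contributes a penalty proportional to both difference magnitudes.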
The data used for experiments includes historical daily closing prices and calculated moving averages for NASDAQ and NYSE stocks from 2013 to 2017. Stocks trading below \$5 or with intermittent data were filtered out. Relation data includes sector-industry classifications and company relations extracted from Wikidata, such as ownership, subsidiary, and product/material produced. These relations are encoded as a multi-hot vector $a_{ij}$ for each ordered pair of stocks $(i, j)$.
Experiments were conducted using a back-testing strategy over the 2017 data, simulating daily trading decisions based on the model's predictions. Performance is evaluated using MSE, Mean Reciprocal Rank (MRR), and cumulative Investment Return Ratio (IRR).
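A daily Top-$k$ back-test of the kind described can be sketched as follows. This is a simplified illustration (equal weighting, cumulative sum of daily returns, no transaction costs); the paper's exact accounting may differ.

```python
import numpy as np

def backtest_topk(scores: np.ndarray, returns: np.ndarray, k: int = 1) -> float:
    """Cumulative return of buying the k top-ranked stocks each day with equal
    weight and selling the next day.

    scores, returns: (num_days, num_stocks); returns[d] holds each stock's
    realized 1-day return ratio for the position opened on day d."""
    irr = 0.0
    for day_scores, day_returns in zip(scores, returns):
        top = np.argsort(day_scores)[-k:]   # indices of the k highest scores
        irr += day_returns[top].mean()      # equal-weight daily return
    return irr
```

Sweeping `k` over {1, 5, 10} reproduces the Top1/Top5/Top10 strategy comparison discussed below.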
Key findings from the experiments:
- Ranking Formulation (RQ1): Ranking-based methods (Rank_LSTM, RSR) significantly outperform regression-based methods (SFM, LSTM) in terms of IRR, validating the shift in problem formulation towards direct investment goals.
- Impact of Stock Relations (RQ2): Incorporating stock relations via TGC (RSR_E, RSR_I) generally leads to higher IRR compared to methods that ignore relations (Rank_LSTM) or use static graph models (GBR, GCN), especially on the NYSE market. This highlights the value of modeling stock interdependencies and the effectiveness of TGC in capturing temporal dynamics in these relations. The choice of relation type matters; industry relations were less effective on the volatile NASDAQ market compared to NYSE.
- Back-testing Strategies (RQ3): Buying the single top-ranked stock (Top1 strategy) often yields the highest IRR compared to diversifying across top-5 or top-10 stocks, suggesting the model is relatively good at identifying the single best performer. The achieved IRR with RSR_I is competitive with market indices (S&P 500, DJI) during the tested bullish market period, demonstrating practical applicability.
Implementation considerations for applying this research:
- Data Pipeline: A robust data pipeline is needed to collect, clean, and preprocess historical price data and company relation data. Mapping company entities to stock tickers is crucial for integrating relation data.
- Relation Extraction: Acquiring comprehensive and accurate relation data is challenging. Public sources like Wikidata are a starting point, but private databases or graph extraction from text (news, filings) might be necessary for richer relations.
- Scalability: Training involves processing time series for thousands of stocks and graph convolutions on a large graph daily. This requires significant computational resources, likely distributed computing and multiple GPUs, especially as the number of stocks and relation types grow.
- Dynamic Graph: While TGC is time-sensitive, the relation graph structure itself was static in this work (pre-collected relations). Real-world implementations might explore dynamically updating the graph structure or relation types based on events.
- Hyperparameter Sensitivity: The performance varied significantly across different runs and hyperparameters. Careful tuning on a robust validation set is essential.
- Trading Strategy: The paper uses a simple strategy. A real-world trading system would require integrating this model's output into a more sophisticated portfolio management and risk control framework. The model provides ranking scores, but converting these to specific buy/sell/hold signals, position sizing, and risk limits is an additional engineering challenge.
- Market Regimes: The experiments were conducted during a bullish market. The model's performance and optimal strategy might differ significantly in bearish or sideways markets. Continuous monitoring and potential adaptation/retraining are important.
- Over-fitting: With many parameters and potentially noisy financial data, overfitting is a risk. Regularization techniques and careful validation are necessary.
Overall, the paper presents a practical deep learning framework for stock ranking that successfully leverages temporal features and dynamic stock relations. The Temporal Graph Convolution component is a key novelty, offering a method to inject structured relational knowledge into sequential modeling in a time-sensitive way. The experimental results demonstrate the potential of this approach to achieve profitable trading performance relative to baselines and market indices, while also highlighting areas for future improvement such as risk management integration and handling diverse market conditions.