Applications of deep learning in stock market prediction: recent progress (2003.01859v1)

Published 29 Feb 2020 in q-fin.ST and cs.LG

Abstract: Stock market prediction has been a classical yet challenging problem, attracting attention from both economists and computer scientists. With the purpose of building an effective prediction model, both linear and machine learning tools have been explored for the past couple of decades. Lately, deep learning models have been introduced as new frontiers for this topic, and development has been too rapid to keep up with. Hence, our motivation for this survey is to give an up-to-date review of recent works on deep learning models for stock market prediction. We categorize not only the different data sources, various neural network structures, and commonly used evaluation metrics, but also the implementation and reproducibility. Our goal is to help interested researchers synchronize with the latest progress and also help them easily reproduce previous studies as baselines. Based on the summary, we also highlight some future research directions in this topic.

Citations (426)

Summary

The paper presents an extensive survey of 124 studies applying deep learning to forecast stock market trends.
It details a four-step workflow from data collection to model evaluation, emphasizing techniques like imputation, denoising, and feature extraction.
It highlights emerging models such as GANs, GNNs, and hybrid architectures while addressing challenges in result reproducibility.

Stock market prediction is a classical and challenging problem attracting researchers from economics and computer science. The efficient market hypothesis suggests unpredictability, but many studies disagree and aim to build effective prediction models. Historically, this involved fundamental and technical analysis, followed by linear models like ARIMA and GARCH, and traditional machine learning like Logistic Regression and SVM.

With the advent of big data, powerful GPUs, and new neural network architectures, deep learning has emerged as a frontier in this field. This paper surveys recent progress (primarily 2017-2019) in applying deep learning to stock market prediction, aiming to provide an overview of data sources, model structures, evaluation metrics, and practical aspects like implementation and reproducibility. The goal is to help researchers catch up and reproduce prior work.

The survey covers 124 papers, focusing on predicting the close price of individual stocks and market indexes. The prediction task is often framed as either a regression problem (predicting price value) or a classification problem (predicting price movement direction, e.g., up/down). Daily prediction is more common (105 papers) than intraday (18 papers), likely due to data availability. The most studied markets include the US, China, Hong Kong, and Japan.

The prediction process is summarized in a four-step workflow: Raw Data, Data Processing, Prediction Model, and Model Evaluation.

Raw Data

Various data types are used:

Market data: Open/high/low/close prices, volume, etc. Most common, serving as both input features and prediction targets. Provides large sample sizes.
Text data: Social media, news, web searches. Used as alternative data, often processed for sentiment analysis. Second most common.
Macroeconomics data: CPI, GDP, etc. Reflect overall economic health.
Knowledge graph data: Relationships between companies/markets (e.g., supplier-consumer, ownership). Used with Graph Neural Networks.
Image data: Candlestick charts as input images. Less common.
Fundamental data: Quarterly accounting data (assets, liabilities). Less used due to low frequency and reporting delays.
Analytics data: Reports from investment banks/research firms. Rarely used due to cost and sparsity.

Recent years show increased use of data types beyond market data, which suggests diminishing returns from market data alone and reflects the development of new tools for other data types. Data length varies: intraday studies often use less than a year of data, while daily studies use longer periods. The "lag" is the length of historical data used as input for a prediction, and the "horizon" is how far into the future the prediction reaches (typically short-term, such as one day).
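To make the lag and horizon definitions concrete, the following is a minimal sketch of how a close-price series could be turned into supervised samples. The function name `make_samples` and the choice to emit both a regression target (future price) and a classification target (up/down) are illustrative assumptions, not something prescribed by the survey.

```python
from typing import List, Tuple

Sample = Tuple[List[float], float, int]

def make_samples(closes: List[float], lag: int, horizon: int = 1) -> List[Sample]:
    """Turn a close-price series into (window, future_price, direction) samples.

    window       : the `lag` most recent closes up to and including day t
    future_price : close at day t + horizon (regression target)
    direction    : 1 if future_price > close at day t, else 0 (classification target)
    """
    if lag < 1 or horizon < 1:
        raise ValueError("lag and horizon must be positive")
    samples: List[Sample] = []
    # t is the index of the last observed day in each window
    for t in range(lag - 1, len(closes) - horizon):
        window = closes[t - lag + 1 : t + 1]
        future = closes[t + horizon]
        direction = 1 if future > closes[t] else 0
        samples.append((window, future, direction))
    return samples

# Hypothetical prices, for illustration only
prices = [10.0, 11.0, 10.5, 12.0, 11.5, 13.0]
for window, target, label in make_samples(prices, lag=3, horizon=1):
    print(window, target, label)
```

Each window ends at the last observed day, so the target always lies strictly in the future relative to the inputs.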

Data Processing

Several preprocessing steps are common:

Missing Data Imputation: Handling gaps, especially when combining data with different frequencies. Forward filling (carrying the last observed value forward) is used because it relies only on past values and so avoids data leakage.
Denoising: Techniques like Wavelet Transform or kNN classifiers are used to reduce noise in market data.
Feature Extraction: Transforming raw data into features.
- Market data: Technical indicators (moving average, MACD, RSI) are widely used, sometimes converted into image formats like candlestick charts.
- Text data: NLP techniques are applied. Bag-of-Words (BoW) is a basic approach. Word embeddings (word2vec, GloVe) are popular for representing words as vectors. Event extraction and sentiment analysis (using CNN, NLTK, or custom models like Stock2Vec) are also used to derive features.
- Knowledge graph data: Embedding models like TransE are used.
Dimensionality Reduction: Addressing feature correlation or high dimensionality.
- Transformation: PCA, ICA, autoencoders, RBM, EMD, SMC are used to project data into lower dimensions.
- Feature Selection: Techniques like Chi-square, MRMR, RSAR, ACF, PCF, ANOVA, and MICFS select a subset of relevant features.
Feature Normalization & Standardization: Scaling features to a consistent range (e.g., [0, 1] or [-1, 1]) or using z-scores to improve model training.
Data Split: Dividing data for training, validation, and testing. Rolling (sliding) windows or successive training sets are common for time series data.
Data Augmentation: Less common than in image tasks, but some methods exist, such as clustering stocks or combining data from correlated companies to increase training samples.
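Three of the steps listed above (forward-fill imputation, z-score standardization, and rolling-window splitting) can be sketched in plain Python as follows. All function names are hypothetical, and estimating the mean and standard deviation on the training split only is an assumption added here for leakage safety; the summary itself does not specify how the statistics are computed.

```python
from statistics import mean, pstdev
from typing import Iterator, List, Optional, Tuple

def forward_fill(values: List[Optional[float]]) -> List[Optional[float]]:
    """Replace each None with the most recent earlier observation.

    Only past values are used, so no future information leaks backward.
    Leading gaps stay None because nothing has been observed yet.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def zscore_with_train_stats(
    train: List[float], test: List[float]
) -> Tuple[List[float], List[float]]:
    """Standardize both splits with mean/std estimated on the training split."""
    mu, sigma = mean(train), pstdev(train)
    if sigma == 0:
        raise ValueError("training split has zero variance")

    def scale(xs: List[float]) -> List[float]:
        return [(x - mu) / sigma for x in xs]

    return scale(train), scale(test)

def rolling_splits(
    n: int, train_size: int, test_size: int
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) for a window sliding over n ordered samples."""
    start = 0
    while start + train_size + test_size <= n:
        split = start + train_size
        yield range(start, split), range(split, split + test_size)
        start += test_size  # slide forward by one test block

print(forward_fill([None, 2.5, None, None, 2.7]))
for train_idx, test_idx in rolling_splits(10, train_size=4, test_size=2):
    print(list(train_idx), list(test_idx))
```

In every split the test indices come strictly after the training indices, which preserves the time order that a random shuffle would destroy.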

Historical prices and technical indicators are the most frequent input features, followed by text and macroeconomics data.
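As an illustration of how technical indicators are derived from historical prices, here is a minimal sketch of a simple moving average and MACD. The 12/26/9 spans are the conventional MACD defaults, and seeding the exponential average with the first price is one of several conventions; both are assumptions here rather than details taken from the survey.

```python
from typing import List, Optional, Tuple

def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` observations are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]  # drop the value that left the window
        out.append(running / window if i >= window - 1 else None)
    return out

def ema(prices: List[float], span: int) -> List[float]:
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first price."""
    alpha = 2.0 / (span + 1)
    out: List[float] = []
    for p in prices:
        out.append(p if not out else alpha * p + (1 - alpha) * out[-1])
    return out

def macd(
    prices: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float]]:
    """Return (macd_line, signal_line): fast EMA minus slow EMA, and its own EMA."""
    fast_ema, slow_ema = ema(prices, fast), ema(prices, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    return macd_line, ema(macd_line, signal)

print(sma([1.0, 2.0, 3.0, 4.0, 5.0], window=3))
```

Both indicators use only prices up to the current day, so they can be fed to a model as features without looking ahead.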

Prediction Model

Supervised learning is the dominant approach. Models are categorized into standard deep learning models, hybrid models, and other models.

Standard Models:
- Feedforward Neural Networks (FFNN): Includes basic ANN, DNN (MLP), Autoencoders (AE, SAEs), DBN, ELM, DIDLNN, STEFNN, RBFNN. Used early but still prevalent.
- Convolutional Neural Networks (CNN): Adapted from image processing (1D CNN, Inception Module) to extract spatial features from time series or image representations (candlestick charts).
- Recurrent Neural Networks (RNN): Designed for sequential data, including LSTM, GRU, BiLSTM, BGRU. Widely used due to their ability to capture temporal dependencies. Variants incorporate attention mechanisms.
Traditional Baselines: LR, ARIMA, GARCH, Logit, SVM/SVR, kNN are frequently used for comparison. Simple trading strategies (Buy&Hold, Momentum, MACD, RSI, SMA based) also serve as baselines for profitability evaluation.
Hybrid Models: Combine different model types.
- Deep Learning + Traditional ML: Examples include combining RBM with SVM, RNN with AdaBoost, or LSTM with ARIMA.
- Deep Learning + Deep Learning: Common combinations include CNN and RNN structures (e.g., CNN-LSTM) or different RNN types (e.g., LSTM Autoencoder + Stacked LSTM). These aim to leverage the strengths of different architectures.
Other Models: Represent emerging approaches.
- Generative Adversarial Networks (GAN): Used for adversarial training to predict high-frequency data [zhou2018stock], [zhang2019stock].
- Graph Neural Networks (GNN): Utilize relationships between companies/markets encoded in graphs (e.g., GCNN, HGAN, TGC) [chen2018incorporating, feng2019temporal, kim2019hats, matsunaga2019exploring].
- Capsule Network: An alternative to CNN pooling, introduced for text-based stock prediction [liu2019transformer].
- Reinforcement Learning: Trains agents to make trading decisions to maximize cumulative rewards [deng2016deep, lee2019global, xiong2018practical].
- Transfer Learning: Adapting pre-trained models from one set of stocks or market to another [hoseinzade2019u, merello2019ensemble, nguyen2019novel].

RNN models are the most used, though their proportion decreased in 2019 as new model types emerged. Adam is the most popular optimizer. Deep learning models are increasingly used as baselines.
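Since LSTM-style recurrent models dominate the surveyed work, a single LSTM time step is sketched below in plain Python to show how the gates carry information across a price window. This is the standard LSTM cell formulation, not the architecture of any particular surveyed paper; the parameter layout (`params` mapping gate names to `(W, U, b)` triples) is a hypothetical choice, and in practice a framework such as Keras or PyTorch would be used instead.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (input weights W, recurrent weights U, bias b)

def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def _affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W @ x + U @ h + b for small dense matrices stored as nested lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]

def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, GateParams]
) -> Tuple[Vector, Vector]:
    """One LSTM time step; returns the new hidden state h and cell state c."""

    def gate(name: str, activation: Callable[[float], float]) -> Vector:
        W, U, b = params[name]
        return [activation(z) for z in _affine(W, x, U, h_prev, b)]

    i = gate("i", _sigmoid)   # input gate: how much new information to admit
    f = gate("f", _sigmoid)   # forget gate: how much old cell state to keep
    o = gate("o", _sigmoid)   # output gate: how much cell state to expose
    g = gate("g", math.tanh)  # candidate cell state
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

# Usage with all-zero (untrained) parameters over a hypothetical window of returns
hidden, n_inputs = 2, 1
params = {
    name: ([[0.0] * n_inputs for _ in range(hidden)],
           [[0.0] * hidden for _ in range(hidden)],
           [0.0] * hidden)
    for name in "ifog"
}
h, c = [0.0] * hidden, [1.0] * hidden
for daily_return in [0.01, -0.02, 0.005]:
    h, c = lstm_step([daily_return], h, c, params)
print(h, c)
```

The final hidden state `h` would typically be passed to a small output layer that produces either a price (regression) or an up/down probability (classification).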

Model Evaluation

Evaluation metrics fall into four categories:

Classification Metrics: For directional prediction: Accuracy, Precision, Recall, F1 score, MCC, AUC, Hit Ratio. Accuracy and F1 are most common.
Regression Metrics: For price value prediction: MAE, MSE, RMSE, MAPE, R². RMSE and MAPE are most common.
Profit Analysis: Evaluating trading strategies based on predictions: Return, Maximum Drawdown, Annualized Volatility, Sharpe Ratio. This provides economic significance beyond prediction accuracy.
Significance Analysis: Statistical tests (Kruskal-Wallis, Diebold-Mariano) to determine if performance differences are statistically significant. Less common but increasing in use.
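The classification, regression, and profit metrics listed above can be sketched with the standard library alone. The function names are illustrative, labels are assumed to be 0/1 with 1 meaning "up", and the Sharpe ratio uses a common convention (sample standard deviation, 252 trading days per year) that individual papers may not share.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

def _confusion(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1:
            fp += 1
        elif t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, fn, _ = _confusion(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    tp, fp, fn, tn = _confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))

def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))

def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst

def sharpe_ratio(returns: Sequence[float], risk_free: float = 0.0,
                 periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio from per-period returns (needs at least two returns)."""
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    return mean(excess) / sd * math.sqrt(periods_per_year) if sd else 0.0

print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]), f1_score([1, 0, 1, 1], [1, 0, 0, 1]))
```

A model can score well on accuracy or RMSE and still lose money after drawdowns, which is why the profit metrics are reported alongside the prediction metrics.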

Implementation and Reproducibility

Implementation: Python is the dominant programming language, with Keras and TensorFlow being the most used deep learning frameworks, followed by PyTorch. GPUs (various NVIDIA models) are crucial for computation. Explicit mention of cloud computing is rare.
Result Reproducibility: The survey investigates both data and code availability. Free data sources like Yahoo! Finance, Tushare, IMF, World Bank, news websites, and social media APIs are common. Commercial sources and research databases (CSMAR, WRDS) also exist. Data competition platforms like Kaggle serve as data repositories. Public availability of the exact datasets used is limited, but some papers provide links to their data or code repositories (Table 5 and Table 6 in the paper list specific examples). Code is shared even less often than data.

Future Directions

The survey highlights several promising areas for future research:

New Models: Further exploring architectures like Transformer and BERT for text data, or other novel neural networks.
Multiple Data Sources: Combining and effectively integrating more diverse data types beyond market data.
Cross-market Analysis: Leveraging similarities and dependencies between different stock markets using techniques like transfer learning.
Algorithmic Trading: Developing practical trading systems based on predictions, incorporating realistic trading costs, market microstructure effects, and dynamic strategy adaptation, potentially using deep reinforcement learning.

In conclusion, the paper provides a comprehensive overview of deep learning applications in stock market prediction, detailing the typical workflow, prevalent methods, evaluation approaches, and practical considerations. It emphasizes the increasing diversity of data sources and model architectures while pointing out the need for improved reproducibility and more practical algorithmic trading implementations.