Using Reinforcement Learning in the Algorithmic Trading Problem (2002.11523v1)

Published 26 Feb 2020 in q-fin.TR, cs.CE, and cs.NE

Abstract: The development of reinforcement learning methods has extended their application to many areas, including algorithmic trading. In this paper, trading on the stock exchange is interpreted as a game with a Markov property consisting of states, actions, and rewards. A system for trading a fixed volume of a financial instrument is proposed and experimentally tested; it is based on the asynchronous advantage actor-critic method using several neural network architectures. The application of recurrent layers in this approach is investigated. The experiments were performed on real anonymized data. The best architecture demonstrated a trading strategy for the RTS Index futures (MOEX:RTSI) with a profitability of 66% per annum, accounting for commission. The project source code is available at: http://github.com/evgps/a3c_trading.

Authors (3)
  1. Evgeny Ponomarev (4 papers)
  2. Ivan Oseledets (187 papers)
  3. Andrzej Cichocki (73 papers)
Citations (21)

Summary

Overview of Reinforcement Learning in Algorithmic Trading

The paper by Ponomarev et al. presents a detailed exploration of applying reinforcement learning (RL) to algorithmic trading, interpreting trading on the stock exchange as a game with the Markov property. Notably, it employs the asynchronous advantage actor-critic (A3C) method, augmented with various neural network architectures, including recurrent layers, to model trading strategies.

Methodology

The authors formulate trading as a Markov decision process (MDP), casting the challenge as one of optimizing portfolio returns. They leverage the A3C algorithm, which has shown efficacy in other domains, to design a system that executes trades based on states, actions, and rewards. Several neural network architectures are explored, including ones with recurrent layers such as LSTM, to capture the temporal dependencies inherent in financial data. The experiments focus on trading RTS Index futures on the Moscow Exchange, using anonymized historical data; a minimal sketch of such a trading MDP follows.
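
To make the formulation concrete, here is a minimal Python sketch of a fixed-volume trading environment. The feature set (a window of log-returns plus the current position), the long/flat action set, and the commission model are illustrative assumptions, not the paper's exact configuration, which is in the linked repository.

```python
import numpy as np

class TradingEnv:
    """Trading a fixed volume of one instrument as an MDP (sketch).

    States are windows of log-returns plus the current position,
    actions switch between a flat and a long one-unit position, and
    the reward is one-step mark-to-market P&L net of commission.
    """

    ACTIONS = ("hold", "long", "flat")

    def __init__(self, prices, window=64, commission=0.0002):
        self.prices = np.asarray(prices, dtype=np.float64)
        self.window = window
        self.commission = commission  # cost fraction per position change
        self.reset()

    def reset(self):
        self.t = self.window
        self.position = 0  # 0 = flat, 1 = long one unit
        return self._state()

    def _state(self):
        w = self.prices[self.t - self.window : self.t + 1]
        features = np.diff(np.log(w))  # lookback window of log-returns
        return np.append(features, self.position).astype(np.float32)

    def step(self, action):
        target = {"hold": self.position, "long": 1, "flat": 0}[action]
        price = self.prices[self.t]
        # Commission is charged only when the position changes.
        cost = abs(target - self.position) * self.commission * price
        self.position = target
        self.t += 1
        # Reward: one-step P&L of the held position, minus costs.
        reward = self.position * (self.prices[self.t] - price) - cost
        done = self.t >= len(self.prices) - 1
        return self._state(), reward, done
```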

Key Findings

The experiments demonstrate a strategy with a 66% annual profitability on the RTS Index futures, accounting for trading commissions. This result underscores the potential of RL methods in constructing profitable trading algorithms. Important insights include:

  • Recurrent Layers: The introduction of LSTM layers was found to enhance the performance, emphasizing the importance of capturing temporal dynamics in financial markets.
  • Dropout: The addition of dropout layers improved model robustness, indicating their utility in preventing overfitting.
  • Architecture Complexity: The paper compares networks of varying complexity, showing that more sophisticated models can yield better results; a sketch combining the recurrent and dropout elements follows this list.
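
As a concrete illustration of these elements, below is a minimal TensorFlow/Keras sketch of an A3C-style actor-critic head with an LSTM layer and dropout. The layer sizes, dropout rate, and three-action policy are assumptions for illustration, not the paper's exact architecture.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_actor_critic(window=64, n_features=1, n_actions=3,
                       lstm_units=128, dropout_rate=0.2):
    """A3C-style network sketch: shared LSTM trunk with dropout,
    then separate policy (actor) and value (critic) heads."""
    inputs = layers.Input(shape=(window, n_features))
    x = layers.LSTM(lstm_units)(inputs)      # temporal dependencies
    x = layers.Dropout(dropout_rate)(x)      # regularization vs. overfitting
    x = layers.Dense(128, activation="relu")(x)
    policy = layers.Dense(n_actions, activation="softmax", name="policy")(x)
    value = layers.Dense(1, name="value")(x)  # state-value estimate
    return tf.keras.Model(inputs, [policy, value])
```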

Numerical Results

Experiments conducted over six months of test data reveal significant profitability, particularly for deeper network architectures and those incorporating LSTMs. Improvements in the Sharpe ratio demonstrate a favorable trade-off between return and risk. Notably, the best architectures were able to exploit market dynamics effectively even under varying transaction-cost assumptions.
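
For reference, a standard way to compute the annualized Sharpe ratio from per-step strategy returns is shown below. The paper does not publish its exact evaluation code, so the annualization factor here is an assumption tied to bar frequency.

```python
import numpy as np

def annualized_sharpe(step_returns, steps_per_year, risk_free=0.0):
    """Annualized Sharpe ratio from per-step strategy returns.

    steps_per_year depends on the bar frequency of the data
    (e.g. ~252 for daily bars); choose it to match your sampling.
    """
    excess = np.asarray(step_returns) - risk_free / steps_per_year
    return np.sqrt(steps_per_year) * excess.mean() / excess.std(ddof=1)
```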

Implications and Future Directions

The findings highlight RL's promise in developing robust trading strategies. Practically, such models could transform automated trading by dynamically adapting to market conditions. Theoretically, this work adds to the growing body of evidence supporting the applicability of advanced reinforcement learning techniques in finance.

Future research could explore:

  • Alternative Reward Structures: Refining reward functions to better align with trader goals could enhance strategy performance (see the sketch after this list).
  • Market Adaptability: Testing strategies across different financial instruments and market conditions would further validate the approach.
  • Hybrid Models: Combining RL with other machine learning paradigms may yield even more refined strategies.
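
As a hypothetical illustration of the first direction, the reward below penalizes per-step P&L by the current drawdown, steering the agent toward smoother equity curves. This function is an assumption for illustration, not a reward from the paper.

```python
def drawdown_penalized_reward(step_pnl, equity_curve, penalty=0.5):
    """Hypothetical alternative reward: per-step P&L minus a penalty
    proportional to the current drawdown of the equity curve."""
    if not equity_curve:
        return step_pnl
    drawdown = max(equity_curve) - equity_curve[-1]  # always >= 0
    return step_pnl - penalty * drawdown
```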

The authors have also released their implementation (linked in the abstract above), facilitating further research and development in this field. These developments could lead to more sophisticated and adaptive trading systems in practice.
