
Reinforcement Learning Pair Trading: A Dynamic Scaling Approach

Published 23 Jul 2024 in q-fin.CP, cs.LG, and q-fin.TR | (2407.16103v2)

Abstract: Cryptocurrency is a cryptography-based digital asset with extremely volatile prices. Around USD 70 billion worth of cryptocurrency is traded daily on exchanges. Trading cryptocurrency is difficult due to the inherent volatility of the crypto market. This study investigates whether Reinforcement Learning (RL) can enhance decision-making in cryptocurrency algorithmic trading compared to traditional methods. In order to address this question, we combined reinforcement learning with a statistical arbitrage trading technique, pair trading, which exploits the price difference between statistically correlated assets. We constructed RL environments and trained RL agents to determine when and how to trade pairs of cryptocurrencies. We developed new reward shaping and observation/action spaces for reinforcement learning. We performed experiments with the developed reinforcement learner on pairs of BTC-GBP and BTC-EUR data separated by 1 min intervals (n=263,520). The traditional non-RL pair trading technique achieved an annualized profit of 8.33%, while the proposed RL-based pair trading technique achieved annualized profits from 9.94% to 31.53%, depending upon the RL learner. Our results show that RL can significantly outperform manual and traditional pair trading techniques when applied to volatile markets such as cryptocurrencies.


Summary

  • The paper introduces a dynamic scaling approach leveraging reinforcement learning to adaptively manage pair trading in volatile crypto markets, achieving annualized profits between 9.94% and 31.53%.
  • The methodology employs actor-critic RL algorithms with a continuous action space to optimize both trade timing and quantity based on real-time market signals.
  • The RL-based strategy outperforms traditional static models by enhancing profitability and risk control, highlighting its potential for adaptive financial decision-making.

Reinforcement Learning Pair Trading: A Dynamic Scaling Approach

The paper applies RL to pair trading in the volatile cryptocurrency market, introducing an RL-based algorithmic trading strategy that outperforms traditional methods through dynamic scaling of trade size. In essence, the research adapts trade execution through RL to capitalize on volatility while managing the risk inherent in crypto-assets.

Pair Trading in Financial Markets

Pair trading is a popular statistical arbitrage strategy that exploits price discrepancies between correlated assets. These strategies are typically structured to be market-neutral: within a highly correlated pair, a long position is opened on one asset while the other is simultaneously shorted. Traditional approaches based on static rules can generate profit but may adapt poorly to frequently changing market conditions (Figure 1).

Figure 1: Architecture of Trading Strategies.
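The static baseline described above is conventionally implemented as a threshold rule on the z-score of the spread between the two assets. The sketch below illustrates that conventional rule under stated assumptions; the function name, thresholds, and signal labels are hypothetical, not taken from the paper.

```python
# Illustrative sketch of a static threshold-based pair-trading rule
# (conventional z-score variant; thresholds and names are hypothetical).
from statistics import mean, stdev

def zscore_signal(spread_history, entry=2.0, exit=0.5):
    """Return a trading signal from the z-score of the latest spread
    against its rolling history."""
    mu, sigma = mean(spread_history), stdev(spread_history)
    z = (spread_history[-1] - mu) / sigma
    if z > entry:         # spread unusually wide: short the spread
        return "open_short_spread"
    if z < -entry:        # spread unusually narrow: long the spread
        return "open_long_spread"
    if abs(z) < exit:     # spread has reverted toward the mean: unwind
        return "close"
    return "hold"
```

Because the entry and exit thresholds are fixed, such a rule cannot adjust its aggressiveness to changing volatility, which is the limitation the RL approach targets.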

Reinforcement Learning in Algorithmic Trading

RL, particularly Deep RL, is increasingly used in algorithmic trading because it can learn optimal strategies through interaction with the environment. By modeling the financial market as a Markov Decision Process (MDP), the RL agent makes investment decisions aimed at maximizing profit. Notably, incorporating RL into trading strategies allows for adaptive timing and quantity decisions, which are unattainable with fixed-rule systems. Among RL algorithms, those with actor-critic architectures, such as PPO and A2C, are preferred because they better facilitate policy improvement.
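The MDP framing can be made concrete with a minimal observe-act-reward loop. The toy environment below is an assumption for illustration only (a mean-reverting spread with a hand-written contrarian policy standing in for a trained agent); the paper's actual environment dynamics and reward shaping differ.

```python
# Minimal sketch of the MDP interaction cycle that an actor-critic learner
# (e.g. PPO or A2C) would optimize over. Environment is a hypothetical toy.
import random

class ToySpreadEnv:
    """State: current spread between two assets (mean-reverting AR(1)).
    Action: target position in [-1, 1]. Reward: one-step P&L of that position."""
    def reset(self, seed=0):
        self._rng = random.Random(seed)
        self.spread = 1.0
        return self.spread

    def step(self, action):
        new_spread = 0.9 * self.spread + self._rng.gauss(0.0, 0.1)
        reward = action * (new_spread - self.spread)  # long spread gains if it widens
        self.spread = new_spread
        return new_spread, reward

# Observe state, choose action, receive reward, repeat.
env = ToySpreadEnv()
state = env.reset(seed=42)
total_reward = 0.0
for _ in range(200):
    action = -1.0 if state > 0 else 1.0  # naive contrarian stand-in policy
    state, reward = env.step(action)
    total_reward += reward
```

A real agent replaces the hard-coded policy with a learned mapping from observations to actions, updated from the stream of rewards.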

Observation and Action Space

The observation space in RL-based pair trading consists of three elements: Position, Spread, and Zone. Each component provides a real-time trading signal, which the RL agent interprets to make informed trading decisions (Figure 2).

Figure 2: The value of position observation based on investment.

The action space in RL pair trading is continuous, covering not only the timing of trades but also their quantity. The agent decides the degree of investment based on the quality of the opportunity, scaling each trade to its assessed profit potential.
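One plausible way to realize such a continuous action is to let its sign pick the trade direction and its magnitude scale the committed capital. This mapping is an assumption for illustration; the paper's exact action semantics are not specified in this summary.

```python
# Hypothetical mapping from a continuous action in [-1, 1] to a sized trade.
def interpret_action(action, max_investment):
    """Return (side, quantity): sign selects direction, |action| scales
    how much of max_investment to commit."""
    action = max(-1.0, min(1.0, action))  # clip to the valid range
    if action > 0:
        side = "long_spread"
    elif action < 0:
        side = "short_spread"
    else:
        side = "flat"
    quantity = abs(action) * max_investment
    return side, quantity
```

Under this convention, an action of 0.5 commits half the allowed capital to a long-spread position, while an action near zero keeps the agent flat, letting it express confidence through position size.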

Dynamic Scaling and Investment Strategies

The primary innovation of this research is the integration of dynamic scaling into pair trading through RL, contrasted with the static models of traditional pair trading. Experiments show that this method produces significantly higher profits and more robust performance across variable market conditions (Figure 3).

Figure 3: Prices of BTCEUR and BTCGBP.

Performance Metrics

Experiments demonstrate the RL-based approach yields annualized profits ranging from 9.94% to 31.53%, surpassing the traditional pair trading technique's 8.33% by a substantial margin. The dynamic scalability of investment decisions allows for improved responsiveness to market shifts, thereby optimizing both profitability and risk exposure.
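For context, an annualized figure like those above can be derived from a backtest's total return by compounding over the fraction of a year the data covers; the n = 263,520 one-minute bars correspond to roughly half a year of continuous trading. The compounding convention below is an assumption, as the paper's exact annualization formula is not stated in this summary.

```python
# Assumed annualization convention: compound the backtest's total return
# over the number of backtest windows that fit in one year.
def annualize(total_return, n_minutes, minutes_per_year=365 * 24 * 60):
    """Convert a total return over n_minutes of data into a compounded
    annual rate."""
    periods_per_year = minutes_per_year / n_minutes
    return (1.0 + total_return) ** periods_per_year - 1.0
```

For example, a 4% total return over the 263,520-minute window compounds to roughly 8.1% annualized under this convention, on the order of the traditional baseline's reported 8.33%.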

Conclusion

The research successfully introduces a dynamic component to pair trading via RL, providing a scalable approach well suited to the cryptocurrency market's inherent volatility. The combination of modern RL methods, new reward shaping, and carefully designed observation/action spaces culminates in a trading strategy capable of outperforming traditional systems. Future work could extend the RL approach to more complex trading strategies, incorporate additional legs or dimensions in trading, or adapt the RL models to other volatile markets. This paper marks a shift towards more flexible, adaptive trading strategies empowered by AI in financial applications.
