Markowitz Meets Bellman: Knowledge-distilled Reinforcement Learning for Portfolio Management (2405.05449v1)
Abstract: Investment portfolios, central to finance, balance potential returns and risks. This paper introduces a hybrid approach that combines Markowitz's portfolio theory with reinforcement learning, using knowledge distillation to train agents. The proposed method, called KDD (Knowledge Distillation DDPG), consists of two training stages: a supervised learning stage and a reinforcement learning stage. The trained agents optimize portfolio assembly. A comparative analysis against standard financial models and AI frameworks, using metrics such as returns, the Sharpe ratio, and nine evaluation indices, shows that the model outperforms them. It achieves the highest yield and the highest Sharpe ratio (2.03), giving the highest profitability with the lowest risk among scenarios of comparable return.
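Since the headline result is stated as a Sharpe ratio, a minimal sketch of how that metric is commonly computed from a series of periodic returns may help in reading the figure. The abstract does not state the risk-free rate or annualization convention behind the reported 2.03, so the function name `sharpe_ratio` and the parameters `risk_free_rate` and `periods_per_year` are assumptions made for illustration only.

```python
import math


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of a list of per-period simple returns.

    Assumptions (not taken from the paper): risk_free_rate is an annual
    rate spread evenly over periods, and the sample standard deviation
    (n - 1 denominator) is used.
    """
    if len(returns) < 2:
        raise ValueError("need at least two returns")

    # Excess return over the per-period risk-free rate.
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]

    n = len(excess)
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    std = math.sqrt(variance)
    if std == 0:
        raise ValueError("returns have zero variance")

    # Scale the per-period ratio to an annual figure.
    return mean / std * math.sqrt(periods_per_year)


if __name__ == "__main__":
    daily = [0.01, -0.01, 0.02, 0.0]  # made-up example returns
    print(sharpe_ratio(daily, periods_per_year=1))
```

A higher value means more excess return per unit of volatility, which is why the abstract pairs the ratio with the claim about risk at comparable returns.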