Advancing Investment Frontiers: Industry-grade Deep Reinforcement Learning for Portfolio Optimization (2403.07916v1)
Abstract: This research paper examines the application of Deep Reinforcement Learning (DRL) to asset-class-agnostic portfolio optimization, integrating industry-grade methodologies with quantitative finance. At the heart of this integration is our robust framework, which not only merges advanced DRL algorithms with modern computational techniques but also emphasizes stringent statistical analysis, software engineering, and regulatory compliance. To the best of our knowledge, this is the first study integrating financial Reinforcement Learning with sim-to-real methodologies from robotics and mathematical physics, enriching our frameworks and arguments with this unique perspective. Our research culminates in the introduction of AlphaOptimizerNet, a proprietary Reinforcement Learning agent (and corresponding library). Developed from a synthesis of state-of-the-art (SOTA) literature and our unique interdisciplinary methodology, AlphaOptimizerNet demonstrates encouraging risk-return optimization across various asset classes under realistic constraints. These preliminary results underscore the practical efficacy of our frameworks. As the finance sector increasingly gravitates towards advanced algorithmic solutions, our study bridges theoretical advancements with real-world applicability, offering a template for ensuring safety and robust standards in this technologically driven future.
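To make the abstract's setting concrete, the sketch below shows a minimal portfolio-optimization MDP of the kind such DRL agents are trained in: the action is a vector of target weights, and the reward is the portfolio log-return net of proportional transaction costs. This is a hypothetical illustration under stated assumptions (long-only weights, a fixed cost rate), not the paper's proprietary AlphaOptimizerNet implementation.

```python
import math

class PortfolioEnv:
    """Toy portfolio MDP: action = target weights, reward = net log-return."""

    def __init__(self, returns, cost=0.001):
        self.returns = returns   # list of per-step asset-return vectors
        self.cost = cost         # proportional transaction-cost rate (assumed)
        self.t = 0
        n = len(returns[0])
        self.weights = [1.0 / n] * n  # start equal-weighted

    def step(self, action):
        # Project the raw action onto valid long-only weights by clipping
        # negatives and renormalizing to sum to one.
        total = sum(max(a, 0.0) for a in action) or 1.0
        w = [max(a, 0.0) / total for a in action]
        # Turnover-based transaction cost: pay for rebalancing away from
        # the previous weights.
        turnover = sum(abs(wi - pi) for wi, pi in zip(w, self.weights))
        # Portfolio simple return this step, converted to a log-return.
        r = sum(wi * ri for wi, ri in zip(w, self.returns[self.t]))
        reward = math.log1p(r) - self.cost * turnover
        self.weights = w
        self.t += 1
        done = self.t >= len(self.returns)
        return self.weights, reward, done

# Usage: two assets, three steps of synthetic returns, naive equal-weight policy.
env = PortfolioEnv([[0.01, -0.02], [0.00, 0.03], [-0.01, 0.01]])
total, done = 0.0, False
while not done:
    _, reward, done = env.step([0.5, 0.5])
    total += reward
```

A learned policy would replace the fixed `[0.5, 0.5]` action with the output of a neural network conditioned on the state; realistic constraints (cardinality, leverage, liquidity) would enter as additional terms in the projection and reward.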
- Philip Ndikum “Machine learning algorithms for financial asset price forecasting” In arXiv preprint arXiv:2004.01504, 2020
- Paul Wilmott “Where quants go wrong: a dozen basic lessons in commonsense for quants and risk managers and the traders who rely on them” In Wilmott Journal 1.1 Wiley Online Library, 2009, pp. 1–22
- David H Bailey and Marcos López de Prado “Finance is Not Excused: Why Finance Should Not Flout Basic Principles of Statistics” In Significance (Royal Statistical Society), forthcoming, 2021
- Humphrey K. K. Tung and Michael C. S. Wong “Financial Risk Forecasting with Non-Stationarity” In Financial Risk Forecasting Palgrave Macmillan UK, 2011
- Thomas Guhr “Non-stationarity in Financial Markets: Dynamics of Market States Versus Generic Features” In Acta Physica Polonica B 46, 2015, pp. 1625
- Mazin AM Al Janabi “Optimization algorithms and investment portfolio analytics with machine learning techniques under time-varying liquidity constraints” In Journal of Modelling in Management Emerald Publishing Limited, 2021
- Marcos López De Prado “The 10 reasons most machine learning funds fail” In The Journal of Portfolio Management 44.6 Institutional Investor Journals Umbrella, 2018, pp. 120–133
- Adrian Millea “Deep reinforcement learning for trading—A critical survey” In Data 6.11 Multidisciplinary Digital Publishing Institute, 2021, pp. 119
- Leif Andersen “Regulation, Capital, and Margining: Quant Angle” Bank of America Merrill Lynch, 2014
- Christos Makridis and Alberto G Rossi “Rise of the ‘Quants’ in Financial Services: Regulation and Crowding Out of Routine Jobs” In Available at SSRN 3218031, 2018
- “Overview on DeepMind and its AlphaGo Zero AI” In Proceedings of the 2018 international conference on big data and education, 2018, pp. 67–71
- “Deep reinforcement learning from self-play in imperfect-information games” In arXiv preprint arXiv:1603.01121, 2016
- “No-press diplomacy: Modeling multi-agent gameplay” In Advances in Neural Information Processing Systems 32, 2019
- “Combining deep reinforcement learning and search for imperfect-information games” In Advances in Neural Information Processing Systems 33, 2020, pp. 17057–17069
- “Solving imperfect information poker games using Monte Carlo search and POMDP models” In 2020 IEEE 9th Data Driven Control and Learning Systems Conference (DDCLS), 2020, pp. 1060–1065 IEEE
- “No-press diplomacy from scratch” In Advances in Neural Information Processing Systems 34, 2021, pp. 18063–18074
- “Deep reinforcement learning for autonomous driving: A survey” In IEEE Transactions on Intelligent Transportation Systems 23.6 IEEE, 2021, pp. 4909–4926
- “Deep learning and reinforcement learning for autonomous unmanned aerial systems: Roadmap for theory to deployment” In Deep Learning for Unmanned Systems Springer, 2021, pp. 25–82
- “Robust flight navigation out of distribution with liquid neural networks” In Science Robotics 8.77 American Association for the Advancement of Science, 2023, pp. eadc8892
- “Ride-hailing order dispatching at didi via reinforcement learning” In INFORMS Journal on Applied Analytics 50.5 INFORMS, 2020, pp. 272–286
- Terrence J Sejnowski “Large language models and the reverse turing test” In Neural computation 35.3 MIT Press, 2023, pp. 309–342
- “TURINGBENCH: A benchmark environment for Turing test in the age of neural text generation” In arXiv preprint arXiv:2109.13296, 2021
- “Instruction tuning with gpt-4” In arXiv preprint arXiv:2304.03277, 2023
- “BloombergGPT: A large language model for finance” In arXiv preprint arXiv:2303.17564, 2023
- “FinRL: A deep reinforcement learning library for automated stock trading in quantitative finance” In arXiv preprint arXiv:2011.09607, 2020
- “FinRL-Meta: Market environments and benchmarks for data-driven financial reinforcement learning” In Advances in Neural Information Processing Systems 35, 2022, pp. 1835–1849
- “FinRL-Podracer: High performance and scalable deep reinforcement learning for quantitative finance” In Proceedings of the Second ACM International Conference on AI in Finance, 2021, pp. 1–9
- Zitao Song, Xuyang Jin and Chenliang Li “Safe-FinRL: A Low Bias and Variance Deep Reinforcement Learning Implementation for High-Freq Stock Trading” In arXiv preprint arXiv:2206.05910, 2022
- “Stable-Baselines3: Reliable Reinforcement Learning Implementations” In Journal of Machine Learning Research 22.268, 2021, pp. 1–8 URL: http://jmlr.org/papers/v22/20-1364.html
- “CleanRL: High-quality single-file implementations of deep reinforcement learning algorithms” In Journal of Machine Learning Research 23.274, 2022, pp. 1–18
- “Gymnasium” Zenodo, 2023 DOI: 10.5281/zenodo.8127026
- “Envpool: A highly parallel reinforcement learning environment execution engine” In Advances in Neural Information Processing Systems 35, 2022, pp. 22409–22421
- “Towards designing a generic and comprehensive deep reinforcement learning framework” In Applied Intelligence 53.3 Springer, 2023, pp. 2967–2988
- “The Shift from Models to Compound AI Systems”, https://bair.berkeley.edu/blog/2024/02/18/compound-ai-systems/, 2024
- Theis Ingerslev Jensen, Bryan Kelly and Lasse Heje Pedersen “Is there a replication crisis in finance?” In The Journal of Finance 78.5 Wiley Online Library, 2023, pp. 2465–2518
- Kewei Hou, Chen Xue and Lu Zhang “Replicating anomalies” In The Review of financial studies 33.5 Oxford University Press, 2020, pp. 2019–2133
- “Open source, open science, and the replication crisis in HCI” In Extended abstracts of the 2018 CHI conference on human factors in computing systems, 2018, pp. 1–8
- Matthew Hutson “Artificial intelligence faces reproducibility crisis” American Association for the Advancement of Science, 2018
- Elizabeth Gibney “Is AI fuelling a reproducibility crisis in science?” In Nature 608, 2022, pp. 250–251
- “Empirical software engineering” In Handbook of Software Engineering Springer, 2019, pp. 285–320
- “The worst of both worlds: A comparative analysis of errors in learning from data in psychology and machine learning” In Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 2022, pp. 335–348
- Christopher Tong “Statistical inference enables bad science; statistical thinking enables good science” In The American Statistician 73.sup1 Taylor & Francis, 2019, pp. 246–261
- Roger Mead, Steven G Gilmour and Andrew Mead “Statistical principles for the design of experiments: applications to real experiments” Cambridge University Press, 2012
- Michael Parkinson and Carlos Oscar Sánchez Sorzano “Why Do We Need a Statistical Experiment Design?” In Experimental Design and Reproducibility in Preclinical Animal Studies Springer, 2021, pp. 129–146
- “Statistical Design of Experiments (DoE)” In Statistics for Engineers: An Introduction with Examples from Practice Springer, 2021, pp. 1–20
- “Design and analysis of computer experiments with quantitative and qualitative inputs: A selective review” In Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 10.3 Wiley Online Library, 2020, pp. e1358
- “A statistical design of experiments approach to machine learning model selection in engineering applications” In Journal of Computing and Information Science in Engineering 21.1 American Society of Mechanical Engineers, 2021, pp. 011008
- Frank J Fabozzi, Francis Gupta and Harry M Markowitz “The legacy of modern portfolio theory” In The journal of investing 11.3 Institutional Investor Journals Umbrella, 2002, pp. 7–22
- Warren B Powell “Reinforcement Learning and Stochastic Optimization: A unified framework for sequential decisions” John Wiley & Sons, 2022
- Zvi Bodie, Alex Kane and Alan Marcus “Investments: Global Edition” McGraw Hill, 2020
- Harry M Markowitz “Portfolio selection: efficient diversification of investments” J. Wiley, 1967
- Franco Modigliani and Merton H Miller “The cost of capital, corporation finance and the theory of investment” In The American economic review 48.3 JSTOR, 1958, pp. 261–297
- William F Sharpe “Capital asset prices: A theory of market equilibrium under conditions of risk” In The journal of finance 19.3 Wiley Online Library, 1964, pp. 425–442
- Harry M Markowitz “Foundations of portfolio theory” In The journal of finance 46.2 JSTOR, 1991, pp. 469–477
- “Heuristics for cardinality constrained portfolio optimisation” In Computers & Operations Research 27.13 Elsevier, 2000, pp. 1271–1302
- Francesco Cesarone, Andrea Scozzari and Fabio Tardella “A new method for mean-variance portfolio optimization with cardinality constraints” In Annals of Operations Research 205 Springer, 2013, pp. 213–234
- “Recent advances in quadratic programming algorithms for nonlinear model predictive control” In Vietnam Journal of Mathematics 46.4 Springer, 2018, pp. 863–882
- Markus Hirschberger, Yue Qi and Ralph E Steuer “Large-scale MV efficient frontier computation via a procedure of parametric quadratic programming” In European Journal of Operational Research 204.3 Elsevier, 2010, pp. 581–588
- “Multi-period trading via convex optimization” In Foundations and Trends® in Optimization 3.1 Now Publishers, Inc., 2017, pp. 1–76
- “Investor’s behaviour and the relevance of asymmetric risk measures” In Banks & bank systems, 2012, pp. 87–94
- “Post-modern approaches for portfolio optimization” In Handbook on information technology in finance Springer, 2008, pp. 613–634
- Edmond Lezmi, Thierry Roncalli and Jiali Xu “Multi-Period Portfolio Optimization” In Available at SSRN, 2022
- Petter N Kolm, Reha Tütüncü and Frank J Fabozzi “60 Years of portfolio optimization: Practical challenges and current trends” In European Journal of Operational Research 234.2 Elsevier, 2014, pp. 356–371
- Jean-Philippe Bouchaud “Economics needs a scientific revolution” In Nature 455.7217 Nature Publishing Group UK London, 2008, pp. 1181–1181
- “Risk measurement in post-modern portfolio theory: differences from modern portfolio theory” In Economic Computation & Economic Cybernetics Studies & Research 47.1, 2013, pp. 113–132
- Lionel Martellini “Toward the design of better equity benchmarks: Rehabilitating the tangency portfolio from modern portfolio theory” In The Journal of Portfolio Management 34.4 Institutional Investor Journals Umbrella, 2008, pp. 34–41
- Alexander Kempf, Olaf Korn and Sven Saßning “Portfolio optimization using forward-looking information” In Review of Finance 19.1 Oxford University Press, 2015, pp. 467–490
- Daniele D’Alvia “Uncertainty: The Necessary Unknowable Road to Speculation” In The Speculator of Financial Markets: How Financial Innovation and Supervision Made the Modern World Cham: Springer International Publishing, 2023, pp. 119–169
- Erdinç Akyıldırım and Halil Mete Soner “A brief history of mathematics in finance” In Borsa Istanbul Review 14.1 Elsevier, 2014, pp. 57–63
- “Brain-inspired learning in artificial neural networks: a review” In arXiv preprint arXiv:2305.11252, 2023
- Fahad Sarfraz, Elahe Arani and Bahram Zonooz “A Study of Biologically Plausible Neural Network: The Role and Interactions of Brain-Inspired Mechanisms in Continual Learning” In arXiv preprint arXiv:2304.06738, 2023
- Karl Johan Åström “Optimal control of Markov processes with incomplete state information” In Journal of mathematical analysis and applications 10.1 Academic Press, 1965, pp. 174–205
- Michael L Littman “A tutorial on partially observable Markov decision processes” In Journal of Mathematical Psychology 53.3 Elsevier, 2009, pp. 119–125
- Richard Bellman “A Markovian decision process” In Journal of mathematics and mechanics JSTOR, 1957, pp. 679–684
- “A review of deep learning for renewable energy forecasting” In Energy Conversion and Management 198 Elsevier, 2019, pp. 111799
- Alejandro J Real, Fernando Dorado and Jaime Durán “Energy demand forecasting using deep learning: Applications for the French grid” In Energies 13.9 MDPI, 2020, pp. 2242
- “A deep reinforcement learning framework for continuous intraday market bidding” In Machine Learning 110 Springer, 2021, pp. 2335–2387
- Yuanhang Zheng, Zeshui Xu and Anran Xiao “Deep learning in economics: a systematic and critical review” In Artificial Intelligence Review Springer, 2023, pp. 1–43
- “Deep learning for exotic option valuation” In The Journal of Financial Data Science Institutional Investor Journals Umbrella, 2021
- Anthony L Caterini and Dong Eui Chang “Generic Representation of Neural Networks” In Deep Neural Networks in a Mathematical Framework Springer, 2018, pp. 23–34
- “Scientific Machine Learning through Physics-Informed Neural Networks: Where we are and What’s next” In arXiv preprint arXiv:2201.05624, 2022
- James Owen Weatherall “The physics of wall street: a brief history of predicting the unpredictable” Houghton Mifflin Harcourt, 2013
- “Large-scale recommender systems and the netflix prize competition” In KDD Proceedings, 2008, pp. 1–34
- Robert M Bell and Yehuda Koren “Lessons from the Netflix prize challenge” In Acm Sigkdd Explorations Newsletter 9.2 ACM New York, NY, USA, 2007, pp. 75–79
- “Deep learning for recommender systems: A Netflix case study” In AI Magazine 42.3, 2021, pp. 7–18
- “Deep matrix factorization models for recommender systems.” In IJCAI 17, 2017, pp. 3203–3209 Melbourne, Australia
- “Deep Reinforcement Learning Applied to a Sparse-Reward Trading Environment with Intraday Data” In Available at SSRN 4411793
- “A survey of graph neural networks for recommender systems: Challenges, methods, and directions” In ACM Transactions on Recommender Systems 1.1 ACM New York, NY, USA, 2023, pp. 1–51
- “Pearl: A Production-ready Reinforcement Learning Agent” In arXiv preprint arXiv:2312.03814, 2023
- “Off-policy evaluation in infinite-horizon reinforcement learning with latent confounders” In International Conference on Artificial Intelligence and Statistics, 2021, pp. 1999–2007 PMLR
- “A review on matrix factorization techniques in recommender systems” In 2017 2nd International Conference on Communication Systems, Computing and IT Applications (CSCITA), 2017, pp. 269–274 IEEE
- “Deep learning based recommender system: A survey and new perspectives” In ACM computing surveys (CSUR) 52.1 ACM New York, NY, USA, 2019, pp. 1–38
- M Mehdi Afsar, Trafford Crump and Behrouz Far “Reinforcement learning based recommender systems: A survey” In ACM Computing Surveys 55.7 ACM New York, NY, 2022, pp. 1–38
- “A survey on reinforcement learning for recommender systems” In IEEE Transactions on Neural Networks and Learning Systems IEEE, 2023
- Saud Almahdi and Steve Y Yang “An adaptive portfolio trading system: A risk-return portfolio optimization using recurrent reinforcement learning with expected maximum drawdown” In Expert Systems with Applications 87 Elsevier, 2017, pp. 267–279
- Yoshiharu Sato “Model-free reinforcement learning for financial portfolios: a brief survey” In arXiv preprint arXiv:1904.04973, 2019
- Amine Mohamed Aboussalah and Chi-Guhn Lee “Continuous control with stacked deep dynamic recurrent reinforcement learning for portfolio optimization” In Expert Systems with Applications 140 Elsevier, 2020, pp. 112891
- Hui Niu, Siyuan Li and Jian Li “MetaTrader: A reinforcement learning approach integrating diverse policies for portfolio optimization” In Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 1573–1583
- Zihao Zhang, Stefan Zohren and Stephen Roberts “Deep reinforcement learning for trading” In The Journal of Financial Data Science Institutional Investor Journals Umbrella, 2020
- “Reinforcement Learning Applied to Trading Systems: A Survey” In arXiv preprint arXiv:2212.06064, 2022
- Gordon Ritter “Machine learning for trading” In Available at SSRN 3015609, 2017
- Bruno Biais, Larry Glosten and Chester Spatt “Market microstructure: A survey of microfoundations, empirical results, and policy implications” In Journal of Financial Markets 8.2 Elsevier, 2005, pp. 217–264
- “Microstructure in the machine age” In The Review of Financial Studies 34.7 Oxford University Press, 2021, pp. 3316–3363
- Bastien Baldacci “Quantitative finance at the microstructure scale: algorithmic trading and regulation”, 2021
- Tobias Galla and J Doyne Farmer “Complex dynamics in learning complicated games” In Proceedings of the National Academy of Sciences 110.4 National Acad Sciences, 2013, pp. 1232–1236
- Reidar B Bratvold and Frank Koch “Game Theory in the Oil and Gas Industry” In The Way Ahead 7.01 SPE, 2011, pp. 18–20
- Vivek K Pandey and Chen Y Wu “Investors May Take Heart: A Game Theoretic View of High Frequency Trading” In Journal of Financial Planning 28.5, 2015, pp. 53–57
- “Human-in-the-Loop Learning Methods Toward Safe DL-Based Autonomous Systems: A Review” In Computer Safety, Reliability, and Security. SAFECOMP 2021 Workshops: DECSoS, MAPSOD, DepDevOps, USDAI, and WAISE, York, UK, September 7, 2021, Proceedings 40, 2021, pp. 251–264 Springer
- Matthew E Taylor “Reinforcement Learning Requires Human-in-the-Loop Framing and Approaches.” In HHAI, 2023, pp. 351–360
- Alvaro HC Correia and Freddy Lecue “Human-in-the-loop feature selection” In Proceedings of the AAAI Conference on Artificial Intelligence 33.01, 2019, pp. 2438–2445
- “Human-in-the-loop deep reinforcement learning with application to autonomous driving” In arXiv preprint arXiv:2104.07246, 2021
- “Perspectives on sim2real transfer for robotics: A summary of the R:SS 2020 workshop” In arXiv preprint arXiv:2012.03806, 2020
- “i-sim2real: Reinforcement learning of robotic policies in tight human-robot interaction loops” In Conference on Robot Learning, 2023, pp. 212–224 PMLR
- “Understanding domain randomization for sim-to-real transfer” In arXiv preprint arXiv:2110.03239, 2021
- Wenshuai Zhao, Jorge Peña Queralta and Tomi Westerlund “Sim-to-real transfer in deep reinforcement learning for robotics: a survey” In 2020 IEEE symposium series on computational intelligence (SSCI), 2020, pp. 737–744 IEEE
- “Sim-to-real robot learning from pixels with progressive nets” In Conference on robot learning, 2017, pp. 262–270 PMLR
- “Provable sim-to-real transfer in continuous domain with partial observations” In arXiv preprint arXiv:2210.15598, 2022
- Marc G Bellemare, Will Dabney and Mark Rowland “Distributional reinforcement learning” MIT Press, 2023
- “A review of uncertainty quantification in deep learning: Techniques, applications and challenges” In Information fusion 76 Elsevier, 2021, pp. 243–297
- “A review of uncertainty for deep reinforcement learning” In Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment 18.1, 2022, pp. 155–162
- Yi Zhu, Jing Dong and Henry Lam “Uncertainty Quantification and Exploration for Reinforcement Learning” In Operations Research INFORMS, 2023
- Gianluca Bianchin, Yin-Chen Liu and Fabio Pasqualetti “Secure navigation of robots in adversarial environments” In IEEE Control Systems Letters 4.1 IEEE, 2019, pp. 1–6
- “Multi-robot coordination and planning in uncertain and adversarial environments” In Current Robotics Reports 2 Springer, 2021, pp. 147–157
- “Domain randomization for transferring deep neural networks from simulation to the real world” In 2017 IEEE/RSJ international conference on intelligent robots and systems (IROS), 2017, pp. 23–30 IEEE
- “High-speed collision avoidance using deep reinforcement learning and domain randomization for autonomous vehicles” In 2020 IEEE 23rd international conference on Intelligent Transportation Systems (ITSC), 2020, pp. 1–8 IEEE
- “How to pick the domain randomization parameters for sim-to-real transfer of reinforcement learning policies?” In arXiv preprint arXiv:1903.11774, 2019
- “Network randomization: A simple technique for generalization in deep reinforcement learning” In arXiv preprint arXiv:1910.05396, 2019
- “Offline reinforcement learning: Fundamental barriers for value function approximation” In arXiv preprint arXiv:2111.10919, 2021
- “Policy finetuning: Bridging sample-efficient offline and online reinforcement learning” In Advances in neural information processing systems 34, 2021, pp. 27395–27407
- “The role of coverage in online reinforcement learning” In arXiv preprint arXiv:2210.04157, 2022
- “Explainable ai and reinforcement learning—a systematic review of current approaches and trends” In Frontiers in artificial intelligence 4 Frontiers Media SA, 2021, pp. 550030
- “Explainable reinforcement learning through a causal lens” In Proceedings of the AAAI conference on artificial intelligence 34.03, 2020, pp. 2493–2500
- “Redefining Counterfactual Explanations for Reinforcement Learning: Overview, Challenges and Opportunities” In ACM Computing Surveys ACM New York, NY, 2024
- Petar Kormushev, Sylvain Calinon and Darwin G Caldwell “Reinforcement learning in robotics: Applications and real-world challenges” In Robotics 2.3 MDPI, 2013, pp. 122–148
- “How to train your robot with deep reinforcement learning: lessons we have learned” In The International Journal of Robotics Research 40.4-5 SAGE Publications Sage UK: London, England, 2021, pp. 698–721
- “Challenges of real-world reinforcement learning: definitions, benchmarks and analysis” In Machine Learning 110.9 Springer, 2021, pp. 2419–2468
- “The ingredients of real-world robotic reinforcement learning” In arXiv preprint arXiv:2004.12570, 2020
- “End-to-end robotic reinforcement learning without reward engineering” In arXiv preprint arXiv:1904.07854, 2019
- “Scaling data-driven robotics with reward sketching and batch reinforcement learning” In arXiv preprint arXiv:1909.12200, 2019
- “Unpacking reward shaping: Understanding the benefits of reward engineering on sample complexity” In Advances in Neural Information Processing Systems 35, 2022, pp. 15281–15295
- “Inverse reward design” In Advances in neural information processing systems 30, 2017
- “Never stop learning: The effectiveness of fine-tuning in robotic reinforcement learning” In arXiv preprint arXiv:2004.10190, 2020
- “Safe learning in robotics: From learning-based control to safe reinforcement learning” In Annual Review of Control, Robotics, and Autonomous Systems 5 Annual Reviews, 2022, pp. 411–444
- Bryan Lim, Stefan Zohren and Stephen Roberts “Enhancing time-series momentum strategies using deep neural networks” In The Journal of Financial Data Science Institutional Investor Journals Umbrella, 2019
- “Performance functions and reinforcement learning for trading systems and portfolios” In Journal of forecasting 17.5-6 Wiley Online Library, 1998, pp. 441–470
- Zhiyu Zhang, David Bombara and Heng Yang “Discounted Adaptive Online Prediction” In arXiv preprint arXiv:2402.02720, 2024
- “DeepTrader: a deep reinforcement learning approach for risk-return balanced portfolio management with market conditions embedding” In Proceedings of the AAAI conference on artificial intelligence 35.1, 2021, pp. 643–650
- Zhengyao Jiang, Dixing Xu and Jinjun Liang “A deep reinforcement learning framework for the financial portfolio management problem” In arXiv preprint arXiv:1706.10059, 2017
- “Model-based deep reinforcement learning for dynamic portfolio optimization” In arXiv preprint arXiv:1901.08740, 2019
- “Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization” In FinPlan 2023, 2023, pp. 21