Applying Reinforcement Learning to Option Pricing and Hedging (2310.04336v1)
Abstract: This thesis provides an overview of recent advances in reinforcement learning for pricing and hedging financial instruments, with a primary focus on a detailed explanation of the Q-Learning Black Scholes approach introduced by Halperin (2017). This reinforcement learning approach bridges the traditional Black and Scholes (1973) model with novel artificial intelligence algorithms, enabling option pricing and hedging in a completely model-free and data-driven way. The thesis also explores the algorithm's performance under different state variables and scenarios for a European put option. The results reveal that the model is an accurate estimator under different levels of volatility and hedging frequency. Moreover, the method exhibits robust performance across various levels of option moneyness. Lastly, the algorithm incorporates proportional transaction costs, whose impact on profit and loss varies with the statistical properties of the state variables.
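For orientation, the closed-form Black and Scholes (1973) price of a European put, the kind of benchmark a data-driven estimate would naturally be compared against, can be sketched as follows. This is a minimal illustration and not the thesis's code: the function names `norm_cdf`, `bs_put_price`, and `bs_put_delta` and the parameter values in the usage note are assumptions chosen for the example.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1(spot: float, strike: float, rate: float, sigma: float, maturity: float) -> float:
    """The d1 term of the Black-Scholes formula."""
    return (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (
        sigma * math.sqrt(maturity)
    )


def bs_put_price(spot: float, strike: float, rate: float, sigma: float, maturity: float) -> float:
    """Black-Scholes price of a European put on a non-dividend-paying asset."""
    d1 = _d1(spot, strike, rate, sigma, maturity)
    d2 = d1 - sigma * math.sqrt(maturity)
    return strike * math.exp(-rate * maturity) * norm_cdf(-d2) - spot * norm_cdf(-d1)


def bs_put_delta(spot: float, strike: float, rate: float, sigma: float, maturity: float) -> float:
    """Black-Scholes delta of a European put (always between -1 and 0)."""
    return norm_cdf(_d1(spot, strike, rate, sigma, maturity)) - 1.0


# Hypothetical example values: at-the-money put, 5% rate, 20% volatility, 1 year.
price = bs_put_price(100.0, 100.0, 0.05, 0.20, 1.0)
delta = bs_put_delta(100.0, 100.0, 0.05, 0.20, 1.0)
```

With spot and strike at 100, a 5% rate, 20% volatility, and one year to maturity, the sketch gives a put price of about 5.57 and a delta of about -0.36.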
- Dynamic programming. Princeton University Press, Princeton, USA.
- Eye of the hurricane: An autobiography. World Scientific, Singapore.
- BIS, 2022a. BIS Quarterly review. https://www.bis.org/publ/qtrpdf/r_qt2212.pdf/. Online; accessed 12-December-2022.
- BIS, 2022b. OTC derivatives statistics at end-June 2022. https://www.bis.org/publ/otc_hy2211.pdf/. Online; accessed 12-December-2022.
- Arbitrage theory in continuous time. Oxford University Press, Oxford, UK.
- The pricing of options and corporate liabilities. Journal of Political Economy 81, 637–654.
- Statistical control by monitoring and adjustment. John Wiley & Sons, Hoboken, USA.
- Deep hedging. Quantitative Finance 19, 1271–1291.
- Deep hedging: Hedging derivatives under generic market frictions using reinforcement learning. Swiss Finance Institute Research Paper 19-80.
- Gamma and vega hedging using deep distributional reinforcement learning. arXiv preprint: 2205.05614.
- Deep hedging of derivatives using reinforcement learning. The Journal of Financial Data Science 3, 10–27.
- Open source cross-sectional asset pricing. Critical Finance Review 11, 207–264.
- Asset pricing: Revised edition. Princeton University Press, Princeton, USA.
- Option pricing: A simplified approach. Journal of Financial Economics 7, 229–263.
- DDV, 2022. Börsenumsätze in derivativen Wertpapieren. https://www.derivateverband.de/DEU/Statistiken/Boersenumsaetze/. Online; accessed 14-December-2022.
- Machine learning in finance. Springer, Cham, Switzerland.
- Deep reinforcement learning for option replication and hedging. The Journal of Financial Data Science 2, 44–57.
- Tree-based batch mode reinforcement learning. Journal of Machine Learning Research 6, 503–556.
- Artificial intelligence and machine learning in finance: Identifying foundations, themes, and research clusters from bibliometric analysis. Journal of Behavioral and Experimental Finance 32, 100577.
- Mainstream science on intelligence: An editorial with 52 signatories, history, and bibliography. Intelligence 24, 13–23.
- Applications of least-squares regressions to pricing and hedging of financial derivatives. Ph.D. thesis. Technische Universität München.
- Man versus machine: On artificial intelligence and hedge funds performance. Applied Economics 54, 4632–4646.
- QLBS: Q-Learner in the Black-Scholes(-Merton) worlds. arXiv preprint: 1712.04609.
- The QLBS Q-Learner goes NuQLear: Fitted Q iteration, inverse RL, and option portfolios. Quantitative Finance 19, 1543–1553.
- Double Q-learning. Advances in Neural Information Processing Systems 23, 2613–2621.
- A closed-form solution for options with stochastic volatility with applications to bond and currency options. The Review of Financial Studies 6, 327–343.
- Replicating anomalies. The Review of Financial Studies 33, 2019–2133.
- The pricing of options on assets with stochastic volatilities. The Journal of Finance 42, 281–300.
- Optimal delta hedging for options. Journal of Banking & Finance 82, 180–190.
- Options, futures, and other derivatives. Prentice Hall, Englewood Cliffs, USA.
- Dynamic replication and hedging: A reinforcement learning approach. The Journal of Financial Data Science 1, 159–171.
- Batch reinforcement learning, in: Reinforcement learning. Springer, pp. 45–73.
- Deep Reinforcement Learning. Das umfassende Praxis-Handbuch: Moderne Algorithmen für Chatbots, Robotik, diskrete Optimierung und Web-Automatisierung inkl. Multiagenten-Methoden. MITP-Verlags GmbH & Co. KG, Frechen, Germany.
- Option pricing and replication with transactions costs. The Journal of Finance 40, 1283–1301.
- Offline reinforcement learning: Tutorial, review, and perspectives on open problems. arXiv preprint: 2005.01643.
- Valuing American options by simulation: A simple least-squares approach. The Review of Financial Studies 14, 113–147.
- What is artificial intelligence? url: http://jmc.stanford.edu/articles/whatisai/whatisai.pdf.
- Theory of rational option pricing. The Bell Journal of Economics and Management Science 4, 141–183.
- Playing Atari with deep reinforcement learning. arXiv preprint: 1312.5602.
- A generalization error for Q-learning. Journal of Machine Learning Research 6, 1073–1097.
- Deep hedging: Continuous reinforcement learning for hedging of general portfolios across multiple risk aversions, in: 3rd ACM International Conference on AI in Finance, pp. 361–368.
- Neural fitted Q iteration – first experiences with a data efficient neural reinforcement learning method, in: European Conference on Machine Learning, Springer. pp. 317–328.
- Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River, USA.
- Lectures on reinforcement learning. url: https://www.davidsilver.uk/teaching/.
- Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489.
- Mastering the game of Go without human knowledge. Nature 550, 354–359.
- Reward is enough. Artificial Intelligence 299, 103535.
- Learning to predict by the methods of temporal differences. Machine Learning 3, 9–44.
- Reinforcement learning: An introduction. MIT Press, Cambridge, USA; London, UK.
- On average versus discounted reward temporal-difference learning. Machine Learning 49, 179–191.
- Computing machinery and intelligence. Mind 59, 433–460.
- Learning from delayed rewards. Ph.D. thesis. King’s College.
- Q-learning. Machine Learning 8, 279–292.
- Catecholamine theories of reward: A critical review. Brain Research 152, 215–247.