Can Reinforcement Learning support policy makers? A preliminary study with Integrated Assessment Models (2312.06527v1)
Abstract: Governments around the world aspire to ground decision-making in evidence. Many of the foundations of policy making - e.g. sensing patterns that relate to societal needs, developing evidence-based programs, forecasting potential outcomes of policy changes, and monitoring the effectiveness of policy programs - have the potential to benefit from the use of large-scale datasets or simulations together with intelligent algorithms. These could, if designed and deployed in a way that is well grounded in scientific evidence, enable a more comprehensive, faster, and rigorous approach to policy making. Integrated Assessment Models (IAMs) form a broad umbrella covering scientific models that attempt to link the main features of society and the economy with the biosphere in one modelling framework. At present, these systems are probed by policy makers and advisory groups in a hypothesis-driven manner. In this paper, we empirically demonstrate that modern Reinforcement Learning can be used to probe IAMs and explore the space of solutions in a more principled manner. While the implications of our results are modest, since the environment is simplistic, we believe that this is a stepping stone towards more ambitious use cases, which could allow for effective exploration of policies and understanding of their consequences and limitations.
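The approach described in the abstract - treating an IAM as an environment that an RL agent can probe - can be illustrated with a minimal sketch. Everything below is hypothetical: the toy dynamics (`ToyIAMEnv`, with a single discretized emissions state, an abate/business-as-usual action, and a convex damage term) and the tabular Q-learning loop are illustrative stand-ins, not the model or algorithm used in the paper.

```python
import random

class ToyIAMEnv:
    """Deliberately simplistic, hypothetical 'integrated assessment' environment:
    each step the agent either pursues business as usual (raising cumulative
    emissions) or abates (lowering them at an economic cost). Reward is economic
    output minus a convex climate-damage term."""

    def __init__(self, horizon=20):
        self.horizon = horizon
        self.reset()

    def reset(self):
        self.t = 0
        self.emissions = 0  # discretized cumulative emissions, 0..10
        return self.emissions

    def step(self, action):
        if action == 0:  # business as usual
            self.emissions = min(10, self.emissions + 1)
            output = 1.0
        else:            # abate: lower emissions, reduced output
            self.emissions = max(0, self.emissions - 1)
            output = 0.7
        damage = 0.02 * self.emissions ** 2  # convex damage in emissions
        self.t += 1
        done = self.t >= self.horizon
        return self.emissions, output - damage, done

def q_learning(env, episodes=2000, alpha=0.1, gamma=0.99, eps=0.1, seed=0):
    """Tabular epsilon-greedy Q-learning over the discretized state space."""
    random.seed(seed)
    q = {(s, a): 0.0 for s in range(11) for a in (0, 1)}
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < eps:
                a = random.choice((0, 1))
            else:
                a = max((0, 1), key=lambda act: q[(s, act)])
            s2, r, done = env.step(a)
            best_next = max(q[(s2, 0)], q[(s2, 1)])
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q
```

Rolling out the learned greedy policy traces an emissions pathway that can then be inspected; in this toy setting the agent learns to abate once emissions are high enough that marginal damages outweigh the abatement cost, mirroring, at a much smaller scale, the kind of solution-space exploration the paper proposes for real IAMs.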