Dynamic Capital Requirements for Markov Decision Processes
Abstract: We build on the theory of capital requirements (CRs) to create a new framework for modeling dynamic risk preferences. The key question is how to evaluate the risk of a payoff stream sequentially as new information is revealed. In our model, we associate each payoff stream with a disbursement strategy and a premium schedule to form a triple of stochastic processes. We describe risk preferences through a single set, which we call the risk frontier, that characterizes the acceptable triples. We then propose the generalized capital requirement (GCR), which evaluates the risk of a payoff stream by minimizing the premium schedule over acceptable triples. We apply this model to a risk-aware decision maker (DM) who controls a Markov decision process (MDP) and seeks a policy that minimizes the GCR of its payoff stream. The resulting GCR-MDP recovers many well-known risk-aware MDPs as special cases. To make this approach computationally viable, we obtain the temporal decomposition of the GCR in terms of the risk frontier. We then connect the temporal decomposition with the notion of an information state, which compactly captures the dependence of the DM's risk preferences on the problem history and allows an optimal policy to be computed by augmented dynamic programming. We report numerical experiments for the GCR-minimizing newsvendor.
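For orientation, the construction described in the abstract can be sketched in symbols. The first line is the classical static capital requirement induced by an acceptance set. The remaining lines are only a schematic reading of the abstract: the symbols $Z$, $\varphi$, $\mathcal{U}$, $\Phi$, $\varrho$, and $X^{\pi}$ are assumed notation, and the aggregation $\Phi$ of the premium schedule into a scalar objective is a hypothetical placeholder, since the abstract does not say in what sense the schedule is minimized.

```latex
% Classical static capital requirement: the least capital m that makes
% the position X acceptable with respect to an acceptance set \mathcal{A}.
\rho_{\mathcal{A}}(X) = \inf\{\, m \in \mathbb{R} : X + m \in \mathcal{A} \,\}

% Schematic dynamic analogue (assumed notation): X is the payoff stream,
% Z a disbursement strategy, \varphi a premium schedule, and \mathcal{U}
% the risk frontier of acceptable triples. \Phi is a hypothetical map
% that turns the premium schedule into a scalar objective.
\varrho(X) = \inf_{(Z,\varphi)} \{\, \Phi(\varphi) : (X, Z, \varphi) \in \mathcal{U} \,\}

% GCR-MDP (assumed notation): choose a policy \pi whose induced payoff
% stream X^{\pi} has the smallest GCR.
\min_{\pi \in \Pi} \varrho(X^{\pi})
```

Under this reading, the static case corresponds to a single period with no disbursement, where the triple reduces to a position and a scalar premium.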