A Survey of Contextual Optimization Methods for Decision Making under Uncertainty (2306.10374v2)
Abstract: Recently, there has been a surge of interest in the operations research (OR) and machine learning (ML) communities in combining prediction algorithms and optimization techniques to solve decision-making problems in the face of uncertainty. This gave rise to the field of contextual optimization, in which data-driven procedures are developed to prescribe actions to the decision-maker that make the best use of the most recently updated information. A large variety of models and methods have been presented in both the OR and ML literature under a variety of names, including data-driven optimization, prescriptive optimization, predictive stochastic programming, policy optimization, (smart) predict/estimate-then-optimize, decision-focused learning, and (task-based) end-to-end learning/forecasting/optimization. Focusing on single- and two-stage stochastic programming problems, this review article identifies three main frameworks for learning policies from data and discusses their strengths and limitations. We present the existing models and methods under a uniform notation and terminology and classify them according to the three main frameworks identified. Our objective with this survey is both to strengthen the general understanding of this active field of research and to stimulate further theoretical and algorithmic advancements in integrating ML and stochastic programming.