Score Function Gradient Estimation to Widen the Applicability of Decision-Focused Learning (2307.05213v2)
Abstract: Many real-world optimization problems contain parameters that are unknown before deployment time, either due to stochasticity or to lack of information (e.g., demand or travel times in delivery problems). A common strategy in such cases is to estimate said parameters via ML models trained to minimize the prediction error, which, however, is not necessarily aligned with the downstream task-level error. The decision-focused learning (DFL) paradigm overcomes this limitation by training to directly minimize a task loss, e.g., regret. Since the latter has non-informative gradients for combinatorial problems, state-of-the-art DFL methods introduce surrogates and approximations that enable training. But these methods exploit specific assumptions about the problem structure (e.g., convex or linear problems, unknown parameters only in the objective function). We propose an alternative method that makes no such assumptions: it combines stochastic smoothing with score function gradient estimation, which works on any task loss. This opens up the use of DFL methods to nonlinear objectives, uncertain parameters in the problem constraints, and even two-stage stochastic optimization. Experiments show that, although our method typically requires more training epochs, it is on par with specialized methods and performs especially well on the difficult case of problems with uncertainty in the constraints, in terms of solution quality, scalability, or both.
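To illustrate the core idea the abstract describes, here is a minimal sketch of score function (REINFORCE-style) gradient estimation combined with stochastic smoothing. The `regret` function and all names here are hypothetical placeholders, not the paper's actual implementation: in a real DFL setting it would solve the downstream optimization problem with the sampled parameters and compare against the true optimum. The key point is that only evaluations of the task loss are needed, never its gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

def regret(params):
    # Hypothetical black-box task loss: piecewise constant in its input,
    # so its gradient is zero almost everywhere and uninformative.
    # We only assume we can *evaluate* it.
    return np.sum((np.round(params) - 3.0) ** 2)

def score_function_gradient(mu, sigma=0.5, n_samples=100):
    """Estimate d/dmu E[regret(y)] for y ~ N(mu, sigma^2 I).

    Stochastic smoothing: perturbing the predicted parameters makes the
    *expected* task loss differentiable in mu, even though `regret`
    itself is not. The score function identity gives
        grad = E[ regret(y) * grad_mu log p(y | mu) ],
    and for a Gaussian, grad_mu log p(y | mu) = (y - mu) / sigma^2.
    """
    samples = rng.normal(mu, sigma, size=(n_samples, mu.size))
    losses = np.array([regret(y) for y in samples])
    losses = losses - losses.mean()          # baseline to reduce variance
    scores = (samples - mu) / sigma**2
    return (losses[:, None] * scores).mean(axis=0)

# Plain SGD on the smoothed task loss: mu converges toward the minimizer.
mu = np.array([0.0, 5.0])
for _ in range(200):
    mu -= 0.05 * score_function_gradient(mu)
```

In the full DFL pipeline, `mu` would be the output of an ML model, and the estimated gradient would be backpropagated through the model's parameters; this sketch only shows the estimator itself.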