Differentiating Through Integer Linear Programs with Quadratic Regularization and Davis-Yin Splitting (2301.13395v4)
Abstract: In many applications, a combinatorial problem must be repeatedly solved with similar, but distinct, parameters. Yet the parameters $w$ are not directly observed; only contextual data $d$ that correlates with $w$ is available. It is tempting to use a neural network to predict $w$ given $d$. However, training such a model requires reconciling the discrete nature of combinatorial optimization with the gradient-based frameworks used to train neural networks. We study the case where the problem in question is an Integer Linear Program (ILP). We propose applying a three-operator splitting technique, also known as Davis-Yin splitting (DYS), to the quadratically regularized continuous relaxation of the ILP. We prove that the resulting scheme is compatible with the recently introduced Jacobian-free backpropagation (JFB). Our experiments on two representative ILPs, the shortest path problem and the knapsack problem, demonstrate that this combination (DYS on the forward pass, JFB on the backward pass) yields a scheme that scales more effectively to high-dimensional problems than existing schemes. All code associated with this paper is available at github.com/mines-opt-ml/fpo-dys.
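As a rough illustration of the recipe the abstract describes, the sketch below runs DYS iterations to solve a small quadratically regularized LP relaxation on the forward pass, then applies JFB by backpropagating through only one final iteration. The constraint sets (a unit box and a single sum constraint), step size, predictor network, and loss are illustrative assumptions chosen so both projections have closed forms; this is a minimal sketch, not the authors' implementation, which is available at github.com/mines-opt-ml/fpo-dys.

```python
# Minimal sketch (not the authors' code): DYS forward pass + JFB backward pass
# for min_{x in C1 ∩ C2} w^T x + (gamma/2)||x||^2, the quadratically
# regularized relaxation of an ILP. Here C1 = [0,1]^n and C2 = {x : 1^T x = k}
# are hypothetical constraint sets chosen for their closed-form projections.
import torch

def proj_box(x):
    # Projection onto C1 = [0,1]^n (the prox of its indicator function).
    return x.clamp(0.0, 1.0)

def proj_sum(x, k=3.0):
    # Projection onto C2 = {x : sum(x) = k}; a stand-in for the ILP's
    # equality constraints (e.g., flow conservation in shortest path).
    n = x.shape[-1]
    return x + (k - x.sum(dim=-1, keepdim=True)) / n

def dys_jfb_layer(w, gamma=0.1, alpha=0.5, max_iter=500, tol=1e-6):
    # Gradient of the smooth term h(x) = w^T x + (gamma/2)||x||^2.
    grad_h = lambda x: w + gamma * x

    def dys_step(z):
        # One Davis-Yin iteration:
        # z^+ = z + prox_g(2 prox_f(z) - z - alpha * grad_h(prox_f(z))) - prox_f(z)
        x = proj_box(z)
        y = proj_sum(2.0 * x - z - alpha * grad_h(x))
        return z + (y - x)

    # Forward pass: iterate toward the fixed point with autograd OFF.
    z = torch.zeros_like(w)
    with torch.no_grad():
        for _ in range(max_iter):
            z_new = dys_step(z)
            if (z_new - z).norm() <= tol:
                z = z_new
                break
            z = z_new
    # JFB: re-apply ONE iteration with autograd ON, so the backward pass
    # differentiates through a single step instead of the whole trajectory.
    z = dys_step(z.detach())
    return proj_box(z)  # recover the primal variable

# Hypothetical end-to-end usage: predict costs w from context d, then train.
model = torch.nn.Linear(8, 5)
d = torch.randn(16, 8)
x_target = torch.rand(16, 5)  # placeholder "true" solutions
loss = torch.nn.functional.mse_loss(dys_jfb_layer(model(d)), x_target)
loss.backward()  # gradients reach model through the single DYS step
```

Because only the last DYS step is on the autograd tape, memory cost is constant in the number of forward iterations; this is the property that makes the DYS-plus-JFB combination scale to high-dimensional problems.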
- TensorFlow: a system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pp. 265–283, 2016.
- Differentiable convex optimization layers. Advances in Neural Information Processing Systems, 32, 2019.
- Differentiating through a cone program. arXiv preprint arXiv:1904.09043, 2019.
- OptNet: Differentiable optimization as a layer in neural networks. In International Conference on Machine Learning, pp. 136–145. PMLR, 2017.
- Deep equilibrium models. Advances in Neural Information Processing Systems, 32, 2019.
- Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Israel Journal of Mathematics, 26:137–150, 1977.
- The Baillon-Haddad theorem revisited. arXiv preprint arXiv:0906.0807, 2009.
- Yoshua Bengio. Using a financial training criterion rather than a prediction criterion. International Journal of Neural Systems, 8(4):433–443, 1997.
- Learning with differentiable perturbed optimizers. Advances in Neural Information Processing Systems, 33:9508–9519, 2020.
- Numerical influence of ReLU’(0) on backpropagation. Advances in Neural Information Processing Systems, 34:468–479, 2021.
- Dimitri P Bertsekas. Nonlinear programming. Journal of the Operational Research Society, 48(3):334–334, 1997.
- Efficient and modular implicit differentiation. Advances in Neural Information Processing Systems, 35:5230–5242, 2022.
- Nonsmooth implicit differentiation for machine-learning and optimization. Advances in Neural Information Processing Systems, 34:13537–13549, 2021.
- One-step differentiation of iterative algorithms. Advances in Neural Information Processing Systems, 36, 2024.
- JAX: composable transformations of Python+NumPy programs, 2018.
- Enforcing policy feasibility constraints through differentiable projection for energy optimization. In Proceedings of the Twelfth ACM International Conference on Future Energy Systems, pp. 199–210, 2021.
- Learning to optimize: A primer and a benchmark. Journal of Machine Learning Research, 23(189):1–59, 2022.
- Theoretical linear convergence of unfolded ISTA and its practical weights and thresholds. Advances in Neural Information Processing Systems, 31, 2018.
- F. H. Clarke. Optimization and Nonsmooth Analysis. Wiley-Interscience, New York, 1983.
- Laurent Condat. Fast projection onto the simplex and the ℓ1 ball. Mathematical Programming, 158(1):575–585, 2016.
- A three-operator splitting scheme and its optimization applications. Set-Valued and Variational Analysis, 25(4):829–858, 2017.
- Efficient projections onto the ℓ1-ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, pp. 272–279, 2008.
- Implicit deep learning. SIAM Journal on Mathematics of Data Science, 3(3):930–958, 2021.
- Smart “predict, then optimize”. Management Science, 68(1):9–26, 2022.
- MIPaaL: Mixed integer program as a layer. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 1504–1511, 2020.
- JFB: Jacobian-free backpropagation for implicit networks. In Proceedings of the AAAI Conference on Artificial Intelligence, 2022.
- On training implicit models. Advances in Neural Information Processing Systems, 34:24247–24260, 2021.
- Deep equilibrium architectures for inverse problems in imaging. IEEE Transactions on Computational Imaging, 7:1123–1133, 2021.
- Jean Guyomarch. Warcraft II open-source map editor, 2017. http://github.com/war2/war2edit.
- Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
- Explainable AI via learning to optimize. Scientific Reports, 13(1):10103, 2023.
- Preface: New trends on combinatorial optimization for network and logistical applications. Annals of Operations Research, 298(1):1–5, 2021.
- Richard M Karp. Reducibility among combinatorial problems. In Complexity of computer computations, pp. 85–103. Springer, 1972.
- End-to-end constrained optimization learning: A survey. In 30th International Joint Conference on Artificial Intelligence, IJCAI 2021, pp. 4475–4482. International Joint Conferences on Artificial Intelligence, 2021.
- Ke Li and Jitendra Malik. Learning to optimize. In International Conference on Learning Representations, 2017.
- From the simplex to the sphere: faster constrained optimization using the hadamard parametrization. Information and Inference: A Journal of the IMA, 12(3):1898–1937, 2023.
- Reviving and improving recurrent back-propagation. In International Conference on Machine Learning, pp. 3082–3091. PMLR, 2018.
- Towards constituting mathematical structures for learning to optimize. In International Conference on Machine Learning, pp. 21426–21449. PMLR, 2023.
- Online deep equilibrium learning for regularization by denoising. Advances in Neural Information Processing Systems, 35:25363–25376, 2022.
- Interior point solving for LP-based prediction+optimisation. Advances in Neural Information Processing Systems, 33:7272–7282, 2020.
- Operator splitting for learning to predict equilibria in convex games. arXiv e-prints, arXiv:2106, 2021.
- Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, 17:527–566, 2017.
- PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
- Adaptive three operator splitting. In International Conference on Machine Learning, pp. 4085–4094. PMLR, 2018.
- Differentiation of blackbox combinatorial solvers. In International Conference on Learning Representations, 2019.
- Optimal experimental design for inverse problems with state constraints. SIAM Journal on Scientific Computing, 40(4):B1080–B1100, 2018.
- Large-Scale Convex Optimization: Algorithm Designs via Monotone Operators. Cambridge University Press, Cambridge, England, 2022.
- Backpropagation through combinatorial algorithms: Identity with projection works. In The Eleventh International Conference on Learning Representations, 2022.
- Combinatorial optimization and green logistics. Annals of Operations Research, 175(1):159–175, 2010.
- Bo Tang and Elias Boutros Khalil. PyEPO: A PyTorch-based end-to-end predict-then-optimize library with linear objective function. In OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop), 2022.
- Vladimir Vapnik. The nature of statistical learning theory. Springer Science & Business Media, 1999.
- Qi Wang and Chunlei Tang. Deep reinforcement learning for transportation network combinatorial optimization: A survey. Knowledge-Based Systems, 233:107526, 2021.
- Melding the data-decisions pipeline: Decision-focused learning for combinatorial optimization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pp. 1658–1665, 2019.
- Three operator splitting with a nonconvex loss function. In International Conference on Machine Learning, pp. 12267–12277. PMLR, 2021.
- Preface: Combinatorial optimization drives the future of health care. Journal of Combinatorial Optimization, 42(4):675–676, 2021.
- Günter M Ziegler. Lectures on polytopes, volume 152. Springer Science & Business Media, 2012.