Differentiating Through Integer Linear Programs with Quadratic Regularization and Davis-Yin Splitting (2301.13395v4)

Published 31 Jan 2023 in cs.LG

Abstract: In many applications, a combinatorial problem must be repeatedly solved with similar, but distinct parameters. Yet, the parameters $w$ are not directly observed; only contextual data $d$ that correlates with $w$ is available. It is tempting to use a neural network to predict $w$ given $d$. However, training such a model requires reconciling the discrete nature of combinatorial optimization with the gradient-based frameworks used to train neural networks. We study the case where the problem in question is an Integer Linear Program (ILP). We propose applying a three-operator splitting technique, also known as Davis-Yin splitting (DYS), to the quadratically regularized continuous relaxation of the ILP. We prove that the resulting scheme is compatible with the recently introduced Jacobian-free backpropagation (JFB). Our experiments on two representative ILPs, the shortest path problem and the knapsack problem, demonstrate that this combination (DYS on the forward pass, JFB on the backward pass) yields a scheme that scales more effectively to high-dimensional problems than existing schemes. All code associated with this paper is available at github.com/mines-opt-ml/fpo-dys.
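To make the abstract's pipeline concrete, below is a minimal PyTorch sketch of the idea it describes: Davis-Yin splitting applied to the quadratically regularized LP relaxation on the forward pass, and Jacobian-free backpropagation (differentiating through only one final fixed-point iteration) on the backward pass. This is not the authors' implementation (their code lives at github.com/mines-opt-ml/fpo-dys); the constraint split into an affine system $Ax = b$ plus a box $0 \le x \le 1$, the step size, the regularization weight, and all function and variable names are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' code; see github.com/mines-opt-ml/fpo-dys)
# of DYS applied to the regularized relaxation
#     min_x  w^T x + (gamma/2)||x||^2   s.t.  A x = b,  0 <= x <= 1,
# with JFB: iterate to approximate convergence without tracking gradients, then
# backpropagate through a single extra application of the fixed-point map.
import torch


def project_affine(y, A, b, AAt_inv):
    # Euclidean projection onto {x : A x = b}; assumes A has full row rank.
    return y - A.T @ (AAt_inv @ (A @ y - b))


def project_box(y):
    # Euclidean projection onto the box [0, 1]^n.
    return y.clamp(0.0, 1.0)


def dys_ilp_layer(w, A, b, gamma=0.1, alpha=1.0, max_iter=500):
    """Approximately solve the regularized LP relaxation with Davis-Yin splitting.

    Splitting assumed here: f = indicator of {x : A x = b}, g = indicator of
    [0, 1]^n, and h(x) = w^T x + (gamma/2)||x||^2 (the smooth term).
    """
    AAt_inv = torch.linalg.inv(A @ A.T)  # fine for small m; factor and cache in practice

    def dys_step(z):
        x_g = project_box(z)                   # prox of g: projection onto the box
        grad_h = w + gamma * x_g               # gradient of the smooth term h
        x_f = project_affine(2.0 * x_g - z - alpha * grad_h, A, b, AAt_inv)
        return z + x_f - x_g, x_f              # updated z and primal estimate

    z = torch.zeros_like(w)
    with torch.no_grad():                      # forward pass stores no graph
        for _ in range(max_iter):
            z, _ = dys_step(z)

    # JFB backward pass: one differentiable application of the fixed-point map,
    # so gradients with respect to w flow through a single iteration only.
    _, x = dys_step(z)
    return x
```

A toy usage pattern for the predict-then-optimize setting described in the abstract might look like the following, where a small network maps context $d$ to predicted costs $w$ and the loss compares the relaxed solution to a known solution (all shapes and data here are made up for illustration):

```python
# Hypothetical end-to-end training step.
model = torch.nn.Sequential(
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 10)
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

d = torch.randn(16)                            # observed context
A, b = torch.randn(3, 10), torch.randn(3)      # toy constraint data
x_true = torch.rand(10)                        # stand-in for the true solution

w_pred = model(d)                              # predicted ILP cost vector
x_pred = dys_ilp_layer(w_pred, A, b)           # differentiable relaxed solution
loss = torch.nn.functional.mse_loss(x_pred, x_true)
opt.zero_grad()
loss.backward()
opt.step()
```

The appeal of JFB in this sketch is that the backward pass never forms or inverts the Jacobian of the fixed-point map, which exact implicit differentiation would require; this is the property the abstract credits for the method scaling to high-dimensional problems.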
