Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates (2307.08816v2)
Abstract: Discrete optimization problems, which span fields such as mixed-integer programming and combinatorial optimization, are in general $\mathcal{NP}$-hard. A standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms, which reach optimal solutions by iteratively adding inequalities, known as cuts, that refine a feasible set. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems remain intractable in practice. In this work, we propose a method for accelerating cutting-plane algorithms via reinforcement learning. Our approach uses learned policies as surrogates for the $\mathcal{NP}$-hard elements of the cut-generating procedure in a way that (i) accelerates convergence and (ii) retains guarantees of optimality. We apply our method to two types of problems where cutting-plane algorithms are commonly used: stochastic optimization and mixed-integer quadratic programming. We observe the benefits of our method when applied to Benders decomposition (stochastic optimization) and iterative loss approximation (quadratic programming), achieving up to $45\%$ faster average convergence compared to modern alternative algorithms.
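To make the cut-adding loop described in the abstract concrete, the following is a minimal sketch of a generic cutting-plane scheme (Kelley's method) on a one-dimensional continuous convex problem. It is not the paper's method: there is no reinforcement learning surrogate and no integer variable, and the function name `kelley_cutting_plane`, its parameters, and the test objective are all assumptions chosen for illustration. Each iteration adds one linear inequality (a cut) and re-solves a relaxed master problem, which tightens a lower bound until it meets the best objective value found so far.

```python
def kelley_cutting_plane(f, grad, lo, hi, tol=1e-6, max_iter=100):
    """Minimize a convex, differentiable f on [lo, hi] by iteratively adding cuts.

    Illustrative only: a hypothetical helper, not code from the paper.
    """
    cuts = []  # each cut (a, b) encodes the inequality t >= a * x + b
    x = lo
    best_x, upper = x, f(x)  # best point seen and its objective (upper bound)
    lower = float("-inf")
    for iteration in range(1, max_iter + 1):
        fx, g = f(x), grad(x)
        if fx < upper:
            best_x, upper = x, fx
        # Tangent cut at x: f(z) >= fx + g * (z - x) for every z, by convexity.
        cuts.append((g, fx - g * x))

        def model(z):
            # Piecewise-linear under-estimator of f built from all cuts so far.
            return max(a * z + b for a, b in cuts)

        # Master problem: minimize the model over [lo, hi]. In one dimension
        # the minimum lies at an endpoint or where two cuts intersect.
        candidates = [lo, hi]
        for i, (a1, b1) in enumerate(cuts):
            for a2, b2 in cuts[i + 1:]:
                if a1 != a2:
                    z = (b2 - b1) / (a1 - a2)
                    if lo <= z <= hi:
                        candidates.append(z)
        x = min(candidates, key=model)
        lower = model(x)  # lower bound on the true optimum
        if upper - lower <= tol:
            break
    return best_x, upper, lower, iteration


# Example: minimize (x - 1)^2 on [-3, 4]; the true minimizer is x = 1.
best_x, upper, lower, iterations = kelley_cutting_plane(
    lambda x: (x - 1.0) ** 2, lambda x: 2.0 * (x - 1.0), -3.0, 4.0
)
```

The loop stops once the gap between the upper and lower bounds falls below `tol`, which is the usual way a cutting-plane method certifies optimality. In the discrete settings the abstract discusses, the master problem and cut generation are far more expensive than in this toy case.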