
Learning to Cut via Hierarchical Sequence/Set Model for Efficient Mixed-Integer Programming (2404.12638v1)

Published 19 Apr 2024 in cs.AI

Abstract: Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) with human-designed heuristics, machine learning carries the potential to learn more effective ones. However, many existing learning-based methods learn which cuts to prefer while neglecting the importance of learning how many cuts to select. Moreover, we observe that (P3) the order in which selected cuts are applied also significantly impacts the efficiency of MILP solvers. To address these challenges, we propose a novel hierarchical sequence/set model (HEM) to learn cut selection policies. Specifically, HEM is a bi-level model: (1) a higher-level module that learns how many cuts to select, and (2) a lower-level module that formulates cut selection as a sequence/set-to-sequence learning problem and learns policies selecting an ordered subset whose cardinality is determined by the higher-level module. To the best of our knowledge, HEM is the first data-driven methodology that tackles (P1)-(P3) simultaneously. Experiments demonstrate that HEM significantly improves the efficiency of solving MILPs on eleven challenging MILP benchmarks, including two real-world problems from Huawei.
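The bi-level structure described in the abstract can be made concrete with a small, hypothetical sketch. The snippet below is not the authors' implementation; it only illustrates, under assumed class names, feature dimensions, and greedy decoding, how a higher-level policy could predict how many cuts to keep (P2) and a lower-level pointer-style decoder could then pick an ordered subset of candidate cuts (P1 and P3).

```python
import torch
import torch.nn as nn

class HigherLevelPolicy(nn.Module):
    """Hypothetical sketch: predicts the fraction of candidate cuts to select (P2)."""
    def __init__(self, cut_feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.LSTM(cut_feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, cut_feats):                 # cut_feats: (batch, n_cuts, feat_dim)
        _, (h, _) = self.encoder(cut_feats)       # summarize the candidate cut set
        ratio = torch.sigmoid(self.head(h[-1]))   # fraction of cuts to keep, in (0, 1)
        return ratio.squeeze(-1)

class LowerLevelPolicy(nn.Module):
    """Hypothetical sketch: pointer-style decoder selecting an ordered subset (P1, P3)."""
    def __init__(self, cut_feat_dim: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.LSTM(cut_feat_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTMCell(hidden_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, hidden_dim, bias=False)

    def forward(self, cut_feats, k: int):
        enc, (h, c) = self.encoder(cut_feats)     # enc: (batch, n_cuts, hidden)
        hx, cx = h[-1], c[-1]
        dec_in, selected = hx, []
        mask = torch.zeros(enc.shape[:2], dtype=torch.bool)
        for _ in range(k):                         # autoregressively pick k cuts, in order
            hx, cx = self.decoder(dec_in, (hx, cx))
            scores = torch.einsum("bd,bnd->bn", self.attn(hx), enc)
            scores = scores.masked_fill(mask, float("-inf"))
            idx = scores.argmax(dim=-1)            # greedy decoding, for illustration only
            mask[torch.arange(enc.size(0)), idx] = True
            dec_in = enc[torch.arange(enc.size(0)), idx]
            selected.append(idx)
        return torch.stack(selected, dim=1)        # ordered indices of selected cuts

# Usage: 30 candidate cuts with 13 assumed features each
cuts = torch.randn(1, 30, 13)
ratio = HigherLevelPolicy(13)(cuts)                # e.g. tensor([0.4])
k = max(1, int(ratio.item() * cuts.size(1)))       # cardinality set by the higher level
order = LowerLevelPolicy(13)(cuts, k)              # ordered subset of cut indices
```

This sketch covers only the forward selection path; how the two modules are trained from data, and how solver feedback is turned into a learning signal, is described in the paper itself.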
