Learning to optimize: A tutorial for continuous and mixed-integer optimization (2405.15251v1)
Abstract: Learning to Optimize (L2O) stands at the intersection of traditional optimization and machine learning, utilizing the capabilities of machine learning to enhance conventional optimization techniques. As real-world optimization problems frequently share common structures, L2O provides a tool to exploit these structures for better or faster solutions. This tutorial dives deep into L2O techniques, introducing how to accelerate optimization algorithms, promptly estimate the solutions, or even reshape the optimization problem itself, making it more adaptive to real-world applications. By considering the prerequisites for successful applications of L2O and the structure of the optimization problems at hand, this tutorial provides a comprehensive guide for practitioners and researchers alike.