Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers (2402.13380v3)
Abstract: In this study, we introduce a deep learning framework that employs a transformer model to address mixed-integer programs, focusing on the Capacitated Lot Sizing Problem (CLSP). To our knowledge, our approach is the first to use transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, it harnesses the encoder-decoder transformer's ability to process sequential data, making it well suited for predicting the binary variables that indicate production setup decisions in each period of the CLSP. The problem is inherently dynamic, requiring sequential decision making under constraints. We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network. The proposed post-processed transformer algorithm surpasses the state-of-the-art solver CPLEX and a Long Short-Term Memory (LSTM) model in solution time, optimality gap, and percent infeasibility over 240K benchmark CLSP instances tested. Once the ML model is trained, inference on the model reduces the MIP to a linear program (LP). Combined with an LP solver, the ML-based algorithm thus becomes a polynomial-time approximation algorithm for a well-known NP-hard problem, with near-optimal solution quality.
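The core pipeline described above, predicting the setup binaries and then solving the remaining LP, can be illustrated with a minimal sketch. The example below assumes a toy single-item CLSP instance and uses the PuLP modeling library; the data values, variable names, and fix-and-solve structure are illustrative assumptions, not the paper's exact formulation or post-processing procedure.

```python
# Minimal sketch: fix ML-predicted setup binaries y_hat and solve the remaining
# LP of a single-item CLSP. All data and names here are illustrative assumptions.
from pulp import LpMinimize, LpProblem, LpVariable, lpSum, value

# Toy instance (T periods): demand, capacity, and costs are assumed.
d = [20, 40, 30, 10]          # demand per period
cap = 60                      # production capacity per period
c, h, s = 2.0, 0.5, 50.0      # unit production, holding, and setup costs
y_hat = [1, 1, 0, 1]          # binary setup decisions predicted by the ML model

T = len(d)
prob = LpProblem("clsp_lp_with_fixed_setups", LpMinimize)
x = [LpVariable(f"x_{t}", lowBound=0) for t in range(T)]  # production quantity
I = [LpVariable(f"I_{t}", lowBound=0) for t in range(T)]  # end-of-period inventory

# Objective: production + holding costs; setup costs are constants once y_hat is fixed.
prob += lpSum(c * x[t] + h * I[t] for t in range(T)) + s * sum(y_hat)

for t in range(T):
    prev = I[t - 1] if t > 0 else 0          # assume zero starting inventory
    prob += I[t] == prev + x[t] - d[t]       # inventory balance
    prob += x[t] <= cap * y_hat[t]           # produce only in open setup periods

prob.solve()
print("production plan:", [x[t].value() for t in range(T)],
      "objective:", value(prob.objective))
```

If the predicted setups leave too little open capacity, this LP is infeasible; the paper's post-processing step is what handles such cases, and this sketch does not attempt to reproduce it.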