
Toward TransfORmers: Revolutionizing the Solution of Mixed Integer Programs with Transformers (2402.13380v3)

Published 20 Feb 2024 in cs.AI, cs.LG, math.CO, math.OC, and stat.ML

Abstract: In this study, we introduce a deep learning framework that employs a transformer model to address the challenges of mixed-integer programs, focusing on the Capacitated Lot Sizing Problem (CLSP). Our approach is, to our knowledge, the first to use transformers to predict the binary variables of a mixed-integer programming (MIP) problem. Specifically, it harnesses the encoder-decoder transformer's ability to process sequential data, making it well suited for predicting the binary variables that indicate production setup decisions in each period of the CLSP, a problem that is inherently dynamic and requires sequential decision making under constraints. We present an efficient algorithm in which CLSP solutions are learned through a transformer neural network. The proposed post-processed transformer algorithm surpasses the state-of-the-art solver CPLEX and a Long Short-Term Memory (LSTM) baseline in solution time, optimality gap, and percent infeasibility over the 240K benchmark CLSP instances tested. Once the ML model is trained, inference reduces the MIP to a linear program (LP). This turns the ML-based algorithm, combined with an LP solver, into a polynomial-time approximation algorithm for a well-known NP-hard problem, with near-perfect solution quality.
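The core reduction can be sketched concretely. In a standard single-item CLSP formulation (an assumption here, not the paper's exact model), once a transformer has predicted the binary setup decisions y_t, those binaries can be fixed, the setup cost becomes a constant, and the coupling constraint x_t ≤ C·y_t collapses into a simple variable bound, leaving a plain LP in production x_t and inventory I_t. The sketch below, using `scipy.optimize.linprog` with an illustrative instance and a hypothetical predicted setup vector, shows that post-processing step; it is not the authors' code.

```python
# Sketch of the "fix predicted binaries, solve the remaining LP" idea.
# Formulation (assumed): min sum(p*x_t + h*I_t + s*y_t)
#   s.t. I_{t-1} + x_t - I_t = d_t  (inventory balance, I_0 = 0)
#        0 <= x_t <= C * y_t,  I_t >= 0,  y_t in {0, 1}
import numpy as np
from scipy.optimize import linprog

def solve_clsp_lp(demand, capacity, prod_cost, hold_cost, setup_cost, y_pred):
    """Solve the LP that remains after fixing the predicted setups y_pred."""
    T = len(demand)
    # Decision vector z = [x_1..x_T, I_1..I_T].
    c = [prod_cost] * T + [hold_cost] * T
    A_eq = np.zeros((T, 2 * T))
    for t in range(T):
        A_eq[t, t] = 1.0               # x_t
        A_eq[t, T + t] = -1.0          # -I_t
        if t > 0:
            A_eq[t, T + t - 1] = 1.0   # +I_{t-1}
    b_eq = list(demand)
    # Fixing y_t turns x_t <= C * y_t into a variable bound: the MIP is now an LP.
    bounds = [(0, capacity * y) for y in y_pred] + [(0, None)] * T
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    if res.status != 0:
        return None  # the predicted setup pattern made the instance infeasible
    total = res.fun + setup_cost * sum(y_pred)  # add back the fixed setup cost
    return total, res.x[:T]

# Hypothetical instance and predicted setup pattern (illustrative only).
cost, x = solve_clsp_lp([3, 5, 2, 4], 8, 1.0, 0.5, 10.0, [1, 1, 0, 1])
```

With the predicted setups above, period 3's demand must be pre-built in period 2, so the LP produces [3, 7, 0, 4]. An infeasible prediction (e.g. all setups off) is detected by the LP solver, which is where a post-processing repair of the predicted binaries would come in.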

