Transformer-based Stagewise Decomposition for Large-Scale Multistage Stochastic Optimization (2404.02583v1)

Published 3 Apr 2024 in cs.LG

Abstract: Solving large-scale multistage stochastic programming (MSP) problems poses a significant challenge, as commonly used stagewise decomposition algorithms, including stochastic dual dynamic programming (SDDP), face growing time complexity as the subproblem size and problem count increase. Traditional approaches approximate the value functions as piecewise linear convex functions by incrementally accumulating subgradient cutting planes from the primal and dual solutions of stagewise subproblems. To address these limitations, we introduce TranSDDP, a novel Transformer-based stagewise decomposition algorithm. This approach leverages the structural advantages of the Transformer model, implementing a sequential method for integrating subgradient cutting planes to approximate the value function. Our numerical experiments confirm TranSDDP's effectiveness in addressing MSP problems: it efficiently generates a piecewise linear approximation of the value function, significantly reducing computation time while preserving solution quality, marking a promising step forward in the treatment of large-scale multistage stochastic programming problems.
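
The construction at the heart of the abstract is the piecewise-linear lower approximation of a convex value function built from subgradient cutting planes, V_hat(x) = max_k (alpha_k + beta_k^T x). The following is a minimal sketch of that construction (the names `cut_value`, `alpha`, and `beta` are illustrative, not from the paper):

```python
import numpy as np

def cut_value(x, cuts):
    """Piecewise-linear lower bound V_hat(x) = max_k (alpha_k + beta_k @ x)
    accumulated from subgradient cutting planes, as in SDDP."""
    return max(alpha + beta @ x for alpha, beta in cuts)

# Toy example: under-approximate the convex function V(x) = x^2 with
# tangent cuts. A cut at x0 is V(x0) + g * (x - x0) with subgradient
# g = V'(x0), i.e. alpha = V(x0) - g * x0 and beta = g.
cuts = []
for x0 in np.linspace(-2.0, 2.0, 5):
    value, g = x0 ** 2, 2.0 * x0
    cuts.append((value - g * x0, np.array([g])))

print(cut_value(np.array([0.7]), cuts))  # ~0.4, a lower bound on V(0.7) = 0.49
```

TranSDDP's contribution, per the abstract, is to generate such cuts sequentially with a Transformer rather than by iteratively solving stagewise subproblems. The schematic below shows one way a sequential cut generator could look; the interface, layer sizes, and conditioning scheme are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class CutGenerator(nn.Module):
    """Schematic only (not TranSDDP's actual architecture): a Transformer
    decoder that, conditioned on encoded problem-instance features,
    autoregressively emits the parameters (alpha_k, beta_k) of successive
    cutting planes."""
    def __init__(self, dim_x, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(dim_x + 1, d_model)   # embed past cuts (alpha, beta)
        layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, n_layers)
        self.head = nn.Linear(d_model, dim_x + 1)    # predict next (alpha, beta)

    def forward(self, past_cuts, instance_features):
        # past_cuts: (batch, k, dim_x + 1); instance_features: (batch, m, d_model)
        h = self.embed(past_cuts)
        mask = nn.Transformer.generate_square_subsequent_mask(past_cuts.size(1))
        h = self.decoder(h, instance_features, tgt_mask=mask)
        return self.head(h[:, -1])                   # parameters of cut k+1
```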
