Task-optimal data-driven surrogate models for eNMPC via differentiable simulation and optimization (2403.14425v3)

Published 21 Mar 2024 in cs.LG and math.OC

Abstract: Mechanistic dynamic process models may be too computationally expensive to be usable as part of a real-time capable predictive controller. We present a method for end-to-end learning of Koopman surrogate models for optimal performance in a specific control task. In contrast to previous contributions that employ standard reinforcement learning (RL) algorithms, we use a training algorithm that exploits the differentiability of environments based on mechanistic simulation models to aid the policy optimization. We evaluate the performance of our method by comparing it to that of other training algorithms on an existing economic nonlinear model predictive control (eNMPC) case study of a continuous stirred-tank reactor (CSTR) model. Compared to the benchmark methods, our method produces similar economic performance while eliminating constraint violations. Thus, for this case study, our method outperforms the others and offers a promising path toward more performant controllers that employ dynamic surrogate models.
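
The abstract's central idea — learning a Koopman surrogate end-to-end by differentiating a control-task loss through the mechanistic simulator, rather than estimating policy gradients with standard RL — can be sketched in a few lines of PyTorch. The sketch below is illustrative, not the authors' implementation: the toy CSTR-like dynamics, the penalty weights, and the small neural policy (which stands in for the differentiable (e)NMPC controller the paper builds on the surrogate) are all assumptions made only so the example is self-contained and runnable.

```python
# Minimal sketch (not the authors' code): a Koopman surrogate
#   z_{t+1} = A z_t + B u_t,  x_hat = C z,  z = encoder(x)
# trained end-to-end by backpropagating a task loss through a
# differentiable mechanistic simulator. Dynamics, loss weights, and
# the neural stand-in policy are illustrative assumptions.
import torch
import torch.nn as nn

class KoopmanSurrogate(nn.Module):
    """Nonlinear encoder lifts the state; dynamics are linear in the lift."""
    def __init__(self, nx=2, nu=1, nz=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(nx, 32), nn.Tanh(), nn.Linear(32, nz))
        self.A = nn.Linear(nz, nz, bias=False)  # lifted-state transition
        self.B = nn.Linear(nu, nz, bias=False)  # control input map
        self.C = nn.Linear(nz, nx, bias=False)  # decoder back to state space

def simulator_step(x, u, dt=0.05):
    """Toy differentiable stand-in for the mechanistic CSTR model
    (placeholder dynamics, chosen only so gradients can flow through it)."""
    c, T = x[..., 0:1], x[..., 1:2]
    r = c * torch.exp(-1.0 / T.clamp(min=0.1))   # Arrhenius-like reaction rate
    dc = -r + u                                  # concentration balance
    dT = r - 0.5 * (T - 1.0)                     # energy balance with cooling
    return x + dt * torch.cat([dc, dT], dim=-1)  # explicit Euler step

model = KoopmanSurrogate()
# Placeholder controller; the paper instead embeds the Koopman model in a
# differentiable (convex) MPC and differentiates through its solution.
policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(model.parameters()) + list(policy.parameters()), lr=1e-3)

for epoch in range(200):
    x = torch.rand(64, 2)            # batch of random initial states
    econ_cost, penalty, model_err = 0.0, 0.0, 0.0
    z = model.encoder(x)             # lift the initial state
    for t in range(20):
        u = policy(model.C(z))       # act on the surrogate's state estimate
        x = simulator_step(x, u)     # differentiable mechanistic step
        z = model.A(z) + model.B(u)  # linear rollout in the lifted space
        econ_cost = econ_cost + (u ** 2).mean()                  # stand-in economics
        penalty = penalty + torch.relu(x[..., 1] - 1.5).mean()   # constraint T <= 1.5
        model_err = model_err + ((model.C(z) - x) ** 2).mean()   # surrogate accuracy
    loss = econ_cost + 10.0 * penalty + model_err
    opt.zero_grad()
    loss.backward()                  # gradients flow through the simulator (BPTT)
    opt.step()
```

The key structural choice, consistent with the Koopman-MPC literature, is that the lifted dynamics stay linear: this is what keeps the downstream predictive-control problem convex and cheap to solve online, which is precisely the property the end-to-end, task-specific training is meant to exploit.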
