Physics-informed neural networks via stochastic Hamiltonian dynamics learning (2111.08108v3)

Published 15 Nov 2021 in math.OC and cs.AI

Abstract: In this paper, we propose novel learning frameworks that tackle optimal control problems by applying the Pontryagin maximum principle and then solving the resulting Hamiltonian dynamical system. Applying the Pontryagin maximum principle to the original optimal control problem shifts the learning focus to the reduced Hamiltonian dynamics and the corresponding adjoint variables. The reduced Hamiltonian networks are then learned by integrating backwards in time and minimizing a loss function deduced from the Pontryagin maximum principle's conditions. The learning process is further improved by progressively learning a posterior distribution over the reduced Hamiltonians; this is achieved with a variational autoencoder, which leads to more effective path exploration. We apply our learning frameworks, called NeuralPMP, to various control tasks and obtain competitive results.
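
For context, the following is a minimal sketch of the standard Pontryagin maximum principle conditions that the abstract's loss function is deduced from. The notation (state $x$, adjoint/costate $p$, control $u$, running cost $L$, dynamics $f$, terminal cost $\Phi$) and the sign convention are generic assumptions, not the paper's own formulation.

$$
H(x, p, u) = p^{\top} f(x, u) - L(x, u),
\qquad
H^{*}(x, p) = \max_{u} H(x, p, u),
$$
$$
\dot{x} = \frac{\partial H^{*}}{\partial p},
\qquad
\dot{p} = -\frac{\partial H^{*}}{\partial x},
\qquad
p(T) = -\frac{\partial \Phi}{\partial x}\big(x(T)\big).
$$

Here $H^{*}$ is the reduced Hamiltonian obtained by maximizing over the control. Parameterizing $H^{*}$ with a neural network and integrating the canonical equations backwards in time from the terminal (transversality) condition is the general setting the abstract describes.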
