Neural Control: Concurrent System Identification and Control Learning with Neural ODE (2401.01836v4)
Abstract: Controlling a continuous-time dynamical system is typically a two-step process: first, identify or model the system dynamics with differential equations; then, minimize the control objectives to obtain the optimal control function and the optimal state trajectories. However, any inaccuracy in dynamics modeling leads to sub-optimality in the resulting control function. To address this, we propose a neural ODE based method for controlling unknown dynamical systems, denoted Neural Control (NC), which combines dynamics identification and optimal control learning using a coupled neural ODE. Through an intriguing interplay between the two neural networks in the coupled neural ODE structure, our model concurrently learns the system dynamics and the optimal controls that guide the system toward target states. Our experiments demonstrate the effectiveness of our model for learning optimal control of unknown dynamical systems. Code is available at https://github.com/chichengmessi/neural_ode_control/tree/main
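The coupled structure described above can be sketched as follows: one network models the unknown dynamics, a second network acts as a state-feedback controller, and the two are composed inside a single ODE whose rollout yields both an identification loss (against observed trajectories) and a control loss (distance to the target state). This is a minimal NumPy sketch with forward-Euler integration; the layer sizes, integrator, and loss forms are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    # Small MLP parameters (hypothetical sizes, not the authors' architecture).
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    # Forward pass with tanh hidden activations and a linear output layer.
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

state_dim, ctrl_dim = 2, 1
f_params = mlp_init([state_dim + ctrl_dim, 16, state_dim])  # dynamics model f_theta(x, u)
g_params = mlp_init([state_dim, 16, ctrl_dim])              # controller u = g_phi(x)

def rollout(x0, dt=0.05, steps=100):
    # Forward-Euler integration of the coupled ODE dx/dt = f_theta(x, g_phi(x)).
    x, traj = x0, [x0]
    for _ in range(steps):
        u = mlp(g_params, x)
        x = x + dt * mlp(f_params, np.concatenate([x, u]))
        traj.append(x)
    return np.stack(traj)

traj = rollout(np.array([1.0, 0.0]))
target = np.zeros(state_dim)
# Control objective: terminal-state distance; in training this gradient would
# update g_phi, while a trajectory-matching loss on observed data updates f_theta.
control_loss = np.sum((traj[-1] - target) ** 2)
```

In a full training loop both losses would be minimized concurrently with an autodiff framework, so that improvements in the identified dynamics immediately refine the controller's rollouts.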